Question

I'm trying to see how libxml implements XPath support, so it made sense to me to test using xmllint. However, the obvious option, --pattern, is somewhat obscure, and I ended up using something like the following:

test.xml: <foo><bar/><bar/></foo>

> xmllint --shell test.xml
/  > dir /foo
ELEMENT foo
/  > dir /foo/*
ELEMENT bar
ELEMENT bar

This seems to work, and that's great, but I'm still curious. What is xmllint's --pattern option for, and how does it work?

Provide an example for full credit. =)


Solution

The hint is in the words "which can be used with the reader interface to the parser": xmllint only uses the reader interface when passed the --stream option:

$ xmllint --stream --pattern /foo/bar test.xml
Node /foo/bar[1] matches pattern /foo/bar
Node /foo/bar matches pattern /foo/bar

OTHER TIPS

The seemingly undocumented --xpath option is often more useful.

% cat data.xml
<project>
  <name>
    bob
  </name>
  <version>
    1.1.1
  </version>
</project>
% xmllint --xpath '/project/version/text()' data.xml | xargs -I{} echo -n "{}"
1.1.1
% xmllint --xpath '/project/name/text()' data.xml | xargs -I{} echo -n "{}"
bob
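
The xargs step above is only there to trim the whitespace around the text nodes. As a rough sketch, assuming your xmllint build has --xpath, the trimming can be done inside the expression with the standard XPath 1.0 function normalize-space(), since --xpath prints string results directly. The variable names below are just for illustration:

```shell
# Recreate the sample document from the answer above.
cat > data.xml <<'EOF'
<project>
  <name>
    bob
  </name>
  <version>
    1.1.1
  </version>
</project>
EOF

# normalize-space() strips leading/trailing whitespace and collapses
# internal runs of whitespace, so no xargs/echo post-processing is needed.
version=$(xmllint --xpath 'normalize-space(/project/version)' data.xml)
name=$(xmllint --xpath 'normalize-space(/project/name)' data.xml)

printf '%s %s\n' "$name" "$version"
```

Note that normalize-space() takes the string value of the first matching node only, so this suits single-valued elements like the ones here.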

From the xmllint(1) man page:

   --pattern PATTERNVALUE
          Used to exercise the pattern recognition engine, which can be
          used with the reader interface to the parser. It allows to
          select some nodes in the document based on an XPath (subset)
          expression. Used for debugging.

It only understands a subset of XPath and is intended as a debugging aid. Full XPath 1.0 is implemented by libxml2's own xpath module, which is what xmllint's --xpath option and --shell mode use; libxslt(3) and its command-line tool xsltproc(1) build on that same engine.

The "pattern" module in libxml "allows to compile and test pattern expressions for nodes either in a tree or based on a parser state" and its documentation lives here: http://xmlsoft.org/html/libxml-pattern.html

Ari.

If you simply want the text value of a number of XML nodes, you could use something like this (for when --xpath is not available in your version of xmllint):

./foo.xml:

<hello>
   <world>its alive!!</world>
   <world>and works!!</world>
</hello>

$ xmllint --stream --pattern /hello/world --debug ./foo.xml | grep -A 1 "matches pattern" | grep "#text" | sed 's/.* [0-9] //'
its alive!!
and works!!
Licensed under: CC-BY-SA with attribution