
How to execute XPath one-liners from shell?

You should try these tools:

  • xmlstarlet: can edit, select, transform... Not installed by default, xpath1
  • xmllint: often installed by default with libxml2-utils, xpath1 (check my wrapper to get the --xpath switch on very old releases and newline-delimited output on versions before 2.9.9)
  • xpath: installed via perl's module XML::XPath, xpath1
  • xml_grep: installed via perl's module XML::Twig, xpath1 (limited xpath usage)
  • xidel: xpath3
  • saxon-lint: my own project, a wrapper over Michael Kay's Saxon-HE Java library, xpath3

xmllint comes with libxml2-utils and can also be used as an interactive shell with the --shell switch.

saxon-lint uses Saxon-HE 9.6 and supports XPath 3.x (with backward compatibility).

Examples:

xmllint --xpath '//element/@attribute' file.xml
xmlstarlet sel -t -v "//element/@attribute" file.xml
xpath -q -e '//element/@attribute' file.xml
xidel -se '//element/@attribute' file.xml
saxon-lint --xpath '//element/@attribute' file.xml
  • xmlstarlet page
  • man xmllint
  • xpath page
  • xml_grep
  • xidel
  • saxon-lint
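If you want to script the choice between these tools, here is a minimal Python sketch that picks the first one found on PATH and builds the matching command line. The argument lists are copied from the examples above; the names TEMPLATES, build_command and first_available are made up for this illustration and are not part of any of the tools.

```python
import shutil

# Command templates copied from the examples above. "{xpath}" and "{file}"
# are placeholders filled in by build_command (a hypothetical helper).
TEMPLATES = {
    "xmllint": ["xmllint", "--xpath", "{xpath}", "{file}"],
    "xmlstarlet": ["xmlstarlet", "sel", "-t", "-v", "{xpath}", "{file}"],
    "xpath": ["xpath", "-q", "-e", "{xpath}", "{file}"],
    "xidel": ["xidel", "-se", "{xpath}", "{file}"],
    "saxon-lint": ["saxon-lint", "--xpath", "{xpath}", "{file}"],
}


def build_command(tool, xpath, path):
    """Return the argv list for one tool, without running it."""
    return [part.format(xpath=xpath, file=path) for part in TEMPLATES[tool]]


def first_available(tools=TEMPLATES):
    """Return the first tool name found on PATH, or None if there is none."""
    for name in tools:
        if shutil.which(name):
            return name
    return None


if __name__ == "__main__":
    tool = first_available()
    if tool is None:
        print("no supported XPath tool found on PATH")
    else:
        # Print the command instead of running it; pass the list to
        # subprocess.run() to actually execute it.
        print(" ".join(build_command(tool, "//element/@attribute", "file.xml")))
```

The command is kept as a list rather than a single string so that an XPath containing quotes or spaces does not need extra shell escaping when handed to subprocess.run().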



You can also try my Xidel. It is not packaged in the distribution repositories, but you can just download it from the webpage (it has no dependencies).

It has simple syntax for this task:

xidel filename.xml -e '//element/@attribute' 

It is also one of the few tools of this kind that support XPath 2.


One package that is very likely to be installed on a system already is python-lxml. If so, this is possible without installing any extra package:

python -c "from lxml.etree import parse; from sys import stdin; print('\n'.join(parse(stdin).xpath('//element/@attribute')))" < file.xml
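If lxml turns out not to be installed, a rough fallback is the standard library's xml.etree.ElementTree. It only implements a subset of XPath and cannot return attribute values from a path like //element/@attribute, so this sketch selects the elements with a predicate and reads the attribute in Python. The function name attribute_values and the sample document are made up for the illustration.

```python
import io
import xml.etree.ElementTree as ET


def attribute_values(source, tag="element", attr="attribute"):
    """Return the value of attr for every <tag> element that carries it.

    source is a filename or a file object, as accepted by ET.parse().
    """
    root = ET.parse(source).getroot()
    # ElementTree's XPath subset supports the [@attr] predicate but not
    # a trailing /@attr step, so the value is read with .get() afterwards.
    matches = root.findall(f".//{tag}[@{attr}]")
    # ".//" only searches below the root, so check the root element itself.
    if root.tag == tag and attr in root.attrib:
        matches.insert(0, root)
    return [el.get(attr) for el in matches]


# Illustrative input; in real use pass sys.stdin or a filename instead.
sample = io.StringIO(
    '<root><element attribute="a"/>'
    '<group><element attribute="b"/><element/></group></root>'
)
print("\n".join(attribute_values(sample)))
```

This only covers the simple "attribute of every matching element" case; anything needing axes, functions or unions still calls for lxml or one of the command-line tools.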

While looking for a way to query Maven pom.xml files I ran across this question. However, I had the following requirements:

  • must run cross-platform
  • must exist on all major Linux distributions without any additional module installation
  • must handle complex XML files such as Maven pom.xml files
  • must have a simple syntax

I have tried many of the above without success:

  • python lxml.etree is not part of the standard python distribution
  • python xml.etree is, but it does not handle complex maven pom.xml files well; I have not dug deep enough to find out why
  • xmllint does not work either; it often core dumps on Ubuntu 12.04 ("xmllint: using libxml version 20708")

The solution I found that is stable, short, mature, and works on many platforms is the REXML library built into Ruby:

ruby -r rexml/document -e 'include REXML; 
     puts XPath.first(Document.new($stdin), "/project/version/text()")' < pom.xml
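For comparison, here is a sketch of the same /project/version lookup with Python's standard xml.etree.ElementTree. It rests on an assumption this answer does not confirm: that the trouble with xml.etree came from the default namespace that Maven POMs conventionally declare (http://maven.apache.org/POM/4.0.0), which ElementTree requires you to spell out in the path. The function name project_version and the sample POM are made up for the illustration.

```python
import io
import xml.etree.ElementTree as ET

# Namespace URI conventionally declared as the default namespace in a POM.
# The "m" prefix is an arbitrary local alias used only in the paths below.
POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}


def project_version(source):
    """Return the text of /project/version, or None if it is missing."""
    # getroot() is the <project> element, so paths are relative to it.
    root = ET.parse(source).getroot()
    node = root.find("m:version", POM_NS)
    if node is None:
        # Fall back to a POM written without any namespace declaration.
        node = root.find("version")
    return node.text if node is not None else None


# Illustrative input; in real use pass sys.stdin or "pom.xml" instead.
sample = io.StringIO(
    '<project xmlns="http://maven.apache.org/POM/4.0.0">'
    "<modelVersion>4.0.0</modelVersion>"
    "<version>1.2.3</version>"
    "</project>"
)
print(project_version(sample))
```

If the namespace really is the cause, this would also explain why an unqualified path such as project/version finds nothing in ElementTree even though the element is plainly there.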

The following articles led me to this REXML solution:

  • Ruby/XML, XSLT and XPath Tutorial
  • IBM: Ruby on Rails and XML