
XMLStarlet does not select anything

I have a typical pom.xml and want to print the groupId, artifactId and version, separated by colons. I think XMLStarlet is the right tool for that. I tried several ways, but I always get an empty line.

xml sel -t -m project -v groupId -o : -v artifactId -o : -v version pom.xml

Expected output:

org.something.apps:app-acct:5.4

Actual output: an empty line

Even if I try to print just the groupId I get nothing:

xml sel -t -v project/groupId pom.xml

I am sure that the tool sees the elements because I can list them without problem:

xml el pom.xml

prints the following (correctly):

project
project/modelVersion
project/parent
project/parent/groupId
project/parent/artifactId
project/parent/version
project/groupId
project/artifactId
project/version
project/packaging

What's wrong?

Here is the cut-down version of pom.xml:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
                        http://maven.apache.org/maven-v4_0_0.xsd">

    <modelVersion>4.0.0</modelVersion>

    <parent>
        <groupId>org.something</groupId>
        <artifactId>base</artifactId>
        <version>1.16</version>
    </parent>

    <groupId>org.something.apps</groupId>
    <artifactId>app-acct</artifactId>
    <version>5.4</version>
    <packaging>war</packaging>

</project>
uk4sx asked Jan 26 '12 21:01


3 Answers

UPDATE: since version 1.5 the default namespace prefix '_' is available so the solution is reduced to this:

xml sel -t -m _:project -v _:groupId -o : -v _:artifactId -o : -v _:version pom.xml

Thanks @JamieNelson for the heads-up.


Unfortunately, XMLStarlet is very picky about the default namespace. If the document has it declared (xmlns=), you have to declare it for XMLStarlet too, and prefix the elements with the name you have chosen (see here):

xml sel -N my=http://maven.apache.org/POM/4.0.0 -t -m my:project -v my:groupId -o : -v my:artifactId -o : -v my:version pom.xml

Running the above command gives the expected output:

org.something.apps:app-acct:5.4

However, if the document does NOT have the default namespace declared (or the namespace has a slightly different URL), the above command will NOT work, which is a real PITA. A more universal solution is to remove the default namespace declaration before selecting the elements. As of XMLStarlet 1.3.1, converting the XML to PYX format and back removes the namespace declarations:

xml pyx pom.xml | xml p2x | xml sel -t -m project -v groupId -o : -v artifactId -o : -v version 2>nul

UPDATE (2014-02-12): as of XMLStarlet 1.4.2 the PYX <-> XML conversion is fixed (it no longer removes namespace declarations), so the above command will NOT work (thanks to Peter Gluck for the tip). Use the following command instead:

xml pyx pom.xml | grep -v ^A | xml p2x | xml sel -t -m project -v groupId -o : -v artifactId -o : -v version

Note: the grep above removes ALL attributes from the document, not just namespace declarations. For this specific case (selecting element values from pom.xml, where elements with non-default namespaces are not expected) that is OK, but for general XML you would remove just the default namespace declaration(s) and nothing else:

xml pyx pom.xml | grep -v "^Axmlns " | xml p2x | xml sel -t -m project -v groupId -o : -v artifactId -o : -v version
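
To make the difference between the two grep filters concrete, here is a small sketch that needs only grep. The PYX text is a hand-written sample approximating what xml pyx might emit for the root element of the pom.xml (an assumption, not captured tool output; the schemaLocation value is shortened to one line), and the variable names are made up for illustration.

```shell
# Hand-written PYX sample (assumption: approximates `xml pyx pom.xml` output).
# In PYX, "(" opens an element, ")" closes it, "A" is an attribute, "-" is text.
pyx_sample='(project
Axmlns http://maven.apache.org/POM/4.0.0
Axmlns:xsi http://www.w3.org/2001/XMLSchema-instance
Axsi:schemaLocation http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd
(groupId
-org.something.apps
)groupId
)project'

# Filter 1: drop every attribute line, namespace declaration or not.
no_attrs=$(printf '%s\n' "$pyx_sample" | grep -v '^A')

# Filter 2: drop only the default namespace declaration.
# The trailing space keeps "Axmlns:xsi ..." from matching.
only_default_removed=$(printf '%s\n' "$pyx_sample" | grep -v '^Axmlns ')

# Show which attribute lines survive each filter.
printf 'after filter 1:\n%s\n\n' "$no_attrs"
printf 'after filter 2:\n%s\n' "$only_default_removed"
```

With the first filter no attribute lines remain; with the second, xmlns:xsi and xsi:schemaLocation are kept, so the xsi prefix stays declared.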


Note (obsolete): the error redirection (2>nul on Windows; 2>/dev/null on Unix-like shells) is necessary to hide the complaint about the (now) unknown namespace xsi:

-:1.28: Namespace prefix xsi for schemaLocation on project is not defined

Another way of getting rid of the complaint is to remove the schemaLocation attribute (actually, this command removes all attributes from the PYX document, not just xsi:schemaLocation):

xml pyx pom.xml | grep -v ^A | xml p2x | xml sel -t -m project -v groupId -o : -v artifactId -o : -v version

uk4sx answered Nov 13 '22 14:11


The XML -> PYX -> XML trick did not work for me (using XMLStarlet version 1.4.2). However, the XMLStarlet documentation contains this handy sed command that removes namespace declarations from an XML document:

sed -e 's/ xmlns.*=".*"//g'

That worked. For the original question, the syntax would be:

cat pom.xml | sed -e 's/ xmlns.*=".*"//g' | xml sel -t -m project -v groupId -o : -v artifactId -o : -v version
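
One caveat worth checking before reusing that expression on other documents: both .* parts are greedy, so on a line that carries further attributes after the namespace declaration the match runs to the last double quote. The sketch below illustrates this with a hypothetical one-line root element (the id attribute and the variable names are invented for the demo), next to a narrower variant that is my own assumption rather than something taken from the XMLStarlet documentation.

```shell
# Hypothetical one-line root element with an extra attribute after xmlns.
line='<project xmlns="http://maven.apache.org/POM/4.0.0" id="demo">'

# Expression from the answer: the greedy .* swallows everything up to the
# last double quote on the line, so id="demo" is removed as well.
greedy=$(printf '%s\n' "$line" | sed -e 's/ xmlns.*=".*"//g')

# Narrower variant (assumption): match only an unprefixed xmlns="..."
# and stop at its closing quote, leaving other attributes alone.
narrow=$(printf '%s\n' "$line" | sed -e 's/ xmlns="[^"]*"//g')

printf '%s\n' "$greedy"   # <project>
printf '%s\n' "$narrow"   # <project id="demo">
```

The narrower form also leaves prefixed declarations such as xmlns:xsi in place, which may or may not be what you want; for the pom.xml in the question, where each attribute sits on its own line, the greediness does not come into play.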
Peter Gluck answered Nov 13 '22 13:11


Since version 1.2 of xmlstarlet, you can just do this:

xml sel -t -m "//_:project" -v _:groupId -o : -v _:artifactId -o : -v _:version pom.xml

With a few other options here too: http://xmlstar.sourceforge.net/doc/UG/ch05.html

Jamie Nelson answered Nov 13 '22 13:11