I'm trying to extract the value of a node from a pom.xml:
<?xml version="1.0" encoding="UTF-8"?>
<project>
  <parent>
    <groupId>org.me.labs</groupId>
    <artifactId>my-random-project</artifactId>
    <version>1.5.0</version>
  </parent>
  ...
</project>
I need to extract the artifactId and version from the XML using a shell command. I have the following requirements/observations:
I have tried the following:
xpath works on my Mac, but isn't available by default on RHEL machines. The same goes for xmllint --xpath, which I guess is only available in later versions of xmllint, which I don't have and can't enforce.
xmllint --pattern seemed promising, but I can't get any useful output out of xmllint --pattern '//project/parent/version' pom.xml (prints the entire XML) or xmllint --stream --pattern '//project/parent/version' pom.xml (no output).
I realize this is a common question here on SO, but the points above are why I can't use those answers. TIA for your help.
--format is used only to format (indent, etc.) the document. You can extract the value using --xpath (tested in Ubuntu, libxml v20900):
$ xmllint --xpath "//project/parent/version/text()" pom.xml
1.5.0
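If the installed xmllint does support --xpath, both values can be captured into shell variables the same way. This is only a minimal sketch: it assumes the pom.xml has no default namespace on <project>, pom_parent_value is a made-up helper name, and string() is used instead of text() so the result comes back as a plain string.

```shell
#!/bin/sh
# Sketch: read values from the <parent> block with xmllint --xpath.
# Assumes an xmllint that supports --xpath and a pom.xml whose
# <project> element does not declare a default namespace.

# pom_parent_value is a hypothetical helper name, not part of xmllint.
pom_parent_value() {
    # $1 = child element of <parent>, $2 = path to the pom file
    xmllint --xpath "string(/project/parent/$1)" "$2"
}

# Only run the lookups when a pom.xml exists in the current directory.
if [ -f pom.xml ]; then
    artifact_id=$(pom_parent_value artifactId pom.xml)
    version=$(pom_parent_value version pom.xml)
    echo "$artifact_id:$version"
fi
```

Command substitution strips the trailing newline, so the variables hold just the element text.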
I've managed to solve it for the time being with this rather unwieldy script using xmllint --shell:
echo "cat //project/parent/version" | xmllint --shell pom.xml | sed '/^\/ >/d' | sed 's/<[^>]*.//g'
If the XML declares a namespace, as my pom.xml did, things get heavier, because each node has to be matched by its local name:
echo "cat //*[local-name()='project']/*[local-name()='parent']/*[local-name()='version']" | xmllint --shell pom.xml | sed '/^\/ >/d' | sed 's/<[^>]*.//g'
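One way to keep the local-name() version readable might be to generate the XPath from a plain element path instead of typing it out. This is just a sketch: local_name_xpath is a hypothetical helper (not an xmllint feature), and the rest of the pipeline is the same as above.

```shell
#!/bin/sh
# Sketch: build the namespace-agnostic XPath from a plain element path.
# local_name_xpath is a hypothetical helper name, not an xmllint feature.
local_name_xpath() {
    # $1 = slash-separated element names, e.g. project/parent/version
    # Wrap each name in *[local-name()='...'], then prepend //.
    printf '%s\n' "$1" |
        sed "s|[^/][^/]*|*[local-name()='&']|g; s|^|//|"
}

# Feed the generated expression to the same xmllint --shell pipeline.
# Only runs when a pom.xml exists in the current directory.
if [ -f pom.xml ]; then
    echo "cat $(local_name_xpath project/parent/version)" |
        xmllint --shell pom.xml |
        sed '/^\/ >/d' |
        sed 's/<[^>]*.//g'
fi
```

The same helper can be called with project/parent/artifactId to fetch the other value.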
Hope it helps. If anyone can simplify these expressions, I'd be grateful.