Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

xmllint failing to properly query with xpath

Tags:

xml

xpath

xmllint

I'm trying to query an xml file generated by adium. xmlwf says that it's well formed. By using xmllint's debug option i get the following:

$ xmllint --debug doc.xml
DOCUMENT
version=1.0
encoding=UTF-8
URL=doc.xml
standalone=true
  ELEMENT chat
    default namespace href=http://purl.org/net/ulf/ns/0.4-02
    ATTRIBUTE account
      TEXT
        [email protected]
    ATTRIBUTE service
      TEXT compact
        content=MSN
    TEXT compact
      content= 
    ELEMENT event
      ATTRIBUTE type

Everything seems to parse just fine. However, when I try to query even the simplest things, I don't get anything:

$ xmllint --xpath '/chat' doc.xml 
XPath set is empty

What's happening? Running that exact same query using xpath returns the correct results (however with no newline between results). Am I doing something wrong or is xmllint just not working properly?

Here's a shorter, anonymized version of the xml that shows the same behavior:

<?xml version="1.0" encoding="UTF-8" ?>
<chat xmlns="http://purl.org/net/ulf/ns/0.4-02" account="[email protected]" service="MSN">
<event type="windowOpened" sender="[email protected]" time="2011-11-22T00:34:43-03:00"></event>
<message sender="[email protected]" time="2011-11-22T00:34:43-03:00" alias="foo"><div><span style="color: #000000; font-family: Helvetica; font-size: 12pt;">hi</span></div></message>
</chat>
like image 873
ailnlv Avatar asked Nov 25 '11 01:11

ailnlv


2 Answers

I don't use xmllint, but I think the reason your XPath isn't working is because your doc.xml file is using a default namespace (http://purl.org/net/ulf/ns/0.4-02).

From what I can see, you have 2 options.

A. Use xmllint in shell mode and declare the namespace with a prefix. You can then use that prefix in your XPath.

    xmllint --shell doc.xml
    / > setns x=http://purl.org/net/ulf/ns/0.4-02
    / > xpath /x:chat

B. Use local-name() to match element names.

    xmllint --xpath /*[local-name()='chat']

You may also want to use namespace-uri()='http://purl.org/net/ulf/ns/0.4-02' along with local-name() so you are sure to return exactly what you are intending to return.

like image 191
Daniel Haley Avatar answered Nov 18 '22 11:11

Daniel Haley


I realize this question is very old now, but in case it helps someone...

Had the same problem and it was due to the XML having a namespace (and sometimes it was duplicated in various places in the XML). Found it easiest to just remove the namespace before using xmllint:

sed -e 's/xmlns="[^"]*"//g' file.xml | xmllint --xpath "..." -

In my case the XML was UTF-16 so I had to convert to UTF-8 first (for sed):

iconv -f utf16 -t utf8 file.xml | sed -e 's/encoding="UTF-16"?>/encoding="UTF-8"?>/' | sed -e 's/xmlns="[^"]*"//g' | xmllint --xpath "..." -
like image 13
codesniffer Avatar answered Nov 18 '22 12:11

codesniffer