xml.parsers.expat.ExpatError on parsing XML

Tags: python, xml

I am trying to parse XML with Python but am not getting very far. I think the problem is the malformed XML tree that this API returns.

So this is what is returned by the GET request:

<codigo>3</codigo><valor></valor><operador>Dummy</operador>

The GET request goes here:

http://69.36.9.147:8090/clientes/SMS_API_OUT.jsp?codigo=ABCDEFGH&cliente=XX

This is the Python code I am using without any luck:

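import urllib
from xml.dom import minidom

url = urllib.urlopen('http://69.36.9.147:8090/clientes/SMS_API_OUT.jsp?codigo=ABCDEFGH&cliente=XX')
doc = minidom.parse(url)  # the ExpatError is raised on this line
code = doc.getElementsByTagName('codigo')

# An element's text is held in its child text node, not on the element itself
print code[0].firstChild.data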

And this is the error I get:

xml.parsers.expat.ExpatError: junk after document element: line 1, column 18

What I need to do is retrieve the value inside the <codigo> element and place it in a variable (same for the others).

mistero asked Jul 16 '09 22:07




2 Answers

The main problem here is that the XML returned by that service doesn't include a root node, so it is not a well-formed XML document. I fixed this by simply wrapping the output in a <root> node.

import urllib
from xml.etree import ElementTree

url = 'http://69.36.9.147:8090/clientes/SMS_API_OUT.jsp?codigo=ABCDEFGH&cliente=XX'
xmldata = '<root>' + urllib.urlopen(url).read() + '</root>'
tree = ElementTree.fromstring(xmldata)
codigo = tree.find('codigo').text

print codigo

You can use whatever parser you wish, but here I used ElementTree to get the value.
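
The snippet above is Python 2. The same wrapping idea can be sketched in Python 3 as follows. To stay self-contained it parses a hard-coded copy of the response from the question instead of fetching the URL, and the helper name `parse_fragment` is an illustrative assumption, not part of any library.

```python
from xml.etree import ElementTree

# Sample response copied from the question; a real script would take this
# from the HTTP response body instead.
RESPONSE = "<codigo>3</codigo><valor></valor><operador>Dummy</operador>"


def parse_fragment(fragment, wrapper="root"):
    """Wrap a rootless XML fragment in one element and map child tags to text.

    Hypothetical helper for illustration; `wrapper` is an arbitrary tag name.
    """
    tree = ElementTree.fromstring("<{0}>{1}</{0}>".format(wrapper, fragment))
    # Element.text is None for an empty element such as <valor></valor>,
    # so substitute an empty string to keep every value a str.
    return {child.tag: child.text or "" for child in tree}


fields = parse_fragment(RESPONSE)
print(fields["codigo"])
print(fields["operador"])
```

In Python 3 the body returned by `urllib.request.urlopen(url).read()` is `bytes`, so it would need to be decoded (for example with `.decode()`, assuming the service sends UTF-8) before being concatenated with the wrapper tags.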

sixthgear answered Oct 02 '22 21:10


An XML document consists of exactly one top-level document element, which may contain any number of subelements. Your XML fragment contains multiple top-level elements, which the XML standard does not permit.
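
To see where the parser gives up, the error from the question can be reproduced without the network. This is a minimal Python 3 sketch that assumes the service returns exactly the fragment shown in the question; `FRAGMENT` and `error` are illustrative names.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# The rootless response from the question, hard-coded for illustration.
FRAGMENT = "<codigo>3</codigo><valor></valor><operador>Dummy</operador>"

try:
    minidom.parseString(FRAGMENT)
    error = None
except ExpatError as exc:
    error = exc

if error is not None:
    # lineno and offset locate the point where the parser stopped.
    print(error.lineno, error.offset)
    # Everything from that offset on is what expat calls "junk".
    print(repr(FRAGMENT[error.offset:]))
```

The reported position falls right after `</codigo>`: the first element closes the document as far as the parser is concerned, and the `<valor>` that follows is rejected.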

Try returning something like:

<result><codigo>3</codigo><valor></valor><operador>Dummy</operador></result>

I have wrapped the entire response in a <result> tag.
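
Once the response has a single root, `minidom` from the question can read it as well. The following is a minimal Python 3 sketch, assuming the response is exactly the wrapped string above; `element_text` is a hypothetical helper, not a `minidom` function.

```python
from xml.dom import minidom

# The wrapped response suggested above, hard-coded for illustration.
WRAPPED = (
    "<result><codigo>3</codigo><valor></valor>"
    "<operador>Dummy</operador></result>"
)


def element_text(doc, tag):
    """Return the text of the first <tag> element, or '' if it has none."""
    node = doc.getElementsByTagName(tag)[0]
    # An empty element such as <valor></valor> has no child nodes,
    # so firstChild is None and there is no .data to read.
    return node.firstChild.data if node.firstChild is not None else ""


doc = minidom.parseString(WRAPPED)
codigo = element_text(doc, "codigo")
valor = element_text(doc, "valor")
operador = element_text(doc, "operador")
print(codigo, repr(valor), operador)
```

The `firstChild` check matters for this particular response because `<valor>` is empty; reading `.data` on it directly would raise an `AttributeError`.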

Greg Hewgill answered Oct 02 '22 20:10