Possible Duplicate:
Converting XML to JSON using Python?
I'm doing some work on App Engine and I need to convert an XML document being retrieved from a remote server into an equivalent JSON object.
I'm using xml.dom.minidom
to parse the XML data being returned by urlfetch
. I'm also trying to use django.utils.simplejson
to convert the parsed XML document into JSON. I'm completely at a loss as to how to hook the two together. Below is the code I'm tinkering with:
from xml.dom import minidom from django.utils import simplejson as json #pseudo code that returns actual xml data as a string from remote server. result = urlfetch.fetch(url,'','get'); dom = minidom.parseString(result.content) json = simplejson.load(dom) self.response.out.write(json)
The code to convert XML to JSON is quite simple, just two lines. The xmltodict. parse() method will convert the XML to a python object that can then be converted to JSON.
To convert an XML document to JSON, follow these steps: Select the XML to JSON action from the Tools > JSON Tools menu. Choose or enter the Input URL of the XML document. Choose the path of the Output file that will contain the resulting JSON document.
xmltodict (full disclosure: I wrote it) can help you convert your XML to a dict+list+string structure, following this "standard". It is Expat-based, so it's very fast and doesn't need to load the whole XML tree in memory.
Once you have that data structure, you can serialize it to JSON:
import xmltodict, json o = xmltodict.parse('<e> <a>text</a> <a>text</a> </e>') json.dumps(o) # '{"e": {"a": ["text", "text"]}}'
Soviut's advice for lxml objectify is good. With a specially subclassed simplejson, you can turn an lxml objectify result into json.
import simplejson as json import lxml class objectJSONEncoder(json.JSONEncoder): """A specialized JSON encoder that can handle simple lxml objectify types >>> from lxml import objectify >>> obj = objectify.fromstring("<Book><price>1.50</price><author>W. Shakespeare</author></Book>") >>> objectJSONEncoder().encode(obj) '{"price": 1.5, "author": "W. Shakespeare"}' """ def default(self,o): if isinstance(o, lxml.objectify.IntElement): return int(o) if isinstance(o, lxml.objectify.NumberElement) or isinstance(o, lxml.objectify.FloatElement): return float(o) if isinstance(o, lxml.objectify.ObjectifiedDataElement): return str(o) if hasattr(o, '__dict__'): #For objects with a __dict__, return the encoding of the __dict__ return o.__dict__ return json.JSONEncoder.default(self, o)
See the docstring for example of usage, essentially you pass the result of lxml objectify
to the encode method of an instance of objectJSONEncoder
Note that Koen's point is very valid here, the solution above only works for simply nested xml and doesn't include the name of root elements. This could be fixed.
I've included this class in a gist here: http://gist.github.com/345559
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With