Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Parse XML from URL into python object

The goodreads website has this API for accessing a user's 'shelves:' https://www.goodreads.com/review/list/20990068.xml?key=nGvCqaQ6tn9w4HNpW8kquw&v=2&shelf=toread

It returns XML. I'm trying to create a django project that shows books on a shelf from this API. I'm looking to find out how (or if there is a better way than) to write my view so I can pass an object to my template. Currently, this is what I'm doing:

import urllib2

def homepage(request):
    file = urllib2.urlopen('https://www.goodreads.com/review/list/20990068.xml?key=nGvCqaQ6tn9w4HNpW8kquw&v=2&shelf=toread')
    data = file.read()
    file.close()
    dom = parseString(data)

I'm not entirely sure how to manipulate this object if I'm doing this correctly. I'm following this tutorial.

like image 758
smilebomb Avatar asked Jun 09 '14 16:06

smilebomb


People also ask

Which XML parser is best in Python?

If parsing speed is a key factor for you, consider using cElementTree or lxml.

How do you process XML in Python?

To read an XML file using ElementTree, firstly, we import the ElementTree class found inside xml library, under the name ET (common convension). Then passed the filename of the xml file to the ElementTree. parse() method, to enable parsing of our xml file. Then got the root (parent tag) of our xml file using getroot().


2 Answers

I'd use xmltodict to make a python dictionary out of the XML data structure and pass this dictionary to the template inside the context:

import urllib2
import xmltodict

def homepage(request):
    file = urllib2.urlopen('https://www.goodreads.com/review/list/20990068.xml?key=nGvCqaQ6tn9w4HNpW8kquw&v=2&shelf=toread')
    data = file.read()
    file.close()

    data = xmltodict.parse(data)
    return render_to_response('my_template.html', {'data': data})
like image 165
alecxe Avatar answered Sep 21 '22 17:09

alecxe


xmltodict using urllib3

import traceback
import urllib3
import xmltodict

def getxml():
    url = "https://yoursite/your.xml"

    http = urllib3.PoolManager()

    response = http.request('GET', url)
    try:
        data = xmltodict.parse(response.data)
    except:
        print("Failed to parse xml from response (%s)" % traceback.format_exc())
    return data
like image 10
gies0r Avatar answered Sep 25 '22 17:09

gies0r