I'm trying to parse an XML page I'm downloading from the web.
import requests
import xml.etree.ElementTree as ET
url = "http://www.w3schools.com/xml/cd_catalog.xml"
XML = requests.get(url)
print XML.content
tree = ET.ElementTree(XML)
root = tree.getroot()
print root.tag, root.attrib
I get one of two errors when I run this. For the example page above:
AttributeError: 'Response' object has no attribute 'tag'
And for the actual XML site I'm looking at:
AttributeError: 'str' object has no attribute 'tag'
However, if I copy and paste the downloaded XML into a .xml file and open that, it works fine with no errors. Does anyone know how to fix these problems?
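ET.ElementTree() does not parse anything; it treats its first argument as an already-built root element, so getroot() simply hands back the Response (or str) you passed in. You need to parse the response body, not the response object: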
root = ET.fromstring(XML.content) # no .getroot() call required
or pass in a file object:
XML = requests.get(url, stream=True)
tree = ET.parse(XML.raw)
root = tree.getroot()
The latter can fail if the response is compressed (for example with gzip); the raw file object does not decompress the body for you.
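To make the two approaches concrete without depending on the network, here is a minimal sketch in Python 3 syntax. SAMPLE_BODY is an illustrative stand-in for what requests.get(url).content would return, and the helper names root_from_bytes and root_from_file are made up for this example; neither is part of requests or ElementTree.

```python
import io
import xml.etree.ElementTree as ET

# Stand-in for requests.get(url).content: a bytes payload shaped like the
# cd_catalog.xml example (the contents are illustrative, not the real file).
SAMPLE_BODY = b"""<?xml version="1.0" encoding="UTF-8"?>
<CATALOG>
  <CD><TITLE>Example Title</TITLE><ARTIST>Example Artist</ARTIST></CD>
</CATALOG>"""


def root_from_bytes(body):
    """Parse XML held in memory; ET.fromstring returns the root Element directly."""
    return ET.fromstring(body)


def root_from_file(fileobj):
    """Parse XML from a file-like object; ET.parse returns an ElementTree."""
    return ET.parse(fileobj).getroot()


root_a = root_from_bytes(SAMPLE_BODY)
# io.BytesIO plays the role of the response's raw stream here.
root_b = root_from_file(io.BytesIO(SAMPLE_BODY))

print(root_a.tag, root_a.attrib)
titles = [cd.findtext("TITLE") for cd in root_b.iter("CD")]
print(titles)
```

With a real response you would pass XML.content to the first helper. For the streaming variant, a commonly used workaround for compressed responses is to set XML.raw.decode_content = True before handing XML.raw to ET.parse; treat that as something to verify against the requests and urllib3 versions you have installed.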