 

Parsing an xml.gz file in Python

Tags:

python

xml

gzip

tar

I have a tar.gz file on my local machine called abc.aXML.gz, which contains many XML files. I want to extract some data from these files, but I don't know how to parse them using ElementTree and gzip.

import xml.etree.ElementTree as ET
import gzip
document = ET.parse(gzip("abc.aXML.gz"))
root = document.getroot()
shahbaz khan asked Oct 26 '15 13:10




1 Answer

The code below worked for me to read and process a gzipped XML file.
I first open the file with gzip, then pass the decompressed file object to ElementTree.

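import gzip
import xml.etree.ElementTree as ET

# gzip.open returns a file object that decompresses as it is read;
# ET.parse accepts any binary file object, not just a filename.
with gzip.open('input-xml.gz', 'rb') as f:
    tree = ET.parse(f)
root = tree.getroot()

print(root.tag)
print(root.attrib)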
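
The question describes the archive as a tar.gz holding many XML files, and gzip.open alone yields a single decompressed stream, so it would not separate the members. A minimal sketch for that case follows, assuming the file really is a gzip-compressed tar archive and that the XML members have names ending in .xml. The helper name iter_xml_roots is hypothetical, chosen only for illustration.

```python
import tarfile
import xml.etree.ElementTree as ET


def iter_xml_roots(archive_path):
    """Yield (member_name, root_element) for each XML file in a .tar.gz archive.

    Assumption: XML members are regular files whose names end in ".xml".
    """
    with tarfile.open(archive_path, "r:gz") as archive:
        for member in archive:
            # Skip directories, links, and non-XML members.
            if not member.isfile() or not member.name.endswith(".xml"):
                continue
            fileobj = archive.extractfile(member)
            if fileobj is None:
                continue
            with fileobj:
                # ET.parse reads directly from the extracted binary stream.
                yield member.name, ET.parse(fileobj).getroot()
```

It could be used as `for name, root in iter_xml_roots("abc.aXML.gz"): print(name, root.tag)`. If the members use a different extension, the name check would need adjusting.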
upkar answered Oct 18 '22 10:10