How to parse XML file in chunks

Question

I have a very large XML file with 40,000 tag elements. When I parse it with ElementTree, it fails with memory errors. Is there a Python module that can read the XML file in chunks without loading the entire document into memory? And how would I use that module?

zeekay · Accepted Answer

Probably the best library for working with XML in Python is lxml. In this case, you should look at its iterparse/iterwalk.
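For illustration, here is a minimal sketch of the iterparse approach. It uses the standard library's `xml.etree.ElementTree.iterparse` rather than lxml so that it runs without extra installs; `lxml.etree.iterparse` offers a similar interface. The helper name `iter_elements`, the `item` tag, and the sample data are made up for this example, so adapt them to your file.

```python
import io
import xml.etree.ElementTree as ET


def iter_elements(source, tag):
    """Yield each completed <tag> element, then release it so memory stays bounded.

    `iter_elements` is a hypothetical helper name, not part of any library.
    """
    # Request "start" events as well, so the very first event gives us the root.
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)
    for event, elem in context:
        if event == "end" and elem.tag == tag:
            yield elem
            # Runs when the caller asks for the next element: detach everything
            # accumulated under the root so already-processed elements can be freed.
            root.clear()


# Tiny in-memory stand-in for a large file; a real call would pass a file path.
sample = io.BytesIO(b"<items><item id='1'>a</item><item id='2'>b</item></items>")
ids = [elem.get("id") for elem in iter_elements(sample, "item")]
print(ids)
```

The important part is clearing elements after you are done with them. Without that, iterparse still builds the full tree in the background and memory use ends up about the same as a normal parse.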

Michael Dillon · Answer

This is a problem that people usually solve using SAX.

If your huge file is basically a bunch of XML documents aggregated inside an overall XML envelope, then I would suggest using SAX (or plain string parsing) to break it up into a series of individual documents that you can then process using lxml.etree.
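A rough sketch of that envelope-splitting idea, assuming the individual documents are elements with a known tag directly or indirectly inside the envelope. The class name `RecordSplitter`, the `doc` and `envelope` tags, and the sample data are hypothetical. The standard library's `xml.etree.ElementTree` stands in for lxml.etree here so the example has no dependencies.

```python
import xml.sax
import xml.etree.ElementTree as ET


class RecordSplitter(xml.sax.ContentHandler):
    """Rebuild each <record_tag> subtree as a small Element and hand it to a callback."""

    def __init__(self, record_tag, on_record):
        super().__init__()
        self.record_tag = record_tag
        self.on_record = on_record
        self.builder = None  # set only while we are inside a record
        self.depth = 0       # nesting depth inside the current record

    def startElement(self, name, attrs):
        if self.builder is None and name == self.record_tag:
            self.builder = ET.TreeBuilder()
        if self.builder is not None:
            self.builder.start(name, dict(attrs.items()))
            self.depth += 1

    def characters(self, content):
        if self.builder is not None:
            self.builder.data(content)

    def endElement(self, name):
        if self.builder is None:
            return
        self.builder.end(name)
        self.depth -= 1
        if self.depth == 0:
            # The record is complete: pass it on and forget it.
            self.on_record(self.builder.close())
            self.builder = None


# Demo only: a real callback would process each record and let it go,
# rather than collecting everything in a list.
records = []
xml.sax.parseString(
    b"<envelope><doc id='1'><title>a</title></doc>"
    b"<doc id='2'><title>b</title></doc></envelope>",
    RecordSplitter("doc", records.append),
)
titles = [(r.get("id"), r.findtext("title")) for r in records]
print(titles)
```

For a file on disk, `xml.sax.parse(path, handler)` streams the input in the same way, so only one record is held in memory at a time as long as the callback does not keep references.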

Tags: python

Kratos85
