I'm trying to parse a large file (> 2 GB) of structured markup data, and there is not enough memory to load it. Which XML parsing class is best suited to this situation? More details would be appreciated.
There are two ways to parse XML using the 'ElementTree' module. The first is the parse() function and the second is the fromstring() function. parse() reads an XML document supplied as a file (a filename or file object), whereas fromstring() parses XML supplied as a string, for example a triple-quoted literal. Both build the complete element tree in memory, so neither is suitable for a file larger than the available RAM.
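As a rough illustration of the difference between the two calls, here is a minimal sketch. The sample XML and the variable names are made up for the example, and an in-memory BytesIO stands in for a real file on disk.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data, not from the original question.
xml_text = "<library><book title='A'/><book title='B'/></library>"

# fromstring() takes the XML as a string and returns the root Element directly.
root_from_string = ET.fromstring(xml_text)

# parse() takes a filename or file object and returns an ElementTree;
# getroot() then gives the root Element.
tree = ET.parse(io.BytesIO(xml_text.encode("utf-8")))
root_from_file = tree.getroot()

# Either way, the whole tree is now held in memory.
titles = [book.get("title") for book in root_from_file.iter("book")]
print(titles)
```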
Python enables you to parse and modify XML documents. With a DOM-style parser such as minidom, the entire XML document has to be held in memory, which makes it a poor fit for multi-gigabyte files. In this tutorial, we will see how we can use the XML minidom class in Python to load and parse XML files.
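For reference, loading a document with minidom might look like the sketch below. The sample XML is invented for the example; for a file on disk, minidom.parse() with a filename would be used instead of parseString().

```python
from xml.dom import minidom

# Hypothetical sample data; parseString() builds the full DOM in memory.
doc = minidom.parseString("<library><book title='A'/><book title='B'/></library>")

# Collect every <book> element and read its "title" attribute.
books = doc.getElementsByTagName("book")
titles = [book.getAttribute("title") for book in books]
print(titles)
```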
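Check out the iterparse() function. It reads the document incrementally and reports each element as it is parsed, so you can process an element and then discard it instead of holding the whole tree in memory. A description of how you can use it to parse very large documents can be found here.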
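A minimal sketch of that approach with xml.etree.ElementTree.iterparse() follows. The count_records helper, the "record" tag, and the sample data are assumptions for illustration, and the sketch assumes the records are direct children of the root element. For a real file you would pass its path or an open binary file object instead of the BytesIO.

```python
import io
import xml.etree.ElementTree as ET

def count_records(source, tag):
    """Count <tag> elements without keeping the whole tree in memory."""
    count = 0
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == tag:
            count += 1       # process the finished element here
            root.clear()     # drop already-processed children from the root
    return count

# Hypothetical sample data standing in for a multi-gigabyte file.
sample = io.BytesIO(b"<records><record id='1'/><record id='2'/><record id='3'/></records>")
total = count_records(sample, "record")
print(total)
```

Clearing the root after each record is what keeps memory use roughly flat; calling only elem.clear() would still leave one empty element per record attached to the root.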