Does anyone know of a memory-efficient way to generate very large XML files (e.g. 100-500 MiB) in Python?
I've been using lxml, but memory usage is through the roof.
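One option is lxml's own incremental serialization API, etree.xmlfile (available since lxml 3.1), which writes each element to disk as it is produced instead of keeping the whole tree in memory. A minimal sketch, with placeholder element names:

from lxml import etree

# etree.xmlfile streams serialized output straight to the file
with etree.xmlfile('output.xml', encoding='utf-8') as xf:
    xf.write_declaration()
    with xf.element('doc'):  # opens <doc> here, closes it on exit
        for i in range(10000000):
            el = etree.Element('p')
            el.text = str(i)
            xf.write(el)  # serialize this element now and discard it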
Perhaps you could use a templating engine instead of generating/building the XML yourself?
Genshi, for example, is XML-based and supports streaming output. A very basic example:
from genshi.template import MarkupTemplate

tpl_xml = '''
<doc xmlns:py="http://genshi.edgewall.org/">
  <p py:for="i in data">${i}</p>
</doc>
'''

tpl = MarkupTemplate(tpl_xml)
stream = tpl.generate(data=range(10000000))  # xrange(...) on Python 2

# Genshi renders to bytes (UTF-8 by default), so open the file in binary mode
with open('output.xml', 'wb') as f:
    stream.render(out=f)
It might take a while, but memory usage remains low.
The same example for the Mako templating engine (not "natively" XML), but a lot faster:
from mako.template import Template
from mako.runtime import Context

tpl_xml = '''
<doc>
% for i in data:
  <p>${i}</p>
% endfor
</doc>
'''

tpl = Template(tpl_xml)

with open('output.xml', 'w') as f:
    ctx = Context(f, data=range(10000000))  # xrange(...) on Python 2
    tpl.render_context(ctx)
The Mako example ran on my laptop for about 20 seconds, producing an (admittedly very simple) 151 MB XML file with no memory problems at all (according to the Windows task manager, usage remained constant at about 10 MB).
Depending on your needs, this might be a friendlier and faster way of generating XML than using SAX, etc. Check out the docs to see what you can do with these engines (there are others as well; I just picked these two as examples).
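For comparison, here is a minimal sketch of the SAX-style approach mentioned above, using the standard library's xml.sax.saxutils.XMLGenerator (element names are placeholders matching the template examples):

from xml.sax.saxutils import XMLGenerator

with open('output.xml', 'w', encoding='utf-8') as f:
    gen = XMLGenerator(f, encoding='utf-8')
    gen.startDocument()           # writes the XML declaration
    gen.startElement('doc', {})
    for i in range(10000000):
        gen.startElement('p', {})
        gen.characters(str(i))    # text content is escaped automatically
        gen.endElement('p')
    gen.endElement('doc')
    gen.endDocument()

Memory use stays flat here as well, since each event is written to the file as it is emitted.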
The only sane way to generate an XML file that large is line by line, which means printing while running a state machine, and lots of testing.
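A minimal sketch of that line-by-line approach, assuming a hypothetical records() generator as the data source and the same placeholder element names as above:

from xml.sax.saxutils import escape

def records():
    # hypothetical data source; substitute whatever produces your values
    for i in range(10000000):
        yield str(i)

with open('output.xml', 'w', encoding='utf-8') as f:
    f.write('<?xml version="1.0" encoding="UTF-8"?>\n')
    f.write('<doc>\n')
    for value in records():
        # escape() handles &, < and > so the output stays well-formed
        f.write('  <p>%s</p>\n' % escape(value))
    f.write('</doc>\n')

With a single record type no real state machine is needed; with several nested record types you would track which element is currently open and close it before starting the next.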