
Fastest possible XML handling in Delphi for very large documents

I need recommendations on what to use in Delphi (I use Delphi 2009) to handle very large XML files (e.g. 100 MB) as fast as possible.

I need to input the XML, access and update the data in it from my program, and then export the modified XML again.

Hopefully the input and output could be done within a few seconds on a fast Windows machine.


Clarification: I expect I will need to use DOM, because access to the data structure for developing reports and updating the data is important, and I need this functionality to be very fast.

The input is done only once, when loading the file, and the output is done only when saving the file, usually just once upon exit. These should be quick as well, but they are not as important as the in-memory data access and update.

My understanding is that 3rd party parsers only help with input and output, but not on using and modifying the data once loaded into memory. Or am I mistaken on this?

asked Nov 05 '08 00:11 by lkessler



1 Answer

If I understood your question correctly, you have a known data structure and you are modifying the data, not the XML structure of the file.

Under these conditions, and if performance is crucial, you could try direct text manipulation and skip XML parsing entirely.

Read from a stream, use a fast text search algorithm such as Boyer-Moore to find the places where you need to modify data, make your modifications, and write the result to another stream.

This would be one-pass, with no XML parsing and no in-memory XML tree building.
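
As a rough illustration of the idea, here is a minimal sketch rather than a tuned implementation. It uses Boyer-Moore-Horspool, a simplified Boyer-Moore variant that keeps only the bad-character table. `ReplaceInStream` is a hypothetical helper name, the element names are made up, and the sketch assumes the file is UTF-8, that the byte sequence being searched for cannot occur anywhere it should not be changed, and that the whole input fits in a single buffer (a chunked reader would need to handle matches that straddle chunk boundaries).

```pascal
program PatchXmlText;
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Hypothetical helper: copies the rest of Src to Dst, replacing every
// occurrence of the byte sequence OldBytes with NewBytes.
// Search is Boyer-Moore-Horspool (bad-character shifts only).
procedure ReplaceInStream(Src, Dst: TStream; const OldBytes, NewBytes: TBytes);
var
  Buf: TBytes;
  Shift: array[Byte] of Integer;
  I, J, M, N, P, Start: Integer;
begin
  M := Length(OldBytes);
  N := Integer(Src.Size - Src.Position);

  // Assumption: the input fits in memory as one raw byte buffer.
  SetLength(Buf, N);
  if N > 0 then
    Src.ReadBuffer(Buf[0], N);

  Start := 0; // first byte not yet written to Dst
  if M > 0 then
  begin
    // Shift table: how far the window may move, keyed by the byte that
    // is aligned with the last position of the pattern.
    for I := 0 to 255 do
      Shift[I] := M;
    for I := 0 to M - 2 do
      Shift[OldBytes[I]] := M - 1 - I;

    P := 0;
    while P <= N - M do
    begin
      // Compare the window against the pattern from right to left.
      J := M - 1;
      while (J >= 0) and (Buf[P + J] = OldBytes[J]) do
        Dec(J);

      if J < 0 then
      begin
        // Match: flush the untouched bytes, then the replacement.
        if P > Start then
          Dst.WriteBuffer(Buf[Start], P - Start);
        if Length(NewBytes) > 0 then
          Dst.WriteBuffer(NewBytes[0], Length(NewBytes));
        Inc(P, M);
        Start := P;
      end
      else
        Inc(P, Shift[Buf[P + M - 1]]);
    end;
  end;

  // Flush whatever follows the last match.
  if N > Start then
    Dst.WriteBuffer(Buf[Start], N - Start);
end;

var
  Src, Dst: TMemoryStream;
  InBytes, OutBytes: TBytes;
begin
  // Made-up sample data standing in for a large XML file.
  InBytes := TEncoding.UTF8.GetBytes(
    '<people><name>Jon</name><name>Ann</name></people>');
  Src := TMemoryStream.Create;
  Dst := TMemoryStream.Create;
  try
    Src.WriteBuffer(InBytes[0], Length(InBytes));
    Src.Position := 0;

    ReplaceInStream(Src, Dst,
      TEncoding.UTF8.GetBytes('<name>Jon</name>'),
      TEncoding.UTF8.GetBytes('<name>John</name>'));

    SetLength(OutBytes, Integer(Dst.Size));
    Dst.Position := 0;
    Dst.ReadBuffer(OutBytes[0], Length(OutBytes));
    // prints <people><name>John</name><name>Ann</name></people>
    Writeln(TEncoding.UTF8.GetString(OutBytes));
  finally
    Dst.Free;
    Src.Free;
  end;
end.
```

For real files, the two memory streams could be swapped for `TFileStream` instances, one opened for reading and one created for writing. Working on raw bytes avoids converting 100 MB of text to `UnicodeString` in Delphi 2009, but it also means the search pattern must be encoded exactly as it appears in the file.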

answered Oct 06 '22 01:10 by zendar