
What is the fastest XML Parser available for Delphi?

We have reasonably large XML strings which we currently parse using MSXML2.

I have just tried MSXML6, hoping for a speed improvement, but saw no gain at all!

We currently create a lot of DOM documents, and I guess there may be some overhead in constantly interacting with the MSXML2/6 DLL.

Does anyone know of a better/faster XML component for Delphi?

If anyone can suggest a faster alternative, we would look to integrate it. That would be a lot of work, though, so ideally its structure would not be too different from that of MSXML.

We are using Delphi 2010.

Paul

asked Feb 28 '12 19:02 by Paul



2 Answers

Some time ago I had to serialize a record to XML format; for example:

    TTest = record
      a : integer;
      b : real;
    end;

to

    <Data>
      <a type="tkInteger">value</a>
      <b type="tkFloat">value</b>
    </Data>

I used RTTI to recursively navigate through the record fields and store their values to XML. I tried a few XML parsers. I didn't need a DOM model to create the XML, but I needed one to load it back.
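The RTTI walk described above might look roughly like the following. This is only a minimal sketch, not the code from the original test: `WriteRecord` is a hypothetical helper, it assumes the extended RTTI of Delphi 2010 or later (units `Rtti` and `TypInfo`), it writes the XML by hand into a `TStringStream`, and it does not escape special characters in values.

```delphi
program RttiRecordToXml;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes, TypInfo, Rtti;

type
  TTest = record
    a: Integer;
    b: Real;
  end;

// Hypothetical helper (not from the original post): writes the record at
// Data, described by Info, as <Tag>...</Tag>, one child element per field.
// Nested records are handled by recursion. Values are not XML-escaped.
procedure WriteRecord(Data: Pointer; Info: PTypeInfo; const Tag: string;
  Output: TStringStream);
var
  Ctx: TRttiContext;
  Field: TRttiField;
  KindName: string;
begin
  Ctx := TRttiContext.Create;
  try
    Output.WriteString('<' + Tag + '>');
    for Field in Ctx.GetType(Info).GetFields do
    begin
      if Field.FieldType = nil then
        Continue; // field type carries no RTTI; skip it
      if Field.FieldType.TypeKind = tkRecord then
        // Recurse into the nested record, located at the field's offset.
        WriteRecord(PByte(Data) + Field.Offset, Field.FieldType.Handle,
          Field.Name, Output)
      else
      begin
        // Yields names such as 'tkInteger' or 'tkFloat'.
        KindName := GetEnumName(TypeInfo(TTypeKind),
          Ord(Field.FieldType.TypeKind));
        Output.WriteString(Format('<%s type="%s">%s</%s>',
          [Field.Name, KindName, Field.GetValue(Data).ToString, Field.Name]));
      end;
    end;
    Output.WriteString('</' + Tag + '>');
  finally
    Ctx.Free;
  end;
end;

var
  T: TTest;
  S: TStringStream;
begin
  T.a := 42;
  T.b := 1.5;
  S := TStringStream.Create('');
  try
    WriteRecord(@T, TypeInfo(TTest), 'Data', S);
    Writeln(S.DataString);
  finally
    S.Free;
  end;
end.
```

Float values are formatted with the current locale here, so the decimal separator may vary; a real serializer would probably pin the format settings so that files load back on any machine.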

The XML contained about 310k nodes (10-15 MB). The results are presented in the table below; there are 6 columns with times in seconds:
1 - time for creating nodes and writing values
2 - SaveToFile()
3 = 1 + 2
4 - LoadFromFile()
5 - time for navigating through nodes and reading values
6 = 4 + 5
[Image: benchmark results table]

MSXML/Xerces/ADOM are different vendors for TXMLDocument (DOMVendor).
JanXML doesn't work with Unicode; I fixed some errors and saved the XML, but loading causes an AV (or a stack overflow, I don't remember).
"manual" means manually writing XML using TStringStream.

I used Delphi 2010 on Win7 x32, with a Q8200 CPU at 2.3 GHz and 4 GB of RAM.

Update: you can download the source code for this test (record serialization to XML using RTTI) here: http://blog.karelia.pro/teran/files/2012/03/XMLTest.zip. All parsers (Omni, Native, Jan) are included (the node count in the XML is now about 270k). Sorry, there are no comments in the code.

answered Sep 23 '22 12:09 by teran


I know that it's an old question, but people might find it interesting:

I wrote a new XML library for Delphi (OXml): http://www.kluug.net/oxml.php

It features direct XML handling (read+write), SAX parser, DOM and a sequential DOM parser. One of the benefits is that OXml supports Delphi 6-Delphi XE5, FPC/Lazarus and C++Builder on all platforms (Win, MacOSX, Linux, iOS, Android).

OXml DOM is record/pointer based and offers better performance than any other XML library:

The read test returns the time the parser needs to read a custom XML DOM from a file (column "load") and to write node values to a constant dummy function (column "navigate"). The file is encoded in UTF-8 and its size is about 5.6 MB.

[Image: XML parse comparison]

The write test returns the time the parser needs to create a DOM (column "create") and write this DOM to a file (column "save"). The file is encoded in UTF-8 and its size is about 11 MB.

[Image: XML write comparison]

+ The poor writing performance of OmniXML (original) came from the fact that OmniXML didn't buffer its writes, so writing to a TFileStream was very slow. I updated OmniXML and added buffering support. You can get the latest OmniXML code from the SVN.
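To illustrate why buffering matters when writing to a TFileStream, here is a minimal sketch of the general idea. It is not the actual OmniXML change: `WriteNodesBuffered`, `BufferLimit` and the dummy `<n/>` elements are made up for the example. Small pieces are collected in a TMemoryStream and the file is touched only when the buffer fills.

```delphi
program BufferedWriteSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

const
  BufferLimit = 64 * 1024; // flush threshold in bytes; an arbitrary choice

// Hypothetical helper: writes NodeCount dummy elements to FileName,
// collecting them in memory and writing to the file in large chunks.
procedure WriteNodesBuffered(const FileName: string; NodeCount: Integer);
var
  Buffer: TMemoryStream;
  FileOut: TFileStream;
  Chunk: UTF8String;
  I: Integer;

  // Writes whatever is in the buffer to the file and rewinds the buffer.
  procedure Flush;
  begin
    if Buffer.Position > 0 then
      FileOut.WriteBuffer(Buffer.Memory^, Integer(Buffer.Position));
    Buffer.Position := 0;
  end;

  // Appends S (as UTF-8) to the buffer, flushing once the limit is reached.
  procedure Put(const S: string);
  begin
    Chunk := UTF8Encode(S);
    if Chunk <> '' then
      Buffer.WriteBuffer(Chunk[1], Length(Chunk));
    if Buffer.Position >= BufferLimit then
      Flush;
  end;

begin
  Buffer := TMemoryStream.Create;
  try
    FileOut := TFileStream.Create(FileName, fmCreate);
    try
      Put('<root>');
      for I := 1 to NodeCount do
        Put(Format('<n id="%d"/>', [I]));
      Put('</root>');
      Flush; // write the remaining tail of the buffer
    finally
      FileOut.Free;
    end;
  finally
    Buffer.Free;
  end;
end;

begin
  WriteNodesBuffered('buffered.xml', 100000);
  Writeln('done');
end.
```

Without the buffer, each `Put` would turn into a separate write call on the file stream, which is the pattern that made the original writer slow.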

answered Sep 24 '22 12:09 by oxo