
Most efficient data structure to store an XML tree in C++

Tags: c++, xml, qt

I'm doing some work with XML in C++, and I would like to know what the best data structure to store XML data is. Please don't just tell me what you've heard of in the past; I'd like to know what the most efficient structure is. I would like to be able to store any arbitrary XML tree (assuming it is valid), with minimal memory overhead and lookup time.

My initial thought was a hash, but I couldn't figure out how to handle multiple children of the same tag, as well as how attributes would be handled.

Qt solutions are acceptable, but I'm more concerned with the overall structure than the specific library. Thanks for your input.

Zach Rattner asked Apr 17 '11 04:04




1 Answer

The most efficient structure would be a set of classes derived from the DTD or the Schema that defines the particular XML instances you intend to process. (Surely you aren't going to process arbitrary XML?) Tags are represented by classes. Single children can be represented by fields. Children with min...max arity can be represented by a field containing an array. Children with indefinite arity can be represented by a dynamically allocated array. Attributes and children can be stored as fields, often with an inferred data type (if an attribute represents a number, why store it as a string?). Using this approach, you can often navigate to a particular place in an XML document using native C++ access paths, e.g., root->tag1.itemlist[1]->description.
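As a rough illustration of the idea above, here is a minimal hand-written sketch of what such classes might look like. The schema is hypothetical: the names Root, Tag1, Item, itemlist, description, id, price, and build_example are assumptions chosen only to match the example access path, and real generated code from a schema tool would likely look different. It assumes C++17 (for std::optional).

```cpp
#include <cassert>
#include <memory>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical schema, for illustration only:
//   <root><tag1><item id="..."><description>..</description><price>..</price>?</item>*</tag1></root>

// One class per tag. Attributes and single children become typed fields.
struct Item {
    int id = 0;                   // attribute id="...", stored as a number rather than a string
    std::string description;      // exactly one <description> child
    std::optional<double> price;  // optional <price> child (arity 0..1)
};

struct Tag1 {
    // <item> children with indefinite arity: a dynamically allocated array.
    std::vector<std::unique_ptr<Item>> itemlist;
};

struct Root {
    Tag1 tag1;  // exactly one <tag1> child, so a plain field
};

// Builds a small in-memory document by hand; a real program would fill
// these fields from a parser instead.
std::unique_ptr<Root> build_example() {
    auto root = std::make_unique<Root>();

    auto first = std::make_unique<Item>();
    first->id = 1;
    first->description = "first item";
    root->tag1.itemlist.push_back(std::move(first));

    auto second = std::make_unique<Item>();
    second->id = 2;
    second->description = "second item";
    second->price = 9.5;
    root->tag1.itemlist.push_back(std::move(second));

    return root;
}
```

With these definitions, the expression root->tag1.itemlist[1]->description is an ordinary member access with no string lookup or hashing involved, which is where the memory and lookup-time savings over a generic tree of string-keyed nodes would come from.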

All of this can be generated automatically from the Schema or the DTD. There are tools to do this; Altova offers some. I have no specific experience with these tools (although I have built similar ones for Java and COBOL).

Ira Baxter answered Oct 09 '22 10:10