Storing large XML in MongoDB

Tags: xml, mongodb

I have a fairly large XML file (>10 MB, with 40+ elements). We currently store such XML in an Oracle database and use XQuery to query and retrieve parts of it. This is slow and takes many database calls, so we are exploring MongoDB as a place to store and query the XML.

I just converted the XML to JSON and loaded it into a MongoDB collection. The load was very fast, and the XML nodes are stored as nested documents. But when I query (using find) for an innermost element, it always returns the whole document, including nodes whose element values do not match. I expect to get back only the few nodes that match the given value.

Is there a recommended way to store such large XML files in MongoDB? And how can I retrieve only the inner nodes that have exactly the values specified in the query? Thanks in advance.

Venkiram asked Oct 10 '11 10:10



2 Answers

Have you thought about trying an up-to-date XML database, such as BaseX (http://basex.org)? It might give you much better results, particularly since you already use XQuery.

Hannes Bauer answered Sep 26 '22 18:09


I had the same problem. In my case the top-level node in each XML file always contained a huge list of smaller nodes, so I ended up storing those items as separate documents instead of storing the whole file as one. To do it, I wrote my own xml-to-json command-line tool. I've used it to convert 10GB of XML data into JSON, in a format that mongoimport can eat.
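To illustrate the idea (this is not the xml-to-json tool itself, just a minimal Python sketch of the same approach), the snippet below streams an XML file and writes one JSON document per repeated child element, one per line, which is the default input shape mongoimport accepts. The function names, the `#text` key, and the `book`/`catalog` sample are all made up for illustration, and the XML-to-dict mapping is deliberately simplistic.

```python
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical sample input; a real file would be far larger.
SAMPLE = """<catalog>
  <book id="b1"><title>XQuery Basics</title><price>30</price></book>
  <book id="b2"><title>Mongo Notes</title><price>25</price></book>
</catalog>"""


def element_to_dict(elem):
    """Map an element to attributes + children; a bare leaf becomes its text."""
    node = dict(elem.attrib)
    for child in elem:
        value = element_to_dict(child)
        if child.tag in node:
            # Repeated tags are collected into a list.
            if not isinstance(node[child.tag], list):
                node[child.tag] = [node[child.tag]]
            node[child.tag].append(value)
        else:
            node[child.tag] = value
    text = (elem.text or "").strip()
    if not node:
        return text
    if text:
        node["#text"] = text  # assumed key name for mixed content
    return node


def split_items(source, item_tag):
    """Yield one dict per <item_tag> element while parsing incrementally."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == item_tag:
            yield element_to_dict(elem)
            elem.clear()  # drop the children of the item already emitted


def write_ndjson(source, item_tag, out):
    """Write each item as one JSON line; return the number of items written."""
    count = 0
    for doc in split_items(source, item_tag):
        out.write(json.dumps(doc) + "\n")
        count += 1
    return count
```

Once each item is its own document, a plain find on one of its fields returns only the matching items rather than the whole original file. The output can be loaded with something like `mongoimport --db mydb --collection items --file items.json`, where the database, collection, and file names are placeholders.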

sinelaw answered Sep 24 '22 18:09