
CouchDb - MongoDb and NoSQL Databases Comparison (working with XML Documents)

I am working on a project using Java and Spring 3, and I have a new task. I will receive XML files, convert them into objects, and then store those objects in a database.

My main task is to evaluate NoSQL databases; CouchDB and MongoDB are the ones I have been asked to look into. I will run searches on those objects in the database (one of the indexed fields will be a date, and I will run date-range queries on it). Performance is very important to me, and I will be working with a huge amount of data, which is why I am looking at NoSQL databases.

What do you suggest for my scenario? What are the pros and cons of each, which one should I choose, and why?

I searched and saw that CouchDB uses a REST API while MongoDB uses native drivers, which is a performance plus for MongoDB according to this page: http://www.mongodb.org/display/DOCS/Comparing+Mongo+DB+and+Couch+DB

However, CouchDB uses replication as a way to scale (is that a performance plus?).

I also noticed that there are BaseX and eXist. Given my needs, what do you suggest? Has anyone worked with them?

PS: The XML files I receive will be like logs. They will not change, and I won't modify the data in them.

kamaci Avatar asked Nov 05 '22 13:11


2 Answers

This is a pretty big question, but I will do my best to tackle it. The company I work for was moving from developing our applications on MySQL to NoSQL, and I was the lead on the first NoSQL project, so I had to decide which NoSQL database to use. I was choosing between MongoDB, CouchDB and Cassandra.

One important factor was how easy it would be to write baseline functions for working with the database, so that you don't have to understand what is going on underneath but can still execute queries and so on. The issue with Cassandra was that its API was very low level; writing a solid high-level interface would have taken time we did not have.

The issue with CouchDB was the REST service. Since we were already connecting to our in-house API over REST, it would have meant a REST service on top of a REST service. REST generally goes over HTTP, and HTTP carries a fair amount of overhead in exchange for being as easy to work with as it is. That overhead adds time to loading information.

So we chose MongoDB for that reason, among many others. Also, since MongoDB is accessed through a driver, the driver is developed to work with your programming language, which is great if your language is supported and bad if it is not. Java is supported by MongoDB, so you are fine.

I would recommend converting the XML files into objects and then storing the objects in MongoDB, so that each XML file becomes a MongoDB document with embedded documents. The great thing about MongoDB is that you can search embedded documents and you can index them. So enjoy that.
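To make the "convert the XML into objects with embedded documents" step a bit more concrete, here is a minimal sketch using only the JDK's DOM parser. The log format, the field names (`timestamp`, `source`, `message`) and the class name `XmlLogToDocument` are all made up for illustration; they are not from the question. The nested `Map` is only a stand-in for whatever document type your driver uses, and `isBetween` is an in-memory version of the date-between check, not a database query.

```java
import java.io.StringReader;
import java.time.Instant;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical log format and class name, used only for illustration.
class XmlLogToDocument {

    // Converts one XML log entry into a nested map (a document-like shape).
    // Assumes the entry has a <timestamp> child in ISO-8601 UTC form.
    static Map<String, Object> toDocument(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Element root = builder.parse(new InputSource(new StringReader(xml))).getDocumentElement();
        Map<String, Object> doc = elementToMap(root);
        // Keep the timestamp as a date type rather than a string,
        // so that range comparisons work on time and not on text.
        doc.put("timestamp", Instant.parse((String) doc.get("timestamp")));
        return doc;
    }

    // Elements that contain child elements become embedded maps;
    // leaf elements become trimmed strings.
    private static Map<String, Object> elementToMap(Element element) {
        Map<String, Object> map = new LinkedHashMap<>();
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() != Node.ELEMENT_NODE) {
                continue; // skip whitespace and other non-element nodes
            }
            Element e = (Element) child;
            map.put(e.getTagName(),
                    hasElementChildren(e) ? elementToMap(e) : e.getTextContent().trim());
        }
        return map;
    }

    private static boolean hasElementChildren(Element element) {
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            if (children.item(i).getNodeType() == Node.ELEMENT_NODE) {
                return true;
            }
        }
        return false;
    }

    // In-memory "date between" check: from <= timestamp < to.
    static boolean isBetween(Map<String, Object> doc, Instant from, Instant to) {
        Instant t = (Instant) doc.get("timestamp");
        return !t.isBefore(from) && t.isBefore(to);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<log><timestamp>2011-05-01T10:15:30Z</timestamp>"
                + "<source><host>web01</host><app>shop</app></source>"
                + "<message>login ok</message></log>";
        Map<String, Object> doc = toDocument(xml);
        System.out.println(doc);
        System.out.println(isBetween(doc,
                Instant.parse("2011-05-01T00:00:00Z"),
                Instant.parse("2011-05-02T00:00:00Z")));
    }
}
```

Whichever database you pick, the point of parsing the date up front is that the indexed field should hold a real date value; a date stored as free-form text generally cannot be range-queried reliably.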

WojonsTech Avatar answered Nov 09 '22 05:11


I have only used MongoDB in a high-data-volume, low-load internal application, so I cannot really offer first hand advice for your choice.

The MongoDB people, however, have a comparison with CouchDB here. There are also quite a few more independent opinions (1, 2).

You should also consider the quality of the available database drivers for your environment. The Java MongoDB driver is quite stable, in my experience, but it seems to me that it still incurs more processing overhead than it should. I have no idea about any of the CouchDB drivers.

Do you have any other requirements apart from the ability to store large amounts of data? Do you need replication or sharding?

PS: How are you storing the XML files anyway? XML files do not map into JSON (which is what e.g. MongoDB uses) perfectly - unless you store the whole XML text in a single field.
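To illustrate the mapping problem, here is a rough sketch (JDK only) of one possible XML-to-map conversion. XML has attributes and repeated sibling elements, which a JSON-style document has no direct counterpart for, so the converter has to pick conventions. The `@` prefix for attributes, the `#text` key and the class name `XmlMappingChoices` are my own assumptions, not a standard, and mixed content (text interleaved with child elements) is simply dropped here.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical class; the "@" and "#text" conventions are assumptions.
class XmlMappingChoices {

    static Object parse(String xml) throws Exception {
        Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml))).getDocumentElement();
        return convert(root);
    }

    // Returns a String for a plain leaf element, otherwise a Map.
    @SuppressWarnings("unchecked")
    static Object convert(Element element) {
        Map<String, Object> map = new LinkedHashMap<>();

        // Choice 1: attributes become keys with an "@" prefix.
        NamedNodeMap attrs = element.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            Node a = attrs.item(i);
            map.put("@" + a.getNodeName(), a.getNodeValue());
        }

        boolean sawChildElement = false;
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() != Node.ELEMENT_NODE) {
                continue; // text between elements (mixed content) is lost here
            }
            sawChildElement = true;
            Element e = (Element) child;
            Object value = convert(e);
            Object existing = map.get(e.getTagName());
            if (existing == null) {
                map.put(e.getTagName(), value);
            } else if (existing instanceof List) {
                // Choice 2: repeated siblings are collected into a list.
                ((List<Object>) existing).add(value);
            } else {
                List<Object> list = new ArrayList<>();
                list.add(existing);
                list.add(value);
                map.put(e.getTagName(), list);
            }
        }

        if (!sawChildElement) {
            String text = element.getTextContent().trim();
            if (map.isEmpty()) {
                return text; // leaf without attributes: just the text
            }
            if (!text.isEmpty()) {
                // Choice 3: text next to attributes goes under "#text".
                map.put("#text", text);
            }
        }
        return map;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parse("<entry level=\"warn\"><tag>a</tag><tag>b</tag>"
                + "<msg lang=\"en\">disk full</msg></entry>"));
    }
}
```

Note that with these conventions an entry with one `<tag>` yields a string while an entry with two yields a list, so the same field can change type between documents. That kind of inconsistency is exactly what you need to decide on before indexing and querying the data.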

PS2: Are you sure that you need a document-based database? If you are only going to perform searches on a few fields that are known beforehand, a relational DB might be easier to handle. Document-based DBs start making sense only when you don't have a predefined schema for your data or when you need to store more complex object hierarchies.

PS3: May I ask why huge data implies NoSQL to you? You can store insane amounts of data on any modern relational database (as long as you have the hardware, of course).

EDIT:

A couple of related SO questions:

  • MongoDB versus CouchDB... And any other "major players"
  • NoSQL - MongoDB vs CouchDB
  • Which of CouchDB or MongoDB suits my needs?

(...and about a thousand more)

Maybe also these:

  • When should I use a NoSQL database instead of a relational database? Is it okay to use both on the same site?
  • Using a NoSQL database over MySQL
thkala Avatar answered Nov 09 '22 05:11