MongoDB - are reliability issues still significant?

I have a couple of SQLite databases (about 15 GB in total) with roughly 1 million rows between them, so not super big. I was looking at MongoDB, and it looks pretty easy to work with, especially if I want to try some basic natural language processing on the documents that make up the databases.

I've never worked with Mongo before, so I would have to learn it from scratch (I will be working in Python). After googling around a bit, I came across a number of somewhat horrific stories about MongoDB's reliability. Is this still a major problem? In a crunch I will of course retain the SQLite backups, but I'd rather not have to reconstruct my Mongo databases constantly.

I'm just wondering what sort of data corruption issues people have actually faced recently with Mongo. Is this a big concern?

Thanks!

asked Aug 15 '10 13:08 by malangi



2 Answers

As others have said, MongoDB does not have single-server durability right now. Fortunately, it's dead easy to set up multi-node replication. You can even set up a second machine in another data center and have data automatically replicated to it live!

If a write must succeed, you can make Mongo not return from an insert/update until that data has been replicated to n slaves. This ensures that you have at least n copies of the data. Replica sets allow you to add and remove nodes from your cluster on the fly without any significant work; just add a new node and it'll automatically sync a copy of the data. Remove a node and the cluster rebalances itself. It is very much designed to be used across multiple machines, with multiple nodes acting in parallel; this is its preferred default setup. Compare that with something like MySQL, which expects one giant machine to do its work on, and which you can then pair slaves against when you need to scale out. It's a different approach to data storage and scaling, but a very comfortable one if you take the time to understand how its assumptions differ and how to build an architecture that capitalizes on its strengths.

answered Sep 30 '22 13:09 by Chris Heald


Yes, durability is a big problem in Mongo. You have to use replica sets for durability (you need at least 2 machines); otherwise you can lose up to the last minute of writes on a power failure, for example. There is no single-server durability in Mongo, but as far as I know it is planned for 1.7-1.8. After a crash you have to repair the database manually, and the repair operation may take hours if your data is large. There are no transactions or ACID guarantees, so it's not suitable for an e-commerce or banking application.

You should not use development versions of Mongo (odd version numbers like 1.3.x, 1.5.x and 1.7.x are development versions), and you should prefer a 64-bit operating system. If you dig into the disaster articles about Mongo on the web, in most cases the problem traces back to one of these two things.

CouchDB, Cassandra and PostgreSQL all have strong durability (fsync is 10 milliseconds by default in Cassandra and PostgreSQL), so they all have single-server durability.

If you need dead easy scalability, fault tolerance and load balancing, Cassandra is the best, but it has poor query options. Failing nodes may go away and come back after a period of time without a problem; the system repairs itself automatically.

EDIT: Mongo 1.8 came with journaling (which allows single-server durability), but it's not the default setting. Also take a look at this: http://news.ycombinator.com/item?id=2684423

Regards,

Serdar Irmak

answered Sep 30 '22 15:09 by sirmak