Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MongoDB performance - having multiple databases

Our application needs 5 collections in a db. When we add clients to our application we would like to maintain separate db for each customer. For example, if we have 500 customers, we would have 500 dbs and 2500 collections (each db has 5 collection). This way we can separate each customer data. My concern is, will it lead to any performance problems?

UPDATE: Also follow this google-group discussion.

like image 844
user10 Avatar asked Jun 04 '13 11:06

user10


People also ask

How many databases can MongoDB handle?

By default, you can run some 12,000 collections in a single instance of MongoDB( that is, if each collection also has 1 index).

Which database in MongoDB DB has multiple?

A database is a container for collections. Each database gets its own set of files on the host file system. A single MongoDB server typically has multiple databases.

Does MongoDB Sharding improve performance?

Sharded clusters in MongoDB are another way to potentially improve performance. Like replication, sharding is a way to distribute large data sets across multiple servers. Using what's called a shard key, developers can copy pieces of data (or “shards”) across multiple servers.

What makes MongoDB fast?

MongoDB is fast because its web scale! Its a fun video and well worth everyone watching, but it does answer your question - that most of the noSQL engines like MongoDB are not robust and not resilient to crashes and other outages. This security is what they sacrifice to gain speed.


1 Answers

Our application needs 5 collections in a db. When we add clients to our application we would like to maintain separate db for each customer. For example, if we have 500 customers, we would have 500 dbs and 2500 collections (each db has 5 collection). This way we can separate each customer data.

That's a great idea. On top of logical separation this will provide for you, you will also be able to use database level security in MongoDB to help prevent inadvertent access to other customers' data.

My concern is, will it lead to any performance problems?

No, and in fact it will help as with database level lock extremely heavy lock contention for one customer (if that's possible in your scenario) would not affect performance for another customer (it still might if they are competing for the same I/O bandwidth but if you use --directoryperdb option then you have the ability to place those DBs on separate physical devices.

Sharding will also allow easy scaling as you won't even have to partition any collections - you can just round-robin databases across multiple shards to allow the load to be distributed to separate clusters (if and when you reach that level).

Contrary to the claim in the other answer, TTLMonitor thread does NOT pull documents into RAM unless they are being deleted (and added to the free list). They work off of TTL indexes both to tell if any documents are to be expired as well as to located the document directly.

I would strongly recommend against the one database many collections solution as that doesn't allow you to either partition the load, nor provide security, nor is it any easier to handle on the application side.

like image 181
Asya Kamsky Avatar answered Sep 24 '22 04:09

Asya Kamsky