
What does it mean to fit "working set" into RAM for MongoDB?

Tags:

mongodb

MongoDB is fast, but only when your working set or indexes fit into RAM. So if my server has 16 GB of RAM, does that mean the combined size of all my collections needs to be less than or equal to 16 GB? How does one say, "OK, this is my working set; the rest can be archived"?

sdot257 asked Jun 23 '11 11:06




2 Answers

"Working set" is basically the amount of data AND indexes that will be active/in use by your system.

For example, suppose you have one year's worth of data. For simplicity, each month accounts for 1GB of data, giving 12GB in total, and each month's data needs 1GB of indexes, again totalling 12GB for the year.

If you are always accessing the last 12 months' worth of data, then your working set is: 12GB (data) + 12GB (indexes) = 24GB.

However, if you only access the last 3 months' worth of data, then your working set is: 3GB (data) + 3GB (indexes) = 6GB. In this scenario, if you had 8GB of RAM and then started regularly accessing the past 6 months' worth of data, your working set (6GB data + 6GB indexes = 12GB) would exceed your available RAM and performance would suffer.

But generally, if you have enough RAM to cover the amount of data/indexes you expect to be frequently accessing then you will be fine.
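The arithmetic in this example can be sketched in a few lines of Python. This is only an illustration of the ballpark calculation above: the function names are made up for this sketch, and the 1GB-per-month defaults are the simplifying assumption from the example, not something MongoDB reports.

```python
def working_set_gb(months_accessed, data_gb_per_month=1.0, index_gb_per_month=1.0):
    """Ballpark working set: data plus indexes for the months actually in use."""
    return months_accessed * (data_gb_per_month + index_gb_per_month)


def fits_in_ram(working_set, ram_gb):
    """True if the estimated working set is no larger than the available RAM."""
    return working_set <= ram_gb


# The three scenarios from the answer, against a server with 8GB of RAM.
for months in (3, 6, 12):
    estimate = working_set_gb(months)
    print(months, "months:", estimate, "GB, fits in 8GB:", fits_in_ram(estimate, 8))
```

In practice not all of a server's RAM is available to the database, so treat the comparison as optimistic.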

Edit: Response to question in comments
I'm not sure I quite follow, but I'll have a go at answering.

Firstly, the calculation for working set is a ballpark figure.

Secondly, if you have (e.g.) a 1GB index on user_id, then only the portion of that index that is commonly accessed needs to be in RAM (e.g. if 50% of users are inactive, then roughly 0.5GB of the index is what needs to be in RAM most of the time).

In general, the more RAM you have, the better, especially as the working set is likely to grow over time with increased usage. This is where sharding comes in: split the data over multiple nodes and you can scale out cost-effectively. Your working set is then divided over multiple machines, so more of it can be kept in RAM. Need more RAM? Add another machine to shard onto.
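As a rough sketch of those two points, the snippet below computes the hot portion of an index and the number of shards needed to hold a working set in RAM. The function names are hypothetical, and it assumes the working set is spread evenly across shards and that each node's full RAM is usable, both of which are optimistic.

```python
import math


def hot_index_gb(index_gb, active_fraction):
    """Portion of an index that is frequently accessed and should stay in RAM."""
    return index_gb * active_fraction


def shards_needed(working_set_gb, ram_per_shard_gb):
    """Minimum shard count so each shard's share of the working set fits in RAM.

    Assumes an even split of the working set across shards.
    """
    return math.ceil(working_set_gb / ram_per_shard_gb)


print(hot_index_gb(1.0, 0.5))  # 1GB index on user_id, 50% of users active
print(shards_needed(24, 8))    # 24GB working set on machines with 8GB each
```

A poorly chosen shard key can concentrate the hot data on one node, in which case the even-split assumption does not hold.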

AdaTheDev answered Oct 01 '22 01:10


The working set is basically the stuff you use most frequently. If you use index A on collection B to search for a subset of documents, you can consider that index and those documents your working set. As long as the most commonly used parts of those structures fit in memory, things will be exceedingly fast. As parts of them, such as many of the documents, stop fitting in memory, queries can slow down. Generally, things become much slower if your indexes exceed your memory.

Yes, you can have lots of data where most of it is "archived" and rarely used, without affecting the performance of your application or impacting your working set (which doesn't include that archived data).
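To make the "archived" point concrete, here is a minimal sketch that estimates a working set from per-collection sizes. The dictionaries are modeled on the `size` and `totalIndexSize` fields (in bytes) that MongoDB's `collStats` command reports, but the collection names, the numbers, and the hot fractions are invented for illustration. It conservatively counts every index as fully hot.

```python
GIB = 1024 ** 3


def estimate_working_set_bytes(stats_by_collection, hot_fraction_by_collection):
    """Sum the hot share of each collection's data plus all of its indexes."""
    total = 0
    for name, stats in stats_by_collection.items():
        # Collections without an explicit fraction are treated as fully hot.
        hot = hot_fraction_by_collection.get(name, 1.0)
        total += hot * stats["size"] + stats["totalIndexSize"]
    return total


# Hypothetical figures: 39 GiB of data and indexes in total.
stats = {
    "events": {"size": 32 * GIB, "totalIndexSize": 4 * GIB},
    "users": {"size": 2 * GIB, "totalIndexSize": 1 * GIB},
}
# Assume only a quarter of "events" is recent; the rest is effectively archived.
hot = {"events": 0.25, "users": 1.0}

estimate = estimate_working_set_bytes(stats, hot)
print(estimate / GIB, "GiB working set; fits in 16 GiB:", estimate <= 16 * GIB)
```

Under these assumptions the total data set is well over 16 GiB while the estimated working set still fits, which is the situation this answer describes.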

Scott Hernandez answered Oct 01 '22 03:10