
Understanding the MongoDB cache system

Tags: mongodb

This is a basic question, but a very important one, and I am not sure I really get the point.

The official documentation says:

MongoDB keeps all of the most recently used data in RAM. If you have created indexes for your queries and your working data set fits in RAM, MongoDB serves all queries from memory.

The part I am not sure I understand is:

If you have created indexes for your queries and your working data set fits in RAM

What does "indexes" mean here?

For example, if I update a model and then query it, I assume it is now in RAM because I just updated it, so the query will be served from memory. But this is not very clear in my mind.

How can we be sure whether the data we query will come from memory or not? I understand that MongoDB uses whatever memory is free at the moment to cache data, but could someone explain the overall behavior further?

In which cases would it be better to store data in a variable in our Node server rather than trust the MongoDB cache system?

How do you generally advise using MongoDB for huge traffic?

asked Jul 08 '13 19:07 by Ludo



2 Answers

Note: This was written back in 2013, when MongoDB was still quite young and didn't have the features it has today. While this answer still holds true for mmap, it does not for the other storage technologies MongoDB now supports, such as WiredTiger or Percona.


A good place to start to understand exactly what an index is: http://docs.mongodb.org/manual/core/indexes/
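To make "index" concrete, here is a toy TypeScript sketch contrasting a full scan with a lookup through a structure keyed on the queried field. It is only an illustration: MongoDB's real indexes are B-trees, not hash maps, and the `User` type, the sample data, and the function names are all made up for this example.

```typescript
// Toy model only: MongoDB indexes are B-trees, not hash maps.
interface User {
  id: number;
  email: string;
}

const users: User[] = [
  { id: 1, email: "a@example.com" },
  { id: 2, email: "b@example.com" },
  { id: 3, email: "c@example.com" },
];

// Without an index: documents are examined one by one until a match is found.
function findByScan(docs: User[], email: string): { doc?: User; examined: number } {
  let examined = 0;
  for (const doc of docs) {
    examined++;
    if (doc.email === email) return { doc, examined };
  }
  return { examined };
}

// With an index: a structure keyed on the field points straight at the document.
function buildEmailIndex(docs: User[]): Map<string, number> {
  const index = new Map<string, number>();
  docs.forEach((doc, position) => index.set(doc.email, position));
  return index;
}

function findByIndex(
  docs: User[],
  index: Map<string, number>,
  email: string,
): { doc?: User; examined: number } {
  const position = index.get(email);
  if (position === undefined) return { examined: 0 };
  return { doc: docs[position], examined: 1 };
}

const emailIndex = buildEmailIndex(users);
console.log(findByScan(users, "c@example.com").examined); // 3
console.log(findByIndex(users, emailIndex, "c@example.com").examined); // 1
```

The scan touches every document up to the match, while the indexed lookup touches one. That is why both the index and the documents it points to need to be in RAM for a query to be served entirely from memory.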

After you have brushed up on that, you will understand why they are so good. Skipping forward to some of the more intricate questions:

How can we be sure whether the data we query will come from memory or not?

One way is to look at the yields field (nYields) in the explain() output of any query. This will tell you how many times the reader yielded its lock because data was not in RAM.

Another, more in-depth way is to look at tools like mongostat. These tools will tell you what page faults (when data needs to be paged into RAM from disk) are happening on your mongod.
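As a rough sketch of what watching page faults could look like in code, the function below computes a page-fault rate from two snapshots. It assumes each snapshot is an object shaped like the `extra_info.page_faults` counter that `serverStatus` reports (availability depends on platform and storage engine). The interface, the function name, and the sample numbers are made up for illustration, and nothing here talks to a real server.

```typescript
// Only the one field used here is modelled; real serverStatus output has many more.
interface ServerStatusSnapshot {
  extra_info: { page_faults: number };
}

// Average page faults per second between two snapshots taken elapsedSeconds apart.
function pageFaultsPerSecond(
  before: ServerStatusSnapshot,
  after: ServerStatusSnapshot,
  elapsedSeconds: number,
): number {
  if (elapsedSeconds <= 0) throw new Error("elapsedSeconds must be positive");
  return (after.extra_info.page_faults - before.extra_info.page_faults) / elapsedSeconds;
}

// Made-up sample values standing in for two real snapshots.
const earlier: ServerStatusSnapshot = { extra_info: { page_faults: 1200 } };
const later: ServerStatusSnapshot = { extra_info: { page_faults: 1500 } };
console.log(pageFaultsPerSecond(earlier, later, 10)); // 30
```

A rate that stays near zero under normal load suggests the data being queried is already in memory. A consistently high rate suggests the working set does not fit.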

I understand that MongoDB uses whatever memory is free at the moment to cache data, but could someone explain the overall behavior further?

This is actually incorrect. It is easier to just say that MongoDB does this, but in reality it does not. It is in fact the OS and its own paging algorithms, usually LRU, that do this for MongoDB. MongoDB does cache query plans (which index to use) for a certain period of time, though, so that it doesn't have to constantly keep checking and testing indexes.
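For readers unfamiliar with LRU (least recently used), here is a minimal TypeScript sketch of the idea. It is a toy, not how any real OS implements paging (kernels typically use approximations of LRU), and the `LruCache` class and page names are hypothetical.

```typescript
// Toy LRU cache. Assumes capacity is at least 1.
class LruCache<K, V> {
  private entries: Map<K, V>;
  private capacity: number;

  constructor(capacity: number) {
    this.entries = new Map<K, V>();
    this.capacity = capacity;
  }

  get(key: K): V | undefined {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key) as V;
    // Re-insert so this key becomes the most recently used one.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: K, value: V): void {
    if (this.entries.has(key)) {
      this.entries.delete(key);
    } else if (this.entries.size >= this.capacity) {
      // A Map iterates in insertion order, so the first key is the least recently used.
      const oldest = this.entries.keys().next().value as K;
      this.entries.delete(oldest);
    }
    this.entries.set(key, value);
  }

  has(key: K): boolean {
    return this.entries.has(key);
  }
}

const pages = new LruCache<string, string>(2);
pages.set("page-a", "data A");
pages.set("page-b", "data B");
pages.get("page-a"); // touch page-a, so page-b is now the least recently used
pages.set("page-c", "data C"); // cache is full: page-b is evicted
console.log(pages.has("page-a"), pages.has("page-b"), pages.has("page-c")); // true false true
```

The point for MongoDB is the same as in the toy: once the data in regular use is larger than the available memory, something recently needed keeps getting evicted and has to be read back from disk.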

In which cases would it be better to store data in a variable in our Node server rather than trust the MongoDB cache system?

I'm not sure how you expect that to work. The two do quite different things, and if you intend to read your data from MongoDB into a variable in your application on startup, then I definitely would not recommend it.

Besides, OS memory-management algorithms are extremely mature and fast, so relying on them is fine.

How do you generally advise using MongoDB for huge traffic?

Hmm, this is such a huge question. I would really recommend you Google this subject a little, but as the documentation states, you need to ensure your working set fits into RAM, for one.

Here is a good starting point: What does it mean to fit "working set" into RAM for MongoDB?
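As a back-of-the-envelope illustration of the working-set advice, the sketch below estimates a working set as index size plus the share of data that is read regularly, and compares it with available RAM. This is a rough rule of thumb rather than an official formula. The `hotFraction` value in particular is a guess you would have to make about your own access patterns, and all names and numbers here are made up.

```typescript
interface CollectionSizes {
  dataBytes: number;   // total document data, e.g. `size` from collection stats
  indexBytes: number;  // total index size, e.g. `totalIndexSize` from collection stats
  hotFraction: number; // assumed share of documents read regularly, between 0 and 1
}

// Rough estimate: all indexes plus the hot share of the data, summed over collections.
function estimateWorkingSetBytes(collections: CollectionSizes[]): number {
  return collections.reduce(
    (total, c) => total + c.indexBytes + c.dataBytes * c.hotFraction,
    0,
  );
}

function fitsInRam(workingSetBytes: number, availableRamBytes: number): boolean {
  return workingSetBytes <= availableRamBytes;
}

const GiB = 1024 * 1024 * 1024;

// Made-up sizes for two hypothetical collections.
const estimate = estimateWorkingSetBytes([
  { dataBytes: 40 * GiB, indexBytes: 4 * GiB, hotFraction: 0.25 },
  { dataBytes: 8 * GiB, indexBytes: 1 * GiB, hotFraction: 0.5 },
]);

console.log(estimate / GiB); // 19
console.log(fitsInRam(estimate, 16 * GiB)); // false
console.log(fitsInRam(estimate, 32 * GiB)); // true
```

With these sample numbers the estimate is 19 GiB, so a 16 GiB server would be paging regularly while a 32 GiB server would not.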

answered Oct 06 '22 05:10 by Sammaye


MongoDB attempts to keep entire collections in memory: it memory-maps each collection page. For everything to be in memory, both the data pages, and the indices that reference them, must be kept in memory.

If MongoDB returns a record, you can rest assured that it is now in memory (whether it was before your query or not).

MongoDB doesn't keep a "cache" of records in the same way that, say, a web browser does. When you commit a change, both the memory and the disk are updated.

Mongo is great when matched to the appropriate use cases. It performs very well if you have sufficient server memory to cache everything, and its performance declines rapidly past that point. Many, many high-volume websites use MongoDB: it's a good thing that memory is so cheap now.

answered Oct 06 '22 04:10 by Curt