 

Read from mongodb without lock

Tags:

mongodb

We're using MongoDB 2.2.0 at work. The DB contains about 51GB of data (at the moment) and I'd like to do some analytics on the user data that we've collected so far. Problem is, it's the live machine and we can't afford another slave at the moment. I know MongoDB has a read lock, which may affect writes, especially with complex queries. Is there a way to tell MongoDB to treat my (particular) query with the lowest priority?

Asked Feb 04 '13 by Plasty Grove

People also ask

Do MongoDB transactions lock?

MongoDB uses multi-granularity locking [1] that allows operations to lock at the global, database or collection level, and allows for individual storage engines to implement their own concurrency control below the collection level (e.g., at the document-level in WiredTiger).

Does MongoDB lock document for update?

In MongoDB we recommend using the findAndModify command for this scenario. This command is atomic and thus locks the document for a status change.
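
As a minimal sketch of that pattern, here is the PyMongo equivalent of findAndModify (find_one_and_update); the "jobs" collection and the status values are hypothetical:

```python
# Atomic status change via find_one_and_update, PyMongo's
# driver-level equivalent of the findAndModify command.
from pymongo import MongoClient, ReturnDocument

client = MongoClient("mongodb://localhost:27017")
jobs = client["mydb"]["jobs"]  # hypothetical collection

# The filter and update are applied as a single atomic operation,
# so no other client can claim the same document in between.
claimed = jobs.find_one_and_update(
    {"status": "pending"},
    {"$set": {"status": "processing"}},
    return_document=ReturnDocument.AFTER,
)
if claimed is None:
    print("no pending jobs")
```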

What is optimistic locking MongoDB?

Optimistic locking is a workable solution for write-skew errors. Transactions are not enough in this case because no consistency guarantees are violated. Thanks to Spring Data MongoDB versioning and retries, it is possible to handle the situation gracefully without much boilerplate code.
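
For readers not using Spring Data, here is a hedged sketch of the same idea with a plain version field in PyMongo (analogous to what an @Version annotation does); the "accounts" collection, field names, and retry count are assumptions for illustration:

```python
# Optimistic locking: the write only succeeds if the document's
# version is unchanged since we read it; otherwise we retry.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
accounts = client["mydb"]["accounts"]  # hypothetical collection

def debit(account_id, amount, max_retries=3):
    for _ in range(max_retries):
        doc = accounts.find_one({"_id": account_id})
        if doc is None or doc["balance"] < amount:
            return False
        # Compare-and-swap: match on the version we read, and bump
        # it in the same update so concurrent writers conflict.
        result = accounts.update_one(
            {"_id": account_id, "version": doc["version"]},
            {"$set": {"balance": doc["balance"] - amount},
             "$inc": {"version": 1}},
        )
        if result.modified_count == 1:
            return True  # our version matched; the write won
    return False  # too much contention; caller decides what to do
```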

What is $Not in MongoDB?

$not performs a logical NOT operation on the specified <operator-expression> and selects the documents that do not match the <operator-expression>. This includes documents that do not contain the field.
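
A small illustration with PyMongo; the "users" collection and the "age" field are made up for the example:

```python
# $not wraps an operator expression; note that it also matches
# documents where the field is absent entirely.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
users = client["mydb"]["users"]  # hypothetical collection

# Matches documents where age is NOT greater than 30,
# including documents that have no age field at all.
for doc in users.find({"age": {"$not": {"$gt": 30}}}):
    print(doc)
```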


1 Answer

In MongoDB reads and writes do affect each other. Read locks are shared, but read locks block write locks from being acquired, and of course no other reads or writes can happen while a write lock is held. MongoDB operations yield periodically to keep threads that are waiting for locks from starving. You can read more about the details of that here.

What does that mean for your use case? Because there is no way to tell MongoDB to access the data without a read lock, nor is there a way to prioritize requests (at least not yet), whether the reads significantly affect the performance of your writes depends on how much "headroom" you have available while write activity is going on.

One suggestion I can make: when figuring out how to run analytics, rather than scanning the entire data set (i.e. doing an aggregation query over all historical data), try running smaller aggregation queries on short time slices. This will accomplish two things:

  1. read jobs will be shorter lived and will therefore finish quicker, which gives you a chance to assess what impact the queries have on your "live" performance.
  2. you won't be pulling all the old data into RAM at once; by spacing these analytical queries out over time you will minimize their impact on current write performance (see the sketch after this list).
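
As a sketch of that time-slicing idea, here is one way to break an aggregation into daily windows with PyMongo; the "events" collection, the "ts" timestamp field, the date range, and the grouping key are all assumptions for illustration:

```python
# Slice one big aggregation into short time windows so each
# pipeline holds the read lock in brief bursts and old data
# is paged into RAM gradually rather than all at once.
from datetime import datetime, timedelta
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
events = client["mydb"]["events"]  # hypothetical collection

start = datetime(2013, 1, 1)
end = datetime(2013, 2, 1)
window = timedelta(days=1)

totals = {}
t = start
while t < end:
    # Each pipeline only touches one day of data.
    pipeline = [
        {"$match": {"ts": {"$gte": t, "$lt": t + window}}},
        {"$group": {"_id": "$user_id", "count": {"$sum": 1}}},
    ]
    for row in events.aggregate(pipeline):
        totals[row["_id"]] = totals.get(row["_id"], 0) + row["count"]
    t += window
```

You can also add a sleep between windows if even these short bursts prove to be too much load for the live system.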

Depending on what exactly it is you can't afford about getting another server, you might consider spinning up a short-lived AWS instance; it need not be very powerful, but it would be available to run a long analytical query against a copy of your data set. Just be careful when making the copy of your data: doing a full sync off the production system will place a heavy load on it (a more effective way would be to resume from a recent backup/file snapshot).

Answered Sep 28 '22 by Asya Kamsky