Extraordinarily poor performance from MongoDB in write intensive application [closed]

I'm running a write-intensive MongoDB setup for a web app and I'm getting extraordinarily poor performance from it. That said, I'm fairly certain the problem has more to do with our code, setup and/or usage than with Mongo itself.

I'm about to crack my head open with a sledgehammer out of despair so I was wondering if anybody would mind looking at some of the outputs I've prepared to see if anything seems problematic.

  • db.stats()
  • fstab, mdadm and iostat -xm 2
  • mongostat

The code is not too complicated (it's in PHP, btw). It's pretty much a lot of ->find() and ->update(). I made sure to use indexes for both calls and confirmed they were indeed being used by doing explain() on the queries.
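For context, the hot path boils down to something like the sketch below, written against the legacy PECL "mongo" extension (the 1.8-era driver); collection and field names are placeholders, not the real schema:

    <?php
    $mongo = new Mongo('mongodb://localhost:27017');
    $coll  = $mongo->selectDB('mydb')->selectCollection('events');

    // Index backing both the find() and update() criteria.
    $coll->ensureIndex(array('user_id' => 1));

    // Read path: look a document up by the indexed field.
    $doc = $coll->findOne(array('user_id' => 12345));

    // Write path: update by the same indexed field; 'safe' waits for the
    // write to be acknowledged by the server.
    $coll->update(
        array('user_id' => 12345),
        array('$set' => array('last_seen' => new MongoDate())),
        array('safe' => true)
    );

    // Confirm the index is actually used.
    var_dump($coll->find(array('user_id' => 12345))->explain());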

I've tried 1 server (ec2 m2.2xlarge), 4 servers (2 shards of 2 reps) and 9 servers (3 shards of 3 reps) and wasn't able to get much out of those.

At the best of times, I can't get more than 1500 writes per second (inserts + updates). Most of the time I'm lucky to reach a combined 100 inserts/updates per second, and I always have a high "locked %" and a lot of queries queued up ("qr|qw" in mongostat).

Right now, I have a script running and it's crawling. The worst part is that when I watch mongostat for a while, the resident memory ("res") sits at about 50% of the server's available RAM, and there's more than enough RAM to fit the indexes of all collections. There's no reason this shouldn't be spitting out data like crazy.
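As a sanity check on the index sizes, something like this (a sketch, again with the legacy driver; 'mydb' is a placeholder) can total up totalIndexSize per collection from collstats and compare it against RAM:

    <?php
    $mongo = new Mongo('mongodb://localhost:27017');
    $db    = $mongo->selectDB('mydb');   // placeholder database name

    $totalIndexBytes = 0;
    foreach ($db->listCollections() as $coll) {
        // collstats reports sizes in bytes.
        $stats = $db->command(array('collstats' => $coll->getName()));
        printf("%-20s indexes: %8.1f MB\n",
               $coll->getName(), $stats['totalIndexSize'] / 1048576);
        $totalIndexBytes += $stats['totalIndexSize'];
    }
    printf("Total index size: %.1f GB\n", $totalIndexBytes / 1073741824);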

I must have recoded the app 2-3 times already, trying to find better access patterns for the data. I've read everything I could find on indexes, updates, shard keys and whatnot. All the servers I put Mongo on use an 8-disk EBS RAID 10 setup with some performance tweaks added (blockdev, noatime, etc.).

I know the problem is on my end and I'm not blaming MongoDB. I know companies much bigger than mine are using it for write-intensive applications and absolutely love it (Foursquare, for example). At the same time, I just can't understand what I'm doing wrong and why I'm getting such poor performance, no matter what I do.

Additional Info:

  • All servers (client and server) are running Ubuntu 10.04 LTS with MongoDB 1.8.2
  • All servers are on EC2 East and in the same zone
  • Currently, I'm back to a single m2.2xlarge server (4 cores, 34.2 GB RAM) until I can figure out where the problem is.

asked Aug 01 '11 by Pierre


1 Answer

So the first problem is that your disks seem to be primarily occupied with reading rather than writing (based on iostat). Utilization is well over 50%, but it's basically all reads.

If I look at your DB stats, you have 35 GB of indexes and 41 GB of data in 133 GB of allocated files. 133 GB is pretty close to the "mapped" number in mongostat, so the sum of data you may be accessing is about 120 GB, or roughly 4x RAM.

Typically 4x is a perfectly fine ratio. However, in your case your indexes alone (35 GB) exceed RAM (34.2 GB), and that tends to be a "falloff" point for MongoDB performance.

If you're accessing randomly across the index, then most or all of the index has to stay resident in memory, which eats up the bulk of your RAM. That in turn means most of your data is not in memory and has to be "paged in" from disk on each access. You can see this in the heavy disk reads you're getting.
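One way to confirm that is to watch page faults and the lock queues from serverStatus while the workload runs (a sketch with the legacy PHP driver; extra_info.page_faults is only reported on Linux):

    <?php
    $mongo = new Mongo('mongodb://localhost:27017');
    $admin = $mongo->selectDB('admin');

    $status = $admin->command(array('serverStatus' => 1));
    // Steadily climbing page faults => data is being read back from disk.
    echo 'page faults:    ' . $status['extra_info']['page_faults'] . "\n";
    echo 'queued readers: ' . $status['globalLock']['currentQueue']['readers'] . "\n";
    echo 'queued writers: ' . $status['globalLock']['currentQueue']['writers'] . "\n";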

I know you say that you tested with sharding; do you have numbers for those tests? Was the data correctly spread across all three shards?

Sharding should alleviate your problem, since you're effectively "adding more RAM" to the DB, but you need to confirm that the data is actually sharded evenly and behaving correctly, or it won't fix your issue.
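A quick way to check the distribution is to count chunks per shard in the config database through mongos (a sketch; 'mydb.events' is a placeholder namespace). A heavily skewed count usually means the shard key is funnelling writes to one shard:

    <?php
    $mongo  = new Mongo('mongodb://your-mongos-host:27017');  // connect to mongos
    $chunks = $mongo->selectDB('config')->selectCollection('chunks');

    $perShard = array();
    foreach ($chunks->find(array('ns' => 'mydb.events')) as $chunk) {
        $shard = $chunk['shard'];
        $perShard[$shard] = isset($perShard[$shard]) ? $perShard[$shard] + 1 : 1;
    }
    // Something like array('shard0000' => 41, 'shard0001' => 2) would be badly skewed.
    print_r($perShard);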

answered by Gates VP