We are trying to build a notification application for our users with MongoDB. We created one MongoDB instance on a Xen VM with 10 GB RAM, a 150 GB SAS HDD (15K RPM), and a 4-core 2.9 GHz Intel Xeon.
DB schema :-
{
"_id" : ObjectId("5178c458e4b0e2f3cee77d47"),
"userId" : NumberLong(1574631),
"type" : 2,
"text" : "a user connected to B",
"status" : 0,
"createdDate" : ISODate("2013-04-25T05:51:19.995Z"),
"modifiedDate" : ISODate("2013-04-25T05:51:19.995Z"),
"metadata" : "{\"INVITEE_NAME\":\"2344\",\"INVITEE\":1232143,\"INVITE_SENDER\":1574476,\"INVITE_SENDER_NAME\":\"123213\"}",
"opType" : 1,
"actorId" : NumberLong(1574630),
"actorName" : "2344"
}
DB stats :-
db.stats()
{
"db" : "UserNotificationDev2",
"collections" : 3,
"objects" : 78597973,
"avgObjSize" : 489.00035699393925,
"dataSize" : 38434436856,
"storageSize" : 41501835008,
"numExtents" : 42,
"indexes" : 2,
"indexSize" : 4272393328,
"fileSize" : 49301946368,
"nsSizeMB" : 16,
"dataFileVersion" : {
"major" : 4,
"minor" : 5
},
"ok" : 1
}
Indexes: userId and _id
We are trying to select the latest 21 notifications for one user:
db.userNotification.find({ "userId" : 53 }).limit(21).sort({ "_id" : -1 });
But this query is taking too much time:
Fri Apr 26 05:39:55.563 [conn156] query UserNotificationDev2.userNotification query: { query: { userId: 53 }, orderby: { _id: -1 } } cursorid:225321382318166794 ntoreturn:21 ntoskip:0 nscanned:266025 keyUpdates:0 numYields: 2 locks(micros) r:4224498 nreturned:21 reslen:10295 2581ms
Even a count takes a very long time:
Fri Apr 26 05:47:46.005 [conn159] command UserNotificationDev2.$cmd command: { count: "userNotification", query: { userId: 53 } } ntoreturn:1 keyUpdates:0 numYields: 11 locks(micros) r:9753890 reslen:48 5022ms
Are we doing something wrong in the query?
Please help!
Also, please tell us if our schema is not suitable for storing user notifications. We tried embedding the notifications in the user document (one document per user, with that user's notifications nested inside it), but the document size limit restricts us to about 50k notifications per user, so we switched to this schema.
Your query filters on userId and sorts on _id, but no single index covers both, so the server scans every entry for the user (nscanned:266025 in your log) just to return 21 documents. My suggestion is to create a compound index on { "userId" : 1, "_id" : -1 }. This builds an index tree keyed on userId first and then _id, which matches what your query does almost exactly. It is the simplest and most flexible way to speed up your query.
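As a minimal sketch of that suggestion, the index and the query can be wrapped in two small helpers. The helper names and the NOTIFICATION_INDEX constant are hypothetical, and db is assumed to be the shell's database handle. Recent shells expose createIndex; shells from the era of the log above used ensureIndex for the same purpose.

```javascript
// Compound index from the suggestion above: userId first, then _id descending.
var NOTIFICATION_INDEX = { userId: 1, _id: -1 };

// Hypothetical helper: creates the compound index on the given database handle.
// Assumption: the handle exposes createIndex (older shells: ensureIndex).
function createNotificationIndex(db) {
  return db.userNotification.createIndex(NOTIFICATION_INDEX);
}

// Hypothetical helper: returns a cursor over the newest `limit` notifications
// for one user, using the same filter and sort as the original query.
function latestNotifications(db, userId, limit) {
  return db.userNotification
    .find({ userId: userId })
    .sort({ _id: -1 })
    .limit(limit);
}
```

In the shell, createNotificationIndex(db) would be run once, and latestNotifications(db, 53, 21).explain() can then be used to check which index is chosen and how many entries are scanned. With the compound index in place, the scanned count should fall to roughly the limit rather than the user's full notification count.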
Another, more memory-efficient approach is to store the userId and timestamp together as a string in _id, in the form _id : "USER_ID:DATETIME". For example:
{_id : "12345:20120501123000"}
{_id : "15897:20120501124000"}
{_id : "15897:20120501125000"}
Notice that _id is a string, not an ObjectId. Your query above then becomes a regex:
db.userNotification.find({ "_id" : /^53:/ }).limit(21).sort({ "_id" : -1 });
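As expected, this returns the latest 21 notifications for userId 53 in descending order. The approach is memory efficient mainly because the mandatory _id index does the work of the compound index, so no second index has to be kept in RAM.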
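To make the string _id scheme concrete, here is a small sketch of how such keys could be built and why a prefix match plus a descending string sort yields the newest entries first. makeNotificationId and latestIdsForUser are hypothetical helpers, the timestamp is assumed to be UTC at one-second resolution, and the second function is only an in-memory stand-in for the regex query, not how the server executes it.

```javascript
// Zero-pads a number to a fixed width so string order matches numeric order.
function pad(n, width) {
  return String(n).padStart(width, "0");
}

// Hypothetical helper: builds an _id of the form "USER_ID:YYYYMMDDHHMMSS" (UTC).
function makeNotificationId(userId, date) {
  var stamp =
    pad(date.getUTCFullYear(), 4) +
    pad(date.getUTCMonth() + 1, 2) +
    pad(date.getUTCDate(), 2) +
    pad(date.getUTCHours(), 2) +
    pad(date.getUTCMinutes(), 2) +
    pad(date.getUTCSeconds(), 2);
  return userId + ":" + stamp;
}

// In-memory stand-in for find({ _id: /^<userId>:/ }).sort({ _id: -1 }).limit(n).
// The trailing ":" in the prefix keeps user 15897 from matching user 158970.
function latestIdsForUser(ids, userId, n) {
  var prefix = userId + ":";
  return ids
    .filter(function (id) { return id.indexOf(prefix) === 0; })
    .sort()      // lexicographic order equals chronological order per user
    .reverse()   // newest first
    .slice(0, n);
}

var ids = [
  makeNotificationId(15897, new Date(Date.UTC(2012, 4, 1, 12, 40, 0))),
  makeNotificationId(12345, new Date(Date.UTC(2012, 4, 1, 12, 30, 0))),
  makeNotificationId(15897, new Date(Date.UTC(2012, 4, 1, 12, 50, 0))),
  makeNotificationId(158970, new Date(Date.UTC(2012, 4, 1, 13, 0, 0)))
];

console.log(latestIdsForUser(ids, 15897, 21));
```

One caveat with this sketch: at one-second resolution, two notifications for the same user in the same second would produce the same _id, so a real key would need a finer timestamp or an extra suffix.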
Re: count. A count takes time because it has to walk every entry that matches the query; unlike the limited find, it cannot stop after the first 21 results.
Re: your schema. I'm guessing that for your data set this is the best way to use your memory. When objects get large and your queries scan across multiple objects, each object has to be loaded into memory in its entirety (I've had the OOM killer kill my mongod instance when I sorted 2000 2 MB objects on a 2 GB RAM machine). With large objects your RAM usage will fluctuate greatly (not to mention that document size is capped). With your current schema, mongo will have a much easier time loading only the data you're querying, resulting in less swapping and more consistent memory usage.