The FETCH stage is the limiting factor in my queries. I've been researching, and it seems that MongoDB is reading much more than it needs and is not utilizing the bandwidth fully.
My MongoDB mongod instance seems to be reading too much on a single query. I am testing on an AWS EC2 m4.xlarge with one 100 GB EBS io1 SSD volume (5000 PIOPS) attached, and 16 GB of RAM.
db stats
db.stats()
{
"db" : "database",
"collections" : 4,
"objects" : 406496932,
"avgObjSize" : 326.3196544642064,
"dataSize" : 132647938391,
"storageSize" : 55475830784,
"numExtents" : 0,
"indexes" : 5,
"indexSize" : 8940408832,
"ok" : 1
}
collection summary:
db.collection.stats() ->
{ "ns" : "database.[collection###]",
"count" : 367614513,
"size" : 121155225858,
"avgObjSize" : 329,
"storageSize" : 52052197376,
"capped" : false,
"wiredTiger" : {"Left empty"},
"nindexes" : 2,
"totalIndexSize" : 8131604480,
"indexSizes" : {
"_id_" : 4373012480,
"id_1_ts_-1" : 3758592000
},
"ok" : 1
The query:
db.[#######].find(
    { id : "######",
      ts : { "$gte" : ISODate("2016-10-01T00:00:00.000Z"),
             "$lt"  : ISODate("2016-10-07T02:00:00.000Z") } },
    { _id : 0, "u1" : 1, "u2" : 1, "u3" : 1,
      "eq1" : 1, "eq2" : 1, "eq3" : 1, "ts" : 1 });
And the explain result:
{
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" : "database.[collection]",
"d" : false,
"parsedQuery" : {
"$and" : [
{
"id" : {
"$eq" : "#####ID#####"
}
},
{
"ts" : {
"$lt" : ISODate("2016-09-30T22:00:00Z")
}
},
{
"ts" : {
"$gte" : ISODate("2016-09-22T22:00:00Z")
}
}
]
},
"winningPlan" : {
"stage" : "PROJECTION",
"transformBy" : {
"_id" : 0,
"u1" : 1,
"u2" : 1,
"u3" : 1,
"eq1" : 1,
"eq2" : 1,
"eq3" : 1,
"ts" : 1
},
"inputStage" : {
"stage" : "FETCH",
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {
"id" : 1,
"ts" : -1
},
"indexName" : "id_1_ts_-1",
"isMultiKey" : false,
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 1,
"direction" : "forward",
"indexBounds" : {
"id" : [
"[\"#####ID#####\", \"#####ID#####\"]"
],
"ts" : [
"(new Date(1475272800000), new Date(1474581600000)]"
]
}
}
}
},
"rejectedPlans" : [ ]
},
"executionStats" : {
"executionSuccess" : true,
"nReturned" : 676745,
"executionTimeMillis" : 170508,
"totalKeysExamined" : 676745,
"totalDocsExamined" : 676745,
"executionStages" : {
"stage" : "PROJECTION",
"nReturned" : 676745,
"executionTimeMillisEstimate" : 167820,
"works" : 676746,
"advanced" : 676745,
"needTime" : 0,
"needYield" : 0,
"saveState" : 8970,
"restoreState" : 8970,
"isEOF" : 1,
"invalidates" : 0,
"transformBy" : {
"_id" : 0,
"u1" : 1,
"u2" : 1,
"u3" : 1,
"eq1" : 1,
"eq2" : 1,
"eq3" : 1,
"ts" : 1
},
"inputStage" : {
"stage" : "FETCH",
"nReturned" : 676745,
"executionTimeMillisEstimate" : 166470,
"works" : 676746,
"advanced" : 676745,
"needTime" : 0,
"needYield" : 0,
"saveState" : 8970,
"restoreState" : 8970,
"isEOF" : 1,
"invalidates" : 0,
"docsExamined" : 676745,
"alreadyHasObj" : 0,
"inputStage" : {
"stage" : "IXSCAN",
"nReturned" : 676745,
"executionTimeMillisEstimate" : 980,
"works" : 676746,
"advanced" : 676745,
"needTime" : 0,
"needYield" : 0,
"saveState" : 8970,
"restoreState" : 8970,
"isEOF" : 1,
"invalidates" : 0,
"keyPattern" : {
"id" : 1,
"ts" : -1
},
"indexName" : "id_1_ts_- 1",
"isMultiKey" : false,
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 1,
"direction" : "forward",
"indexBounds" : {
"id" : [
"[\"#####ID#####\", \"#####ID#####\"]"
],
"ts" : [
"(new Date(1475272800000), new Date(1474581600000)]"
]
},
"keysExamined" : 676745,
"dupsTested" : 0,
"dupsDropped" : 0,
"seenInvalidated" : 0
}
}
},
"allPlansExecution" : [ ]
},
"serverInfo" : {
"host" : "ip #########",
"port" : 27017,
"version" : "3.2.10",
"gitVersion" : "79d9b3ab5ce20f51c272b4411202710a082d0317"
},
"ok" : 1
}
As we can see above, MongoDB uses the index. The IXSCAN takes 980 ms and the FETCH ~160,000 ms.
If I'm not mistaken, the whole read should be 676,745 (nReturned) × 329 (avgObjSize) bytes ≈ 212 MB of data.
I've noticed the following in the iostat output (http://linuxcommand.org/man_pages/iostat1.html); /data/db is on xvdf:
avg-cpu: %user %nice %system %iowait %steal %idle
0.27 0.00 0.00 21.35 0.13 78.25
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
xvdf 0.00 0.00 1691.00 0.00 19.83 0.00 24.02 0.95 0.56 0.56 0.00 0.56 94.40
The rMB/s column says ~20 MB/s, and this continues during the whole operation (FETCH stage). This means that MongoDB is reading 160 s × 20 MB/s = 3,200 MB, which is a lot more than the ~212 MB computed above.
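(Sanity check on those numbers: avgrq-sz is measured in 512-byte sectors, so each read request averages about 24.02 × 512 B ≈ 12 KB, and 1691 r/s × 12 KB ≈ 20 MB/s, which matches the rMB/s column. So every ~329-byte document returned seems to cost a ~12 KB random read.)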
Memory:
free -m
total used free shared buffers cached
Mem: 16048 12629 3418 0 32 4071
-/+ buffers/cache: 8525 7522
Swap: 0 0 0
Also, MongoDB is not using the 5000 provisioned EBS IOPS, nor the promised bandwidth: it is only doing ~1700 reads/second, resulting in ~20 MB/s.
I've changed the read-ahead to 16 KB. I've tried placing the journal and log on another HDD.
I cannot figure this out! Help me. Please!
Performance: because the index contains all fields required by the query, MongoDB can both match the query conditions and return the results using only the index (a covered query). Querying only the index can be much faster than querying documents outside of the index.
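As a sketch of how that could apply to the query above (assuming the field names from the question, and that none of the projected fields holds an array, since covered queries cannot use multikey index fields), extending the index to include the projected fields would let MongoDB answer the query from the index alone:
db.[#######].createIndex({ id : 1, ts : -1,
                           u1 : 1, u2 : 1, u3 : 1,
                           eq1 : 1, eq2 : 1, eq3 : 1 });
Because the projection already excludes _id, explain() on the same find() should then show an IXSCAN with no FETCH stage above it. The trade-off is a larger index competing for the same WiredTiger cache.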
MongoDB uses multikey indexes to index the content stored in arrays. If you index a field that holds an array value, MongoDB creates separate index entries for every element of the array. These multikey indexes allow queries to select documents that contain arrays by matching on element or elements of the arrays.
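A minimal illustration (the events collection and tags field here are hypothetical):
db.events.insert({ _id : 1, tags : [ "a", "b", "c" ] });
db.events.createIndex({ tags : 1 });   // multikey: one index entry per array element
db.events.find({ tags : "b" });        // can match on any single element via the index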
It is slow because it is not using an index. For each document in the logs collection, it is doing a full collection scan on the graphs collection. From the $expr documentation page: $expr only uses indexes on the from collection for equality matches in a $match stage.
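In other words, when the join condition is a plain equality, expressing it with localField/foreignField instead of an $expr pipeline lets the $lookup use an index on the from collection. A sketch, assuming hypothetical logs and graphs collections joined on a logId field:
db.graphs.createIndex({ logId : 1 });
db.logs.aggregate([
    { $lookup : {
        from         : "graphs",
        localField   : "logId",
        foreignField : "logId",
        as           : "graphs"
    } }
]);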
I encountered the same problem when I was fetching around 35,000 documents. To solve it, I used the aggregate function (the sakulstra:aggregate package), and in my case it boosted the request incredibly. The result format is obviously not the same, but it's still easy to use to compute everything I need.
Before (7000ms) :
const historicalAssetAttributes = HistoricalAssetAttributes.find({
    date: {'$gte': startDate, '$lte': endDate},
    assetId: {$in: assetIds}
}, {
    fields: {
        "date": 1,
        "assetId": 1,
        "close": 1
    }
}).fetch();
After (300ms):
const historicalAssetAttributes = HistoricalAssetAttributes.aggregate([
    {
        '$match': {
            date: {'$gte': startDate, '$lte': endDate},
            assetId: {$in: assetIds}
        }
    }, {
        '$group': {
            _id: {assetId: "$assetId"},
            close: {
                '$push': {
                    date: "$date",
                    value: "$close"
                }
            }
        }
    }
]);
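Note that the $group stage returns one document per assetId, with all the (date, close) pairs pushed into a single close array, rather than one document per reading; that reshaping is why the result format differs from the plain find().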
The main facts: assuming the WiredTiger cache is set to the default, the amount of RAM reserved for it should be approximately 0.6 × 16 GB − 1 GB ≈ 8.6 GB. From https://docs.mongodb.com/v3.2/faq/storage/#to-what-size-should-i-set-the-wiredtiger-internal-cache:
Starting in MongoDB 3.2, the WiredTiger internal cache, by default, will use the larger of either:
- 60% of RAM minus 1 GB, or
- 1 GB.
From the information above, it appears that there is memory pressure on your machine. MongoDB tries to keep indexes in memory for fast access, and the whole index is ~7 GB. This will effectively fill ~80% of your WiredTiger cache with just the index, leaving little space for anything else. As a result, MongoDB is forced to pull the documents in the result set from disk. At this point, performance suffers.
You can see the effect of this in the iostat output, where the device xvdf (where the data resides) is hitting more than 94% utilization (the %util column). This means that your operation is I/O bound, as you don't have enough RAM to satisfy your ideal working set.
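If you want to confirm this on the running instance, the WiredTiger section of serverStatus exposes the cache counters (a sketch; counter names as reported by MongoDB 3.2):
var cache = db.serverStatus().wiredTiger.cache;
print("configured max (bytes) : " + cache["maximum bytes configured"]);
print("currently in cache     : " + cache["bytes currently in the cache"]);
print("pages read into cache  : " + cache["pages read into cache"]);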
To mitigate this issue, you could try to:
- provision more RAM (e.g. a larger instance type), so that your working set (the index plus the frequently accessed documents) fits in the WiredTiger cache, or
- reduce how much the query must fetch from disk, for example by making it a covered query as described above.
You could also review the Production Notes and the Operation Checklist for recommended settings.