Is it possible to make $unwind fast in MongoDB by having an index?

Tags:

mongodb

I have a collection whose documents each contain a large array, and I would like to find certain elements of that array without scanning the whole array. The documents in my collection look like this:

{
    "transactions": [
        {"id": randint(0, 100000), "hello": randint(0, 1000)} for _ in range(100000)
    ]
}
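A minimal sketch of how one such document could be built in Python, assuming the standard library's random.randint. The helper name make_document and its parameter n are hypothetical and used only for illustration:

```python
from random import randint


def make_document(n=100000):
    """Build one document whose "transactions" array holds n random entries.

    make_document and n are hypothetical names used for illustration only.
    """
    return {
        "transactions": [
            {"id": randint(0, 100000), "hello": randint(0, 1000)}
            for _ in range(n)
        ]
    }


# Small n here just to keep the demonstration quick.
doc = make_document(5)
print(len(doc["transactions"]))
```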

I would like to get all the transactions in the collection that have the id 17, so I created this index:

db.toto.createIndex({'transactions.id': 1})

But to return only the transactions I want, I have to use $unwind, and the $unwind is still slow:

db.toto.aggregate(
    [
        {"$match": {"transactions.id": 17}},
        {"$unwind": "$transactions"},
        {"$match": {"transactions.id": 17}},
    ]
)

This gives me:

[{'_id': ObjectId('5bf854f685699a394ce5ba82'),
  'transactions': {'hello': 920, 'id': 17}},
 {'_id': ObjectId('5bf854f685699a394ce5ba82'),
  'transactions': {'hello': 446, 'id': 17}},
 {'_id': ObjectId('5bf854f685699a394ce5ba84'),
  'transactions': {'hello': 822, 'id': 17}},
 {'_id': ObjectId('5bf854f685699a394ce5ba84'),
  'transactions': {'hello': 830, 'id': 17}},
 [...]
 {'_id': ObjectId('5bf854f885699a394ce5ba89'),
  'transactions': {'hello': 301, 'id': 17}},
 {'_id': ObjectId('5bf854f985699a394ce5ba8b'),
  'transactions': {'hello': 666, 'id': 17}}]

Adding the first $match makes the query slightly faster because it does use the index to find only the objects that contain the transaction I am looking for. But it will not use the index to make the $unwind faster. MongoDB still goes through the whole array that contains 100000 transactions to find the transactions I want.
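To make the cost concrete, here is a simplified pure-Python emulation of what $unwind followed by $match has to do. It is only a sketch: the unwind helper is hypothetical, it covers only top-level array fields, and it says nothing about how the server implements the stage internally.

```python
def unwind(docs, field):
    """Emulate $unwind for a top-level array field.

    Emits one output document per array element; documents where the
    field is missing or the array is empty produce no output.
    """
    for doc in docs:
        for element in doc.get(field, []):
            out = dict(doc)
            out[field] = element
            yield out


# Toy data: one parent document with 1000 transactions, two of which have id 17.
doc = {"_id": 1, "transactions": [{"id": i % 500, "hello": i} for i in range(1000)]}

unwound = list(unwind([doc], "transactions"))
matches = [d for d in unwound if d["transactions"]["id"] == 17]
print(len(unwound), len(matches))  # prints "1000 2"
```

Every element becomes an intermediate document before the second filter can discard it, so the work grows with the array size rather than with the number of matches.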

The query takes 5 seconds to find about 100 objects, while a query like db.toto.count({"transactions.id": 17}), which does use the index, takes less than 0.1 second.
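One variant that may be worth comparing trims each array with the $filter aggregation operator (available since MongoDB 3.2) before unwinding, so that $unwind only sees the matching elements. This is a sketch and has not been timed against the data above. $filter still walks each array in memory and does not use the index. The function name filtered_pipeline is hypothetical.

```python
def filtered_pipeline(transaction_id):
    """Return an aggregation pipeline that filters the array before unwinding.

    filtered_pipeline is a hypothetical helper used for illustration only.
    """
    return [
        # Uses the multikey index to select only documents with a matching element.
        {"$match": {"transactions.id": transaction_id}},
        # Keeps only the matching elements of each array (scanned in memory).
        {
            "$project": {
                "transactions": {
                    "$filter": {
                        "input": "$transactions",
                        "as": "t",
                        "cond": {"$eq": ["$$t.id", transaction_id]},
                    }
                }
            }
        },
        # Unwinds the already-trimmed array, one document per matching transaction.
        {"$unwind": "$transactions"},
    ]


pipeline = filtered_pipeline(17)
print(len(pipeline))
```

With pymongo, the pipeline would be passed as db.toto.aggregate(filtered_pipeline(17)), assuming db is a connected database handle.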

Here is the Python file I used to study the issue. You can reproduce the issue by doing:

pip3 install fire pymongo
chmod +x toto_mongo.py
./toto_mongo.py insert
./toto_mongo.py create_index
time ./toto_mongo.py slow_query

asked Nov 17 '22 20:11 by nevare


1 Answer

You can use $lookup and then flatten its result with $unwind.

You can use something like this in your backend routes:

{
    $lookup: {
        from: "customers",
        localField: "customer",
        foreignField: "_id",
        as: "customerData"
    }
},
{ $unwind: "$customerData" },

Your schema would look something like this:

var mongoose = require('mongoose');
var Schema = mongoose.Schema;

var moviesSchema = new Schema({
    movieId: String,
    title: String,
    // Indexed reference to a document in the "customers" collection;
    // this is the localField used by the $lookup stage.
    customer: { type: Schema.Types.ObjectId, ref: 'customers', index: true },
    genre: String,
    releaseDate: Date,
    ratings: Number,
    review: String,
    reviewTime: Date,
});
var movieState = mongoose.model('movies', moviesSchema);

module.exports = movieState;

answered Nov 19 '22 09:11 by itiDi