MongoDB embedded vs array sub document performance

Tags: arrays, mongodb, nosql

Given the competing schemas below, with up to 100,000 friends per user, I’m interested in finding the most efficient one for my needs.

Doc1 (Index on user_id)

{
    "_id" : "…",
    "user_id" : "1",
    "friends" : {
        "2" : {
            "id" : "2",
            "mutuals" : 3
        },
        "3" : {
            "id" : "3",
            "mutuals" : 1
        },
        "4" : {
            "id" : "4",
            "mutuals" : 5
        }
    }
}

Doc2 (Compound multikey index on user_id & friends.id)

{
    "_id" : "…",
    "user_id" : "1",
    "friends" : [
        {
            "id" : "2",
            "mutuals" : 3
        },
        {
            "id" : "3",
            "mutuals" : 1
        },
        {
            "id" : "4",
            "mutuals" : 5
        }
    ]
}
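The two index definitions named above could be created roughly as follows. This is a minimal sketch, assuming the collections are called `doc1` and `doc2` (hypothetical names) and that `db` is the shell's database handle; `createFriendIndexes` is a helper made up for illustration.

```javascript
// Hypothetical helper; `db` is expected to expose doc1 and doc2 collections.
function createFriendIndexes(db) {
  // Schema 1: single-field index on user_id.
  db.doc1.createIndex({ user_id: 1 });
  // Schema 2: compound index; it becomes multikey because friends is an array.
  db.doc2.createIndex({ user_id: 1, "friends.id": 1 });
}

// Usage in the shell:
//   createFriendIndexes(db);
```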

I can’t seem to find any information on the efficiency of sub-field retrieval. I know that MongoDB stores data internally as BSON, so I’m wondering whether that means a projection lookup is a binary search, i.e. O(log n)?

Specifically, given a user_id, how would the two queries below compare for finding whether a friend with friend_id exists (assuming the above indexes)? Note that it doesn’t really matter what’s returned, only that a non-null result comes back if the friend exists.

Doc1col.find({user_id : "…"}, {"friends.friend_id" : 1})
Doc2col.find({user_id : "…", "friends.id" : "friend_id"}, {"_id":1})
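With the friend id supplied at run time, the two lookups might be built as below. This is a minimal sketch, assuming ids are strings as in the sample documents; `doc1FriendLookup` and `doc2FriendLookup` are hypothetical helpers that only assemble the arguments for `find`. In schema 1 the friend id is part of the field path, so the projection key has to be computed.

```javascript
// Hypothetical helpers: they only build the arguments for find()/findOne().

// Schema 1: friends is an object keyed by friend id, so the id is part of the path.
function doc1FriendLookup(userId, friendId) {
  return {
    filter: { user_id: userId },
    projection: { ["friends." + friendId]: 1 },
  };
}

// Schema 2: friends is an array of { id, mutuals }, matched through "friends.id".
function doc2FriendLookup(userId, friendId) {
  return {
    filter: { user_id: userId, "friends.id": friendId },
    projection: { _id: 1 },
  };
}

// Usage in the shell (collection names as in the question):
//   const q = doc2FriendLookup("1", "2");
//   Doc2col.find(q.filter, q.projection);
```

Since the schema 1 filter only tests `user_id`, the user's document matches whether or not the friend is present, so the caller would still have to inspect the projected `friends` sub-document.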

Also of interest is how the $set modifier works. For schema 1, given the query Doc1col.update({user_id : "…"}, {"$set" : {"friends.friend_id.mutuals" : 5}}), how does the lookup on friends.friend_id work? Is this an O(log n) operation (where n is the number of friends)?

For schema 2, how would the query Doc2col.update({user_id : "…", "friends.id" : "friend_id"}, {"$set": {"friends.$.mutuals" : 5}}) compare to that of the above?
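The two updates can be put side by side with the friend id as a parameter. This is a sketch under the same assumptions as the sample documents (string ids, numeric `mutuals`); `doc1SetMutuals` and `doc2SetMutuals` are hypothetical helpers that only build the arguments for `update`.

```javascript
// Hypothetical helpers: they only build the filter and update documents.

// Schema 1: the friend id is embedded in the dotted path passed to $set.
function doc1SetMutuals(userId, friendId, mutuals) {
  return {
    filter: { user_id: userId },
    update: { $set: { ["friends." + friendId + ".mutuals"]: mutuals } },
  };
}

// Schema 2: the filter selects the array element, and the positional `$`
// operator refers to the first element matched by "friends.id".
function doc2SetMutuals(userId, friendId, mutuals) {
  return {
    filter: { user_id: userId, "friends.id": friendId },
    update: { $set: { "friends.$.mutuals": mutuals } },
  };
}

// Usage in the shell (collection names as in the question):
//   const u = doc1SetMutuals("1", "2", 5);
//   Doc1col.update(u.filter, u.update);
```

One behavioural difference worth keeping in mind: as far as I understand `$set`, the schema 1 form creates `friends.<id>` if it is missing, whereas the schema 2 form matches no document when the friend is absent.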

asked Nov 30 '12 02:11

Nelson Shaw

1 Answer

doc1 is preferable if one's primary requirement is to present data to the UI in a nice, manageable package. It's simple to filter only the desired data using a projection: {}, {"friends.2" : 1}

doc2 is your strongest match, since your use case does not care about the result ("Note that it doesn’t really matter what’s returned") and indexing will speed up the fetch.

On top of that, doc2 permits the much cleaner syntax

db.doc2.findOne({user_id: "1", "friends.id": "2"})

versus

db.doc1.findOne({ $and : [{ user_id: "1" }, { "friends.2" : {$exists: true} }] })

On a final note, one could create a sparse index on doc1 (and use $exists), but with a possible 100,000 friends, each friend would need its own sparse index, which makes that absurd. Sparse indexes suit a reasonable number of entries, say demographics such as gender [male, female], age groups [0-10, 11-16, 25-30, ...], or more important things [gin, whisky, vodka, ...].
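To make the scale concrete, here is a rough sketch of the index specs the sparse-index approach would need under doc1. The helper `doc1SparseIndexSpecs` is hypothetical; it only lists one `{ keys, options }` pair per friend id, because each friend id is a distinct field path.

```javascript
// Hypothetical helper: one sparse index spec per friend key under schema 1.
function doc1SparseIndexSpecs(friendIds) {
  return friendIds.map((id) => ({
    keys: { ["friends." + id]: 1 },
    options: { sparse: true },
  }));
}

// Usage in the shell, for a handful of ids:
//   doc1SparseIndexSpecs(["2", "3", "4"]).forEach((s) =>
//     db.doc1.createIndex(s.keys, s.options));
```

The number of specs grows one-for-one with the number of distinct friend ids, which is why this does not scale to 100,000 friends, whereas doc2 needs a single index on "friends.id" regardless.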

answered Oct 23 '22 09:10

Gabe Rainbow

