
MongoDB shard key hash algorithm

I'm unable to find documentation about the hash algorithm that MongoDB uses for collection or shard keys.

Can anyone help with this or post a reference?

user2152443 asked Oct 21 '22 15:10


1 Answer

If you are more interested in how indexing works in general, check this presentation about the internals: http://www.mongodb.com/presentations/storage-engine-internals or this one: http://www.mongodb.com/presentations/mongodbs-storage-engine-bit-bit

An individual shard knows little about the overall structure of the cluster, so it uses the same indexing algorithm internally. On top of that there is a metadata layer that knows which part of the data belongs to which shard.
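
The metadata layer can be pictured as a sorted table of key ranges (chunks), each owned by one shard. The following is only a minimal sketch of that idea: the boundaries, the shard names and the `shard_for` helper are made up for illustration and are not MongoDB's actual config schema.

```python
import bisect

# Hypothetical chunk table. Chunk i covers the half-open range
# [CHUNK_LOWER_BOUNDS[i], CHUNK_LOWER_BOUNDS[i + 1]); the last chunk is
# unbounded above. Boundaries and shard names are illustrative only.
CHUNK_LOWER_BOUNDS = [-2**63, -2**62, 0, 2**62]
CHUNK_OWNERS = ["shardA", "shardB", "shardC", "shardD"]


def shard_for(key_value: int) -> str:
    """Return the shard that owns the chunk containing key_value."""
    # bisect_right finds the first boundary greater than key_value,
    # so the chunk we want is the one just before it.
    idx = bisect.bisect_right(CHUNK_LOWER_BOUNDS, key_value) - 1
    return CHUNK_OWNERS[idx]


if __name__ == "__main__":
    for value in (-2**63, -1, 0, 2**63 - 1):
        print(value, "->", shard_for(value))
```

With hash-based sharding the value looked up in such a table would be the hash of the shard key field rather than the field itself.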

There are some special cases, described in the docs: http://docs.mongodb.org/manual/core/indexes/. What the presentations above do not cover are the geospatial indexes and one special type, the hashed index. A hashed index can also be used as a shard key, and in that case the sharding is hash-based sharding. See the MongoDB documentation on hashed indexes and hashed shard keys for details.

The hashing algorithm used for this is MD5. It is used in this file: https://github.com/mongodb/mongo/blob/master/src/mongo/db/hasher.cpp

and implemented here: https://github.com/mongodb/mongo/blob/master/src/mongo/util/md5.cpp
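
To give a rough feel for how an MD5 digest can become a 64-bit hashed key, here is a small sketch. It assumes the 128-bit digest is truncated to its first 8 bytes and read as a signed little-endian integer, which is my reading of hasher.cpp. The real code also mixes in a seed and hashes a canonical BSON encoding of the value, both omitted here, so `hash64_sketch` (a hypothetical name) will not reproduce the values MongoDB stores.

```python
import hashlib


def hash64_sketch(data: bytes) -> int:
    """Illustrative only: reduce an MD5 digest to a signed 64-bit integer.

    Assumption: the first 8 digest bytes are interpreted as a
    little-endian signed integer. Seed and BSON canonicalization
    are deliberately left out.
    """
    digest = hashlib.md5(data).digest()  # 16 bytes
    return int.from_bytes(digest[:8], byteorder="little", signed=True)


if __name__ == "__main__":
    for raw in (b"user-1", b"user-2", b"user-3"):
        print(raw, hash64_sketch(raw))
```

Neighbouring inputs end up far apart in the 64-bit range, which is the property that lets hash-based sharding spread monotonically increasing keys across shards.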

Currently this works only with a single field as the shard key; at least that is what the comments in the https://github.com/mongodb/mongo/blob/master/src/mongo/db/index/hash_access_method.cpp source file suggest.

attish answered Nov 02 '22 12:11