I can't find any documentation on the algorithm that MongoDB uses for collection keys or shard keys.
Can anyone explain it or post a reference?
If you are more interested in how indexing works in general, check this presentation about the internals: http://www.mongodb.com/presentations/storage-engine-internals or this one: http://www.mongodb.com/presentations/mongodbs-storage-engine-bit-bit
An individual shard knows little about the overall structure of the cluster, so internally it uses the same indexing algorithm as an unsharded server. On top of that sits a metadata layer that knows which part of the data belongs to which shard.
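To make the idea of that metadata layer concrete, here is a minimal sketch of range-based routing. It is not MongoDB's code: the chunk table, the shard names, and the `route` function are hypothetical and only illustrate how a lookup table of key ranges can decide which shard owns a document, while each shard indexes its own data as usual.

```python
from bisect import bisect_right

# Hypothetical chunk table (an assumption for illustration only).
# Chunk i covers shard key values from CHUNK_LOWER_BOUNDS[i] (inclusive)
# up to CHUNK_LOWER_BOUNDS[i + 1] (exclusive); the last chunk is open-ended.
CHUNK_LOWER_BOUNDS = [float("-inf"), 100, 200, 300]
CHUNK_OWNERS = ["shardA", "shardB", "shardA", "shardC"]


def route(shard_key_value):
    """Return the name of the shard whose chunk contains the given key value."""
    # bisect_right finds the first lower bound greater than the value,
    # so the chunk that contains the value is the one just before it.
    idx = bisect_right(CHUNK_LOWER_BOUNDS, shard_key_value) - 1
    return CHUNK_OWNERS[idx]


if __name__ == "__main__":
    for value in (50, 100, 250, 1000):
        print(value, "->", route(value))
```

The router only needs the table of ranges; it never has to look inside a shard's own indexes.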
There are some special cases, which are described in the docs: http://docs.mongodb.org/manual/core/indexes/ The ones not covered by the presentations above are the geospatial indexes and the hashed index. A hashed index can also be used as a shard key, in which case the sharding is hash-based.
The hashing algorithm used for this is MD5. It is used in this file: https://github.com/mongodb/mongo/blob/master/src/mongo/db/hasher.cpp
and implemented here: https://github.com/mongodb/mongo/blob/master/src/mongo/util/md5.cpp
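A rough sketch of what hash-based sharding on top of MD5 can look like is below. Several things here are assumptions rather than facts taken from the source files: the function names `hash64` and `pick_chunk` are made up, the key is hashed as raw bytes (MongoDB hashes a canonical form of the BSON element, so these values will not match what the server computes), and truncating the digest to a signed 64-bit integer with a seed prepended is my reading of `hasher.cpp`, not something to rely on.

```python
import hashlib
import struct


def hash64(key_bytes: bytes, seed: int = 0) -> int:
    """MD5 the seed plus key and keep the first 8 bytes as a signed 64-bit int.

    Assumption: this mirrors the general shape of a seeded MD5 hash truncated
    to 64 bits; it does NOT reproduce MongoDB's BSON-aware hashing.
    """
    digest = hashlib.md5(struct.pack("<i", seed) + key_bytes).digest()
    return struct.unpack("<q", digest[:8])[0]


def pick_chunk(hashed: int, num_chunks: int) -> int:
    """Map a signed 64-bit hash onto one of num_chunks equal-width ranges."""
    # Shift the value into [0, 2**64) and scale it down to [0, num_chunks).
    return (hashed + 2**63) * num_chunks // 2**64


if __name__ == "__main__":
    for key in (b"user1", b"user2", b"user3"):
        h = hash64(key)
        print(key, h, "-> chunk", pick_chunk(h, 4))
```

The point of hashing first is that keys which are close together (such as increasing IDs) get scattered across the hash range, so inserts spread over the chunks instead of all landing in the last one.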
Currently a hashed index works only on a single field as shard key; at least, that is what the comments in the https://github.com/mongodb/mongo/blob/master/src/mongo/db/index/hash_access_method.cpp source file suggest.