let say, the document is
{
x:Number
}
and I have 3 shards.
Instead of autosharding, can I define specifically shard1 only contains data x<0, shard2 only contains data 0 =< x =< 1000 , and shard 3 is 1000
The Databases section lists information on the database(s). For each database, the section displays the name, whether the database has sharding enabled, and the primary shard for the database. The Sharded Collection section provides information on the sharding details for sharded collection(s).
Range-based sharding involves dividing data into contiguous ranges determined by the shard key values. In this model, documents with "close" shard key values are likely to be in the same chunk or shard. This allows for efficient queries where reads target documents within a contiguous range.
The shard key is either a single indexed field or multiple fields covered by a compound index that determines the distribution of the collection's documents among the cluster's shards.
MongoDB uses sharding to support deployments with very large data sets and high throughput operations. Database systems with large data sets or high throughput applications can challenge the capacity of a single server.
You can. It's possible to pre-split chunks manually, it's described here: http://www.mongodb.org/display/DOCS/Splitting+Chunks
Think carefully about how you split your chunks. If you do it badly you can get lots of performance problems, but if you know enough about your keys you can gain a lot.
If you do it you probably want to turn off the balancer:
> use config
> db.settings.update({_id: "balancer"}, {$set: {stopped: true}}, true);
(this is described here: http://www.mongodb.org/display/DOCS/Sharding+Administration)
This is an example of how you might do it. Depending on exactly what you want to do you will have to modify it (I assume your shard key is not named x
, for example, and your range isn't -1000 to 2000).
> use admin
> db.runCommand({split: "my_db.my_coll", middle: {x: 0}})
> db.runCommand({split: "my_db.my_coll", middle: {x: 1000}})
> db.runCommand({movechunk: "my_db.my_coll", find: {x: -1}, to: "shard_1_name"})
> db.runCommand({movechunk: "my_db.my_coll", find: {x: 0}, to: "shard_2_name"})
> db.runCommand({movechunk: "my_db.my_coll", find: {x: 1000}, to: "shard_3_name"})
The split
commands create the chunks. Each command splits the chunk containing the middle value into two, so the first splits the chunk containing min_value -> max_value
into min_value -> 0
and 0 -> max_value
. Then the second command splits the chunk containing 1000, the second chunk created by the previous command, into two new chunks. After that command you have three chunks:
min_value -> 0
0 -> 1000
1000 -> max_value
The three following commands moves these chunks to separate shards. The docs say that the command will move the chunk containing the value in find
, so I picked three values I know are in different chunks and used these (there is a symbol in BSON for min_key
and max_key
, but I'm not sure how to use it properly in this context).
Read this page too http://www.mongodb.org/display/DOCS/Moving+Chunks
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With