I am new to MongoDB and have only a very basic understanding of how its sharding works. I was wondering whether it is possible to control the split of data yourself. For example, could a part of the records be stored on one specific shard? This will be used together with a Rails app.
You can turn off the balancer to stop auto balancing:
sh.setBalancerState(false)
If you know the range of the key you are splitting on, you can also pre-split your data ranges and place them on the desired servers; see the PreSplitting example. The management of the shards would be done via the JavaScript shell and not via your Rails application.
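As a rough sketch of what pre-splitting from the shell can look like, the helper below turns the balancer off, creates a chunk boundary at each split point with sh.splitAt, and then moves the chunk starting at each boundary with sh.moveChunk. The function name presplit, its parameters, and the namespace, key field and shard names in the usage note are made up for illustration. It assumes the collection has already been sharded on that key.

```javascript
// Sketch: pre-split a sharded collection and pin each range to a chosen shard.
// `shHelper` is the shell's `sh` object (or a stub). The namespace, shard key
// field, split points and shard names are supplied by the caller.
function presplit(shHelper, namespace, keyField, splitPoints, shardNames) {
  // Stop the balancer first so it does not move the chunks elsewhere.
  shHelper.setBalancerState(false);

  // Create a chunk boundary at every split point.
  splitPoints.forEach(function (point) {
    var boundary = {};
    boundary[keyField] = point;
    shHelper.splitAt(namespace, boundary);
  });

  // Move the chunk that starts at each split point to a target shard,
  // assigning the targets round-robin. The chunk below the first split
  // point stays where it is.
  splitPoints.forEach(function (point, i) {
    var query = {};
    query[keyField] = point;
    shHelper.moveChunk(namespace, query, shardNames[i % shardNames.length]);
  });
}
```

In the shell this might be called as presplit(sh, "mydb.users", "userId", [1000, 2000, 3000], ["shard0000", "shard0001"]), where all of those names are hypothetical. Running sh.status() afterwards shows where the chunks ended up.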
Take care that no shard gets more load than the others (becomes hot); that is why auto balancing is enabled by default. Monitoring, for example with the free MMS service, will help you keep an eye on that.
The decision to shard is a complex decision and one that you should put a lot of thought into.
There's a lot to learn about sharding, and much of it is non-obvious. I'd suggest reviewing the information at the following links:
In the context of a sharded cluster, a chunk is a contiguous range of shard key values assigned to a particular shard. By default, chunks are 64 megabytes (unless the chunk size has been changed). When a chunk grows beyond the configured chunk size, a mongos splits it into two chunks. MongoDB chunks are logical: the data within them is NOT physically stored together.
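To make the idea of a chunk as a logical key range more concrete, here is a toy model in plain JavaScript. It is not how MongoDB stores its metadata; the chunk list, shard names and key values are invented, and Infinity stands in for the open ends of the key space.

```javascript
// Toy model of chunks: each chunk covers [min, max) of shard key values and
// belongs to one shard. Shard names and key values are made up.
var chunks = [
  { min: -Infinity, max: 100, shard: "shardA" },
  { min: 100, max: 200, shard: "shardB" },
  { min: 200, max: Infinity, shard: "shardA" }
];

// Return the chunk whose range contains the given shard key value.
function findChunk(chunkList, key) {
  return chunkList.filter(function (c) {
    return key >= c.min && key < c.max;
  })[0];
}

// Split the chunk containing `splitPoint` into two chunks on the same shard.
// Only the range bookkeeping changes; no data is moved by a split.
function splitChunk(chunkList, splitPoint) {
  var target = findChunk(chunkList, splitPoint);
  var result = [];
  chunkList.forEach(function (c) {
    if (c === target && splitPoint > c.min) {
      result.push({ min: c.min, max: splitPoint, shard: c.shard });
      result.push({ min: splitPoint, max: c.max, shard: c.shard });
    } else {
      result.push(c);
    }
  });
  return result;
}
```

Calling splitChunk(chunks, 150) turns the middle chunk into two chunks, [100, 150) and [150, 200), both still on the same shard; moving one of them elsewhere is a separate step.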
As I've mentioned, the balancer moves the chunks around, but you can also do this manually. The balancer decides to re-balance and requests a chunk migration if there is a large enough difference (a minimum of 8) between the number of chunks on each shard. The actual move is coordinated between the "From" and "To" shards; when it is finished, the original chunks are removed from the "From" shard and the config servers are informed.
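The threshold rule can be sketched as a small function. This is a simplified model and not MongoDB's actual balancer code: the name pickMigration is hypothetical, the default threshold of 8 is simply the figure quoted above, and the exact threshold may differ between server versions.

```javascript
// Simplified model of the balancer's decision, NOT MongoDB's real code.
// `chunkCounts` maps shard name -> number of chunks on that shard.
// `threshold` defaults to the figure of 8 quoted in the answer.
function pickMigration(chunkCounts, threshold) {
  if (threshold === undefined) threshold = 8;
  var names = Object.keys(chunkCounts);
  if (names.length < 2) return null; // nothing to balance against

  // Find the shard with the most chunks and the one with the fewest.
  var from = names[0];
  var to = names[0];
  names.forEach(function (name) {
    if (chunkCounts[name] > chunkCounts[from]) from = name;
    if (chunkCounts[name] < chunkCounts[to]) to = name;
  });

  // Only request a migration when the imbalance reaches the threshold.
  if (chunkCounts[from] - chunkCounts[to] < threshold) return null;
  return { from: from, to: to };
}
```

With counts of 10 and 3 chunks the difference is 7, so the function returns null; with 11 and 3 it returns a migration from the fuller shard to the emptier one.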
Quite a lot of people also pre-split, which helps with their migration. See here for more information.
To see documents split between the two shards, you'll need to insert enough documents to fill up several chunks on the first shard. If you haven't changed the default chunk size, you'd need to insert a minimum of 512MB of data (8 chunks of 64MB, matching the balancer threshold above) before data is migrated to a second shard. It's often a good idea to test this, and you can do so by setting your chunk size to 1MB and inserting 10MB of data. Here is an example of how to test this.
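A minimal sketch of such a test, assuming a shell that supports updateOne and insertMany: the chunk size is stored in megabytes in the settings collection of the config database under the _id "chunksize", and the helper then loads filler documents of about 1KB each. The function name loadTestData and the field names userId and payload are hypothetical, and the collection is assumed to be sharded already. Splitting and balancing behaviour differs between server versions, so treat this as a starting point only.

```javascript
// Sketch: shrink the chunk size to 1MB and load roughly `totalBytes` of data.
// `configDb` is the shell's config database, e.g. db.getSiblingDB("config"),
// and `coll` is an already-sharded collection. Both can be replaced by stubs.
function loadTestData(configDb, coll, totalBytes) {
  // Chunk size is kept in MB in config.settings under _id "chunksize".
  configDb.settings.updateOne(
    { _id: "chunksize" },
    { $set: { value: 1 } },
    { upsert: true }
  );

  var payload = new Array(1024 + 1).join("x"); // 1KB of filler per document
  var docCount = Math.ceil(totalBytes / payload.length);
  var batch = [];
  for (var i = 0; i < docCount; i++) {
    batch.push({ userId: i, payload: payload }); // hypothetical field names
    // Flush every 1000 documents and once more at the very end.
    if (batch.length === 1000 || i === docCount - 1) {
      coll.insertMany(batch);
      batch = [];
    }
  }
  return docCount;
}
```

In the shell this could be run as loadTestData(db.getSiblingDB("config"), db.getSiblingDB("test").users, 10 * 1024 * 1024), with the database and collection names being placeholders, followed by sh.status() to see how the chunks are distributed.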