Mongo sharding fails to split large collection between shards

Tags: mongodb, sharding

I'm having problems with what seems to be a simple sharding setup in mongo.

I have two shards, a single mongos instance, and a single config server set up like this:

Machine A - 10.0.44.16 - config server, mongos
Machine B - 10.0.44.10 - shard 1
Machine C - 10.0.44.11 - shard 2

I have a collection called 'Seeds' with the shard key 'SeedType'. This field is present on every document in the collection and contains one of four values (see the sharding status below). Two of the values have significantly more entries than the other two: two have 784,000 records each, and two have about 5,000.

The behavior I'm expecting to see is that records in the 'Seeds' collection with InventoryPOS will end up on one shard, and the ones with InventoryOnHand will end up on the other.

However, it seems that all records for both of the larger shard key values end up on the primary shard.

Here's my sharding status text (other collections removed for clarity):

--- Sharding Status ---
  sharding version: { "_id" : 1, "version" : 3 }
  shards:
      { "_id" : "shard0000", "host" : "10.44.0.11:27019" }
      { "_id" : "shard0001", "host" : "10.44.0.10:27017" }
  databases:
        { "_id" : "admin", "partitioned" : false, "primary" : "config" }
        { "_id" : "TimMulti", "partitioned" : true, "primary" : "shard0001" }
                TimMulti.Seeds chunks:
                        { "SeedType" : { $minKey : 1 } } -->> { "SeedType" : "PBI.AnalyticsServer.KPI" } on : shard0000 { "t" : 2000, "i" : 0 }
                        { "SeedType" : "PBI.AnalyticsServer.KPI" } -->> { "SeedType" : "PBI.Retail.InventoryOnHand" } on : shard0001 { "t" : 2000, "i" : 7 }
                        { "SeedType" : "PBI.Retail.InventoryOnHand" } -->> { "SeedType" : "PBI.Retail.InventoryPOS" } on : shard0001 { "t" : 2000, "i" : 8 }
                        { "SeedType" : "PBI.Retail.InventoryPOS" } -->> { "SeedType" : "PBI.Retail.SKU" } on : shard0001 { "t" : 2000, "i" : 9 }
                        { "SeedType" : "PBI.Retail.SKU" } -->> { "SeedType" : { $maxKey : 1 } } on : shard0001 { "t" : 2000, "i" : 10 }

Am I doing anything wrong?

Semi-unrelated question:

What is the best way to atomically transfer an object from one collection to another without blocking the entire mongo service?

Thanks in advance, -Tim


asked Sep 10 '10 00:09

Tim

1 Answer

Sharding really isn't meant to be used this way. You should choose a shard key with some variation (or make a compound shard key) so that MongoDB can make reasonable-size chunks. One of the points of sharding is that your application doesn't have to know where your data is.
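To make the point about shard key variation concrete, here is a rough sketch. It is a toy model, not MongoDB's actual splitting code. It relies on one fact: a chunk boundary can only fall between two distinct shard key values, so documents that share a key value always stay in the same chunk. The `max_nonempty_chunks` helper and the `SeedId` field are hypothetical names used only for illustration, and the document counts are scaled down from the question.

```python
def max_nonempty_chunks(docs, key_fields):
    """Upper bound on non-empty chunks: one per distinct shard key value.

    Toy model: a split point must fall between two distinct key values,
    so documents with identical key values always share a chunk.
    """
    distinct_keys = {tuple(doc[f] for f in key_fields) for doc in docs}
    return len(distinct_keys)


# Small stand-in for the Seeds collection (counts scaled down by 1000).
docs = []
for seed_type, count in [
    ("PBI.AnalyticsServer.KPI", 5),
    ("PBI.Retail.InventoryOnHand", 784),
    ("PBI.Retail.InventoryPOS", 784),
    ("PBI.Retail.SKU", 5),
]:
    for i in range(count):
        # "SeedId" is a hypothetical second field, not taken from the question.
        docs.append({"SeedType": seed_type, "SeedId": i})

# Shard key on SeedType alone: only four values, so at most four useful chunks.
print(max_nonempty_chunks(docs, ["SeedType"]))
# Compound key: every document is distinct, so many more split points exist.
print(max_nonempty_chunks(docs, ["SeedType", "SeedId"]))
```

With SeedType alone, each of the two big values is a single chunk that cannot be split any further, however large it grows. A compound key gives the splitter many more places to cut.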

If you want to manually shard, you should do that: start unlinked MongoDB servers and route things yourself from the client side.

Finally, if you're really dedicated to this setup, you could migrate the chunk yourself (there's a moveChunk command).
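As a hedged sketch of what a manual migration could look like, the snippet below builds a moveChunk command document for the InventoryPOS chunk, using the namespace and shard names from the sharding status in the question. The `build_move_chunk_command` helper is a hypothetical name. Sending the command with a driver is shown only as a comment, since it assumes pymongo and a connection to mongos, neither of which appears in this thread.

```python
def build_move_chunk_command(namespace, shard_key_query, to_shard):
    """Build the document for the moveChunk admin command.

    The command name must be the first key; Python dicts keep insertion order.
    """
    return {
        "moveChunk": namespace,    # "<database>.<collection>"
        "find": shard_key_query,   # any shard key value inside the chunk to move
        "to": to_shard,            # _id of the destination shard
    }


cmd = build_move_chunk_command(
    "TimMulti.Seeds",
    {"SeedType": "PBI.Retail.InventoryPOS"},
    "shard0000",
)
print(cmd)

# Assumption: with pymongo and a client connected to mongos, the command
# would be run against the admin database, for example:
#   client.admin.command(cmd)
```

This moves only the chunk that contains the given key value. With the setup from the question, the InventoryOnHand chunk would stay on shard0001.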

The balancer moves chunks based on how much is mapped in memory (run serverStatus and look at the "mapped" field). It can take a while: MongoDB doesn't want your data flying all over the place in production, so it's pretty conservative.

Semi-unrelated answer: you can't do it atomically with sharding (eval isn't atomic across multiple servers). You'll have to do a findOne, insert, remove.
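The findOne, insert, remove sequence could be sketched roughly as follows. This is an illustration under assumptions, not a definitive implementation: `move_document` is a hypothetical helper, and it expects collection objects with pymongo-style `find_one`, `insert_one` and `delete_one` methods. `FakeCollection` is a small in-memory stand-in so the example runs without a server.

```python
def move_document(source, target, query):
    """Copy one matching document to `target`, then delete it from `source`.

    Not atomic: a crash between the insert and the delete leaves the document
    in both collections, so any retry logic must tolerate duplicates.
    """
    doc = source.find_one(query)
    if doc is None:
        return None
    target.insert_one(doc)
    # Delete by _id so exactly the copied document is removed.
    source.delete_one({"_id": doc["_id"]})
    return doc


class FakeCollection:
    """In-memory stand-in exposing the three pymongo-style methods used above."""

    def __init__(self, docs=()):
        self.docs = {d["_id"]: dict(d) for d in docs}

    def find_one(self, query):
        for doc in self.docs.values():
            if all(doc.get(k) == v for k, v in query.items()):
                return dict(doc)
        return None

    def insert_one(self, doc):
        self.docs[doc["_id"]] = dict(doc)

    def delete_one(self, query):
        match = self.find_one(query)
        if match is not None:
            del self.docs[match["_id"]]


# "pending" and "done" are hypothetical collection names.
pending = FakeCollection([{"_id": 1, "SeedType": "PBI.Retail.SKU"}])
done = FakeCollection()
moved = move_document(pending, done, {"_id": 1})
print(moved, len(pending.docs), len(done.docs))
```

Inserting before removing means a failure in the middle duplicates the document rather than losing it, which is usually the easier failure to clean up.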


answered Oct 23 '22 04:10

kristina
