Suppose I have a collection where I create a unique index on a field:
db.users.createIndex({username: 1}, {unique:true})
What happens if two documents with the same username are SIMULTANEOUSLY inserted into the collection?
How does the database prevent the collision? I mean, which one gets inserted and which one results in an error?
Assuming the inserts are really SIMULTANEOUS, there is no way for the database to know that two duplicates are being inserted, right?
So, what's really going on?
A unique index ensures that the values of the indexed field (or combination of fields) are unique across the collection. A unique constraint likewise guarantees that no duplicate values can be inserted for the field(s) it covers; when a unique constraint is created, a corresponding unique index is created automatically on those field(s).
In addition to enforcing the uniqueness of data values, a unique index can also be used to improve data retrieval performance during query processing.
In theory there is a slight difference in update performance, since the engine has to enforce uniqueness when maintaining a unique index, but in practice this costs at most a few CPU cycles per document, so the difference is unnoticeable.
An index is a structure used to speed up the retrieval of documents from a collection. A unique index additionally guarantees that no two documents in the collection have the same value in the key field (or fields).
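To see the enforcement in action, here is a minimal mongosh sketch (the collection and field names are taken from the question; "alice" is just a placeholder value):

db.users.createIndex({ username: 1 }, { unique: true })
db.users.insertOne({ username: "alice" })     // succeeds
try {
    db.users.insertOne({ username: "alice" }) // violates the unique index
} catch (e) {
    print(e.code)    // 11000, MongoDB's duplicate key error code
    print(e.message) // "E11000 duplicate key error collection: ..."
}

The second insertOne never reaches the data files: the server looks the key up in the index first and rejects the write.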
Writes cannot be applied simultaneously to the dataset. When a write is sent to a MongoDB instance, be it a shard or a standalone server, here is what happens, in broad strokes:

1. A lock protecting the data being written is acquired; writes that arrive while it is held are queued.
2. The write is checked against the applicable indexes. If a unique index already contains the key, the write is rejected with a duplicate key error.
3. Otherwise the change is applied to the data in RAM (and later synced to the journal and the data files).
4. The lock is released and the next queued write is processed.

So even "simultaneous" inserts are serialized at this point: whichever insert acquires the lock first passes the uniqueness check and succeeds, and the other then finds the value already in the index and fails. Which of the two wins is effectively arbitrary. A sketch of this race is shown below.
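As a sketch of the "simultaneous" case, the two conflicting inserts can be handed to the server in a single unordered batch; the server serializes them internally, so exactly one is inserted and the other comes back as a duplicate key write error ("bob" is again just a placeholder value):

try {
    db.users.insertMany(
        [{ username: "bob" }, { username: "bob" }],
        { ordered: false }
    )
} catch (e) {
    print(e) // reports the duplicate key error for the losing insert
}
db.users.countDocuments({ username: "bob" }) // 1: exactly one insert won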
That being said, we can look at the actual numbers. One million writes per second seems like a lot for a single server (simply because mass storage can't keep up), so assume a sharded cluster with 10 shards and a shard key that distributes the writes more or less evenly, giving roughly 100k writes per second per shard. As we have seen above, all operations are applied in RAM. With today's hardware, some 3.5 billion instructions per second can be processed, or 3.5 instructions per nanosecond. Let's assume getting and releasing a lock each take 35 instructions, or 10 nanoseconds. Locking and unlocking then cost 20 nanoseconds per write, so for the 100k writes a shard handles per second the total is 2,000,000 nanoseconds, or 1/500 of a second.
That would leave 499/500 of a second, or 998,000,000 nanoseconds, for the other work MongoDB needs to do, which translates to a whopping 3.493 billion instructions.
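The arithmetic can be checked with a few lines of plain JavaScript (runnable in mongosh; the constants mirror the assumptions above):

const writesPerShard = 100000   // 1M writes/s spread over 10 shards
const lockUnlockNs = 2 * 10     // 10 ns to acquire + 10 ns to release a lock
const lockingNs = writesPerShard * lockUnlockNs
print(lockingNs)                // 2000000 ns, i.e. 1/500 of a second
const remainingNs = 1000000000 - lockingNs
print(remainingNs * 3.5)        // 3493000000 instructions left for everything else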
The locks that prevent concurrent writes are far from being the limiting factor for write operations. Syncing the changes to the journal and the data files is usually the limiting factor, followed by having too little RAM to keep the indices and the working set in memory.