
How to create unique row ID in sharded databases?

In a non-sharded DB, I could just use auto-increment to generate a unique ID to reference a specific row.

I want to shard my DB, say into 12 shards. Now when I insert into a specific shard, the auto-increment ID is no longer unique.

I would like to hear how others have dealt with this problem.

Continuation asked Apr 25 '09 12:04




2 Answers

A few approaches

1) Give each shard its own ID, and use a composite key (shard ID plus the row's local ID)

2) Give each shard its own ID and set a separate ID range for each shard

3) Use a globally unique ID (GUID)
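
As a rough illustration of the three approaches above, here is a minimal Python sketch. It is not taken from any particular database: the `Shard` class, the `RANGE_SIZE` value, and the in-memory counter standing in for each shard's auto-increment column are all assumptions made for the example.

```python
import uuid
from itertools import count

NUM_SHARDS = 12            # the shard count mentioned in the question
RANGE_SIZE = 10**9         # assumed width of each shard's ID range


class Shard:
    """Hypothetical stand-in for one shard; not a real database client."""

    def __init__(self, shard_id):
        if not 0 <= shard_id < NUM_SHARDS:
            raise ValueError("unknown shard")
        self.shard_id = shard_id
        # Simulates the shard's local auto-increment column.
        self._local = count(1)

    def next_composite_key(self):
        # Approach 1: the key is the pair (shard ID, local ID).
        return (self.shard_id, next(self._local))

    def next_ranged_id(self):
        # Approach 2: shard k hands out IDs only from
        # [k * RANGE_SIZE, (k + 1) * RANGE_SIZE).
        local = next(self._local)
        if local >= RANGE_SIZE:
            raise OverflowError("shard has exhausted its ID range")
        return self.shard_id * RANGE_SIZE + local


def next_guid():
    # Approach 3: a random UUID needs no coordination between shards.
    return str(uuid.uuid4())
```

With this sketch, two different shards can both be at local ID 1 and still produce distinct keys, either as distinct tuples or as integers in non-overlapping ranges.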

MrTelly answered Nov 15 '22 09:11



Two approaches I've used for this sort of problem:

  • GUID: Easy to implement, though it creates larger tables and indexes.
  • ID Domain: I made that term up, but basically it means dividing the 32 (or 64) bits of an integer type into two parts, where the top part represents a domain. The number of bits to use for the domain depends on how many domains you want to support versus the number of records you expect a single domain to introduce. In this approach you allocate a domain to each shard. The downside is that DBs (that I know of) do not support this approach directly, so you need to code ID allocation yourself.
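
A minimal sketch of the ID Domain idea in Python, assuming a 64-bit ID with the top 8 bits reserved for the domain. The bit split, the `DomainIdAllocator` class, and the in-memory counter are illustrative assumptions; a real allocator would need to persist its counter.

```python
TOTAL_BITS = 64
DOMAIN_BITS = 8                          # assumed: room for 256 domains
LOCAL_BITS = TOTAL_BITS - DOMAIN_BITS    # 56 bits left for per-domain records
LOCAL_MASK = (1 << LOCAL_BITS) - 1


class DomainIdAllocator:
    """Hypothetical per-shard allocator; each shard is given one domain."""

    def __init__(self, domain):
        if not 0 <= domain < (1 << DOMAIN_BITS):
            raise ValueError("domain does not fit in DOMAIN_BITS")
        self.domain = domain
        self._next_local = 1   # in-memory only for this sketch

    def allocate(self):
        if self._next_local > LOCAL_MASK:
            raise OverflowError("domain has run out of local IDs")
        # Domain goes in the top bits, local counter in the low bits.
        new_id = (self.domain << LOCAL_BITS) | self._next_local
        self._next_local += 1
        return new_id


def split_id(row_id):
    """Recover (domain, local ID) from a packed ID."""
    return row_id >> LOCAL_BITS, row_id & LOCAL_MASK
```

Because the domain can be read back out of any ID with `split_id`, the ID itself tells you which shard owns the row. If the IDs are stored in a signed 64-bit column, domains that set the top bit will appear as negative numbers, so you may prefer to treat the ID as 63 bits wide.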
AnthonyWJones answered Nov 15 '22 09:11
