
Sharding at application level

Tags: nginx, sharding

I am designing a multi-tenant system and am considering sharding by tenant at the application layer rather than at the database level.

Hypothetically, it would work like this: for each incoming request, a router process consults a global collection of tenants, which holds the primary attributes needed to identify the request's tenant along with that tenant's virtual shard id. The virtual shard id is then mapped to an actual shard.
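As a rough sketch of that two-step lookup, assuming the tenant is identified by the request's Host header (the question does not say which attributes are used), it might look like the following in Python. The names `Tenant`, `TENANTS`, `VIRTUAL_TO_PHYSICAL`, and `resolve_shard`, as well as all host names, are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tenant:
    tenant_id: str
    virtual_shard: int


# Hypothetical global tenant collection: request host -> tenant record.
# In practice this would be loaded from a local db or from files.
TENANTS = {
    "acme.example.com": Tenant("acme", 7),
    "globex.example.com": Tenant("globex", 12),
}

# Hypothetical mapping of virtual shard ids to actual shard hosts.
VIRTUAL_TO_PHYSICAL = {
    7: "shard-a.internal",
    12: "shard-b.internal",
}


def resolve_shard(host: str) -> Optional[str]:
    """Return the actual shard host for a request host, or None if unknown."""
    tenant = TENANTS.get(host.lower())
    if tenant is None:
        return None
    return VIRTUAL_TO_PHYSICAL.get(tenant.virtual_shard)
```

One reason to keep the virtual shard id as an extra level of indirection: moving a virtual shard to different hardware only changes an entry in `VIRTUAL_TO_PHYSICAL`, and the tenant records stay untouched.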

The actual shard contains both the application code and all of the data for this tenant. These shards would be LNMP (Linux, Nginx, MySQL/MongoDB, PHP) servers.

The router process should act as a proxy. It should be able to run some code to determine the target shard for an incoming request, based on the collection stored in a local db or in files. To scale this better, I am considering making the shards themselves act as routers too, so that each one runs a reverse proxy that forwards the request to the appropriate shard. Maybe the nginx instance running on each shard can also act as that reverse proxy. But how would it execute the application logic needed to match the request with the appropriate shard?
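Independently of how it gets wired into nginx, the decision each shard-as-router would have to make can be sketched as below. This is only an illustration under assumptions: the table `TENANT_TO_SHARD`, the `Decision` type, the `decide` function, and the `already_forwarded` flag (for example, set from a custom request header by the forwarding shard) are all hypothetical and not part of the design described above.

```python
from typing import NamedTuple, Optional


class Decision(NamedTuple):
    action: str              # "serve_local", "forward", or "reject"
    upstream: Optional[str]  # target shard host when action == "forward"


# Hypothetical lookup table replicated to every shard:
# request host -> host name of the shard that owns the tenant.
TENANT_TO_SHARD = {
    "acme.example.com": "shard-a.internal",
    "globex.example.com": "shard-b.internal",
}


def decide(host: str, local_shard: str, already_forwarded: bool = False) -> Decision:
    """Decide whether this shard serves the request or forwards it."""
    target = TENANT_TO_SHARD.get(host.lower())
    if target is None:
        return Decision("reject", None)
    if target == local_shard:
        return Decision("serve_local", None)
    if already_forwarded:
        # Another shard already forwarded this request here, so the lookup
        # tables disagree; refuse instead of bouncing the request around.
        return Decision("reject", None)
    return Decision("forward", target)
```

The `already_forwarded` guard is a design choice rather than a requirement: when every shard can route, a stale copy of the table on one of them could otherwise cause requests to loop between shards.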

I would appreciate any ideas and suggestions for this router implementation.

Thanks

msingla asked Mar 31 '11 18:03




1 Answer

Another option would be to use a product such as dbShards. As far as I know, dbShards is the only sharding product that shards at the application level. This means you can use any RDBMS (Postgres, MySQL, etc.) and still shard your database without putting a proxy in between. Many other sharding products rely on a proxy to direct transactions to the correct shard, but dbShards knows where to go without having to "ask" anything else.

Great product: dbShards.
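To illustrate what picking the shard in the application without a proxy can mean in general, here is a minimal Python sketch of hash-based shard selection in the data access layer. It is not dbShards' API and does not claim to reflect how dbShards works internally; `SHARD_DSNS`, `shard_index`, and `dsn_for` are hypothetical names, and the DSNs are placeholders.

```python
import hashlib

# Hypothetical connection strings, one per database shard.
SHARD_DSNS = [
    "mysql://db0.internal/app",
    "mysql://db1.internal/app",
    "mysql://db2.internal/app",
]


def shard_index(shard_key: str, shard_count: int) -> int:
    """Map a shard key (e.g. a tenant id) to a stable shard number."""
    # Use a cryptographic digest rather than the built-in hash(), which is
    # randomized per process for strings and so is not stable across restarts.
    digest = hashlib.sha256(shard_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def dsn_for(shard_key: str) -> str:
    """Return the connection string of the shard that owns this key."""
    return SHARD_DSNS[shard_index(shard_key, len(SHARD_DSNS))]
```

With plain modulo hashing like this, changing the number of shards remaps most keys, which is one reason a lookup table or virtual shard ids are often used instead.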

chantheman answered Nov 15 '22 13:11