
What is the recommended approach towards multi-tenant databases in Cassandra?

I'm thinking of creating a multi-tenant app using Apache Cassandra.

I can think of three strategies:

  1. All tenants in the same keyspace using tenant-specific fields for security
  2. Table per tenant in a single shared DB
  3. Keyspace per tenant

The voice in my head is suggesting that I go with option 3.

Thoughts and implications, anyone?

Jagan asked Dec 17 '22 19:12


1 Answer

There are several considerations that you need to take into account:

Option 1: In plain Apache Cassandra, this works only if all access to the database goes through a "proxy", for example an API that enforces filtering on the tenant field. If you expose CQL access directly, every tenant can read all of the data. You also need to design the data model carefully so that the tenant is part of a composite partition key. DataStax Enterprise (DSE) has additional functionality called row-level access control (RLAC), which allows you to set permissions at the row level within a table.
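As a rough illustration of option 1, here is a minimal sketch of a table whose composite partition key includes the tenant, plus a helper that a proxy/API layer could use so that every query is scoped to one tenant. The keyspace `app`, the table `orders`, its columns, and the function `tenant_select` are all hypothetical names chosen for this example. The `%s` placeholders assume the parameter style of the DataStax Python driver for non-prepared statements; nothing here connects to a cluster.

```python
import re

# Hypothetical schema: tenant_id is part of the composite partition key,
# so every partition belongs to exactly one tenant.
ORDERS_DDL = """
CREATE TABLE IF NOT EXISTS app.orders (
    tenant_id   text,
    customer_id text,
    order_id    timeuuid,
    total       decimal,
    PRIMARY KEY ((tenant_id, customer_id), order_id)
)
"""

# Conservative pattern for unquoted identifiers used in this sketch.
_IDENTIFIER = re.compile(r"^[a-z][a-z0-9_]*$")


def tenant_select(table, tenant_id, filters):
    """Build a parameterized SELECT that is always scoped to one tenant.

    Returns (cql, params). Values are bound as parameters, never
    interpolated into the CQL string.
    """
    if not tenant_id:
        raise ValueError("tenant_id is required")
    if "tenant_id" in filters:
        raise ValueError("callers may not supply tenant_id themselves")
    # Validate the (optionally keyspace-qualified) table and column names.
    for name in (*table.split("."), *filters):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"invalid identifier: {name!r}")
    columns = ["tenant_id", *filters]
    where = " AND ".join(f"{col} = %s" for col in columns)
    params = (tenant_id, *filters.values())
    return f"SELECT * FROM {table} WHERE {where}", params


if __name__ == "__main__":
    cql, params = tenant_select("app.orders", "acme", {"customer_id": "c42"})
    print(cql)
    print(params)
```

The point of the design is that the tenant id comes from the authenticated session inside the proxy, not from the caller's request, so a client cannot widen the query to another tenant's partitions.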

Options 2 & 3: These are quite similar, except that a keyspace per tenant gives you the flexibility to set up a different replication strategy for each tenant. This can be useful for storing each customer's data in data centers bound to different geographic regions. In both cases, however, there is a limit on the number of tables in the cluster: a reasonable number is around 200, with a "hard stop" above 500. The reason is that every table needs additional resources, such as memory, to keep auxiliary data structures (bloom filters, etc.), and these consume both heap and off-heap memory.
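To make the trade-off in options 2 and 3 concrete, the sketch below generates a per-tenant `CREATE KEYSPACE` statement using `NetworkTopologyStrategy` with per-data-center replication factors, and checks a planned tenant count against the table limits quoted above. The `tenant_` keyspace prefix, the data center names, and both function names are assumptions made for illustration; the 200 and 500 thresholds are simply the figures from this answer, not values read from Cassandra.

```python
import re

SOFT_TABLE_LIMIT = 200  # "reasonable" number of tables, per the answer
HARD_TABLE_LIMIT = 500  # "hard stop", per the answer

_TENANT = re.compile(r"^[a-z][a-z0-9_]*$")
_DC_NAME = re.compile(r"^[A-Za-z0-9_-]+$")


def create_keyspace_cql(tenant, dc_replicas):
    """Build CREATE KEYSPACE CQL for one tenant.

    dc_replicas maps a data center name to its replication factor,
    e.g. {"dc_eu": 3} to keep a tenant's data in one region.
    """
    if not _TENANT.match(tenant):
        raise ValueError(f"invalid tenant name: {tenant!r}")
    if not dc_replicas:
        raise ValueError("at least one data center is required")
    options = ["'class': 'NetworkTopologyStrategy'"]
    for dc, rf in sorted(dc_replicas.items()):
        if not _DC_NAME.match(dc) or rf < 1:
            raise ValueError(f"invalid replication setting: {dc!r}={rf!r}")
        options.append(f"'{dc}': {rf}")
    replication = "{" + ", ".join(options) + "}"
    return (
        f"CREATE KEYSPACE IF NOT EXISTS tenant_{tenant} "
        f"WITH replication = {replication}"
    )


def table_budget(tenants, tables_per_tenant):
    """Return (total_tables, verdict) for a table- or keyspace-per-tenant plan."""
    total = tenants * tables_per_tenant
    if total <= SOFT_TABLE_LIMIT:
        return total, "ok"
    if total <= HARD_TABLE_LIMIT:
        return total, "risky"
    return total, "over"


if __name__ == "__main__":
    print(create_keyspace_cql("acme", {"dc_eu": 3}))
    for tenants in (10, 30, 50):
        print(tenants, "tenants x 15 tables:", table_budget(tenants, 15))
```

Because the limit applies to the cluster-wide table count, the total grows with tenants multiplied by tables per tenant under both option 2 and option 3, so the check is the same for either layout.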

Alex Ott answered May 10 '23 09:05