Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Best approach for multi-tenant primary keys

I have a database used by several clients. I don't really want surrogate incremental key values to bleed between clients. I want the numbering to start from 1 and be client specific.

I'll use a two-part composite key of the tenant_id as well as an incremental id.

What is the best way to create an incremental key per tenant?

I am using SQL Server Azure. I'm concerned about locking tables, duplicate keys, etc. I'd typically set the primary key to IDENTITY and move on.

Thanks

like image 322
Paul Deen Avatar asked Sep 16 '12 20:09

Paul Deen


People also ask

Which database is best for multi tenants?

SQL Database supports row-level security, which can enforce that data returned from a query be scoped to a single tenant. Processing: A multi-tenant database shares compute and storage resources across all its tenants.

Can you have 3 primary keys?

Primary keys must contain UNIQUE values, and cannot contain NULL values. A table can have only ONE primary key; and in the table, this primary key can consist of single or multiple columns (fields).

What are the three multi-tenancy models?

There are three multi-tenancy models: Database, Schema, and Table. In Database multi-tenancy, the application connects to a database and gets data while the tenancy logic is delegated to the ops layer.

How do you implement multi-tenancy?

We can implement multi-tenancy using any of the following approaches: Database per Tenant: Each Tenant has its own database and is isolated from other tenants. Shared Database, Shared Schema: All Tenants share a database and tables. Every table has a Column with the Tenant Identifier, that shows the owner of the row.


2 Answers

Are you planning on using SQL Azure Federations in the future? If so, the current version of SQL Azure Federations does not support the use of IDENTITY as part of a clustered index. See this What alternatives exist to using guid as clustered index on tables in SQL Azure (Federations) for more details.

If you haven't looked at Federations yet, you might want to check it out as it provides an interesting way to both shard the database and for tenant isolation within the database.

Depending upon your end goal, using Federations you might be able to use a GUID as the primary clustered index on the table and also use an incremental INT IDENTITY field on the table. This INT IDENTITY field could be shown to end-users. If you are federating on the TenantID each "Tenant table" effectively becomes a silo (as I understand it at least) so the use of IDENTITY on a field within that table would effectively be an ever increasing auto generated value which increments within a given Tenant.

When \ if data is merged together (combining data from multiple Tenants) you would wind up with collisions on this INT IDENTITY field (hence why IDENTITY isn't supported as a primary key in federations) but as long as you aren't using this field as a unique identifier within the system at large you should be ok.

like image 135
Tim Lentine Avatar answered Oct 02 '22 14:10

Tim Lentine


If you're looking to duplicate the convenience of having an automatically assigned unique INT key upon insert, you could add an INSTEAD OF INSERT trigger that uses MAX of the existing column +1 to determine the next value.

If the column with the identity value is the first key in an index, the MAX query will be a simple index seek, very efficient.

Transactions will ensure that unique values are assigned but this approach will have different locking semantics than the standard identity column. IIRC, SQL Server can allocate a different identity value for each transaction that requests it in parallel and if a transaction is rolled back, the value(s) allocated to it are discarded. The MAX approach would only allow one transaction to insert rows into the table at a time.

A related approach could be to have a dedicated key value table keyed by the table name, tenant ID and current identity value. It would require the same INSTEAD OF INSERT trigger and more boilerplate to query and keep that key table updated. It wouldn't improve parallel operations though; the lock would just be on a different table's record.

One possibility to fix the locking bottleneck would be to include the current SPID in the key's value (now the identity key is a combination of sequential int and whatever SPID happened to allocate it and not simply sequential), use the dedicated identity value table and insert records there per SPID as necessary; the identity table PK would be (table name, tenant, SPID) and have a non-key column with the current sequential value. That way, each SPID would have its own dynamically allocated identity pool and would only ever have its own SPID specific records locked.

Another downside is maintaining triggers that have to be updated whenever you change the columns in any of the special identity tables.

like image 25
Chris Smith Avatar answered Oct 02 '22 12:10

Chris Smith