Generating cryptographically secure IDs instead of sequential identity / auto increment

I've been facing this dilemma for a while and couldn't find any pointers on it, although it seems that someone ought to have done it already.

What I need is to replace sequential AUTO_INCREMENT (or equivalent) primary keys with cryptographically secure (i.e. non-consecutive!) IDs, but at the same time I want to keep the performance advantages of sequential PKs: a guaranteed-unused next ID, clusterability, etc.

A simple approach would seem to be to implement a cryptographic pseudo-random permutation generator that uniquely maps the 2^N space onto 2^N without collisions, using an initialisation vector (IV).
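To make the idea concrete, here is a minimal sketch of such a permutation in Python, built as a balanced Feistel network with HMAC-SHA256 as the round function. The names permute, unpermute and _round, the 32-bit width and the 4 rounds are all assumptions chosen for illustration; this is not a vetted cipher, and a standardized format-preserving encryption mode would be the safer choice in production.

```python
import hashlib
import hmac


def _round(key: bytes, round_no: int, half: int, half_bits: int) -> int:
    """Keyed round function: HMAC-SHA256 output truncated to half_bits bits."""
    msg = bytes([round_no]) + half.to_bytes(8, "big")
    digest = hmac.new(key, msg, hashlib.sha256).digest()
    return int.from_bytes(digest[:8], "big") & ((1 << half_bits) - 1)


def permute(n: int, key: bytes, bits: int = 32, rounds: int = 4) -> int:
    """Map n in [0, 2**bits) to a unique value in the same range."""
    if bits % 2 or not 0 <= n < (1 << bits):
        raise ValueError("bits must be even and n must fit in bits")
    half_bits = bits // 2
    mask = (1 << half_bits) - 1
    left, right = n >> half_bits, n & mask
    for r in range(rounds):
        # Classic Feistel step: swap halves, mix one half with the round output.
        left, right = right, left ^ _round(key, r, right, half_bits)
    return (left << half_bits) | right


def unpermute(m: int, key: bytes, bits: int = 32, rounds: int = 4) -> int:
    """Inverse of permute: runs the same rounds in reverse order."""
    half_bits = bits // 2
    mask = (1 << half_bits) - 1
    left, right = m >> half_bits, m & mask
    for r in reversed(range(rounds)):
        left, right = right ^ _round(key, r, left, half_bits), left
    return (left << half_bits) | right
```

Because a Feistel network is invertible for any round function, feeding it the sequential counter (0, 1, 2, ...) yields non-consecutive values with no collisions, and the only state to keep is the counter itself.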

While this could be implemented externally, it needs to store and atomically access state (the permutation position or last ID), so an external implementation would be grossly inefficient: it is the equivalent of running a subsequent UPDATE table SET crypto_id = FN_CRYPTO(autoincrement_id) WHERE autoincrement_id = LAST_INSERT_ID() for every INSERT.

Do you know of any such implementation as described above in a database in commercial use?

Should you use auto increment in SQL?

Auto-increment should be used as a unique key when no natural unique key already exists for the items you are modelling. For example, for Elements you could use the atomic number, and for Books the ISBN.

Is auto increment primary key good?

The advantages to using numeric, auto incremented primary keys are numerous, but the most impactful benefits are faster speed when performing queries and data-independence when searching through thousands of records which might contain frequently altered data elsewhere in the table.

What is the purpose of Auto_increment?

Auto-increment allows a unique number to be generated automatically when a new record is inserted into a table. Often this is the primary key field that we would like to be created automatically every time a new record is inserted.

What is the difference between sequence and auto increment in SQL?

In SQL Server, you mark a column as an auto-increment column and SQL Server automatically generates new values for the column when you insert a new row. In Oracle, you create a sequence to generate new values for a column in your table, but there is no direct link between the sequence and the table or column.

While this could be implemented externally, this does need to store and atomically access state (the permutation position or last id), which means implementing externally would be grossly inefficient (it's the equivalent of running a subsequent
 UPDATE table SET crypto_id = FN_CRYPTO(autoincrement_id)
 WHERE autoincrement_id=LAST_INSERT_ID()
for every INSERT).

You could use a generated (computed) column to avoid running the proposed UPDATE for every insert:

-- pseudocode
CREATE TABLE tab(
   autoincrement_id INT AUTO_INCREMENT,
   crypto_id <type> GENERATED ALWAYS AS (FN_CRYPTO(autoincrement_id)) STORED
);

-- SQL Server example; the SHA function is only a placeholder and should be replaced
CREATE TABLE tab(
   autoincrement_id INT IDENTITY(1,1),
   crypto_id AS (HASHBYTES('SHA2_256', CAST(autoincrement_id AS NVARCHAR(MAX)))) PERSISTED
);

db<>fiddle demo

More info:

SQL Server computed columns
Computed / calculated / virtual / derived columns in PostgreSQL
Column Depending on other column

EDIT by Dinu

If you use SHA, don't forget to concatenate a secret salt to the autoincrement_id; alternatively, you could use e.g. AES-128 to encrypt the autoincrement_id with a secret key and IV.
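As a rough illustration of the salted-hash variant, the sketch below derives an ID from the counter with a secret. Note that it swaps plain concatenation for HMAC-SHA256, which is the standard construction for mixing a secret into a hash; the function name crypto_id and the 8-byte big-endian encoding are assumptions, not part of the answer above.

```python
import hashlib
import hmac


def crypto_id(autoincrement_id: int, secret: bytes) -> str:
    """Derive a hex ID from the sequential ID, keyed by a secret."""
    # Fixed-width, explicit byte order so every caller hashes identical bytes.
    msg = autoincrement_id.to_bytes(8, "big")
    return hmac.new(secret, msg, hashlib.sha256).hexdigest()
```

Unlike encryption, the result cannot be mapped back to the autoincrement_id, and if the digest is truncated to fit a narrower column, collisions become possible, so a unique constraint on crypto_id is still worth having.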

Also worth noting: any DB user with access to the table DDL will have access to your secret salt/key/IV. If this is of concern to you, you can use a parameterized stored procedure, e.g. FN_CRYPTO(id, key, iv), instead and send the key and IV along with every insert.

To retrieve the crypto_id on the app side without needing a subsequent query, you would need to replicate the encryption function app-side and run it on the returned autoincrement_id. Note: if you pass autoincrement_id to AES-128 as a byte array, be very careful about endianness, as it may differ between the DB and the app. The only alternative is to use the OUTPUT syntax of mssql, but that is specific to mssql and requires calling the ExecuteScalar API instead of ExecuteNonQuery.
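For the SQL Server example above, an app-side replica might look like the following Python sketch. It assumes that CAST(autoincrement_id AS NVARCHAR(MAX)) yields the decimal digits of the ID and that HASHBYTES hashes the UTF-16LE bytes of that string; both function names are hypothetical, and the result should be checked against a value computed by your own server before relying on it.

```python
import hashlib


def sqlserver_sha256_of_int(autoincrement_id: int) -> bytes:
    """App-side stand-in for HASHBYTES('SHA2_256', CAST(id AS NVARCHAR(MAX)))."""
    text = str(autoincrement_id)  # decimal digits, as the CAST is assumed to produce
    # NVARCHAR data is stored as UTF-16 little-endian, 2 bytes per digit here.
    return hashlib.sha256(text.encode("utf-16-le")).digest()


def id_to_bytes(autoincrement_id: int, byteorder: str) -> bytes:
    """Encode a 32-bit ID with an explicit byte order ('big' or 'little')."""
    return autoincrement_id.to_bytes(4, byteorder)
```

The second helper shows the endianness pitfall: id_to_bytes(1, "big") and id_to_bytes(1, "little") are different byte strings, so they would encrypt or hash to different values unless both sides agree on the order.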

Tags:

sql

database

cryptography

Dinu

1 Answers

Lukasz Szozda
