Let's imagine that we're building GitHub and we have two tables: repos and issues. Every GitHub repo has a collection of issues, and so the issues table has a foreign key of repo_id.
Now, when you're browsing a GitHub repo's issues, you don't want to be exposed to the internal id. Instead, you want something like a number, which increments from 1..n for only that repository. You want your first issue in your new repo to be numbered 1, not whatever the next id for an issue on GitHub is.
Of course, you need a way to increment, and you want to make sure that the number is unique when scoped to the repo. In particular, you want to avoid any race condition where the same number can be generated twice.
What's the most straightforward way of handling this? A trigger? Something else entirely?
I am using PostgreSQL but would prefer approaches that are vanilla SQL where possible, e.g. triggers. If there's a simpler Postgres approach, then that would also be useful.
Any code that demonstrates your approach would be extraordinarily useful. Thanks!
There can be only one AUTO_INCREMENT column per table, it must be indexed, and it cannot have a DEFAULT value. So you can indeed have an AUTO_INCREMENT column in a table that is not the primary key.
If you're looking to add auto increment to an existing table by changing an existing int column to IDENTITY, SQL Server will fight you. You'll have to either add a new column altogether with your new auto-incremented primary key, or drop your old int column and then add a new IDENTITY right after.
To add a new AUTO_INCREMENT integer column named c:
ALTER TABLE t2 ADD c INT UNSIGNED NOT NULL AUTO_INCREMENT, ADD PRIMARY KEY (c);
We indexed c (as a PRIMARY KEY) because AUTO_INCREMENT columns must be indexed, and we declare c as NOT NULL because primary key columns cannot be NULL.
A primary key is by no means required to use the auto_increment property - it just needs to be a unique, not-null, identifier, so the account number would do just fine.
I don't think there is a way to do this without the possibility of a race condition. This should minimize race conditions but not eliminate them. There may be better ways within specific database architectures. Assuming "REPOSITORY_ID" is provided by your application code:
insert into issues (repo_id, line_id) values (
    REPOSITORY_ID,
    coalesce((select max(line_id) + 1 from issues where repo_id = REPOSITORY_ID), 0)
);
This pulls the current highest line_id and increments it at the time of the insert. If there are no records, it defaults to 0. There is a small chance of a race if two inserts hit at the exact same time, but it seems unlikely. If you enforce uniqueness you can check for an error on insert and retry on failure.
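To make the "enforce uniqueness and retry" idea concrete, here is a minimal PostgreSQL sketch. The table layout, the function name add_issue_with_retry, and the cap of 5 attempts are all assumptions made for illustration; only repo_id and line_id come from the query above. It also assumes the default READ COMMITTED isolation level, so that each retry sees rows committed by the competing insert.

```sql
-- Hypothetical schema: the UNIQUE constraint is what turns a race
-- into a detectable error instead of a silent duplicate.
CREATE TABLE issues (
    id      bigserial PRIMARY KEY,
    repo_id bigint  NOT NULL,
    line_id integer NOT NULL,
    UNIQUE (repo_id, line_id)
);

-- Hypothetical helper: insert with max(line_id) + 1 and retry on conflict.
CREATE FUNCTION add_issue_with_retry(p_repo_id bigint)
RETURNS integer
LANGUAGE plpgsql
AS $$
DECLARE
    v_line_id integer;
BEGIN
    -- The retry cap (5) is arbitrary; it only prevents looping forever.
    FOR v_attempt IN 1..5 LOOP
        BEGIN
            INSERT INTO issues (repo_id, line_id)
            VALUES (
                p_repo_id,
                -- Same expression as the answer: first issue gets 0.
                coalesce((SELECT max(line_id) + 1
                          FROM issues
                          WHERE repo_id = p_repo_id), 0)
            )
            RETURNING line_id INTO v_line_id;

            RETURN v_line_id;
        EXCEPTION WHEN unique_violation THEN
            -- Another transaction committed the same number first.
            -- Fall through to the next loop iteration and recompute.
            NULL;
        END;
    END LOOP;

    RAISE EXCEPTION 'could not allocate an issue number for repo %', p_repo_id;
END;
$$;

-- Usage
SELECT add_issue_with_retry(42);
```

The retry can just as well live in application code: catch the unique-violation error (SQLSTATE 23505) and run the insert again.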
Suppose you want to add a new issue to a certain repo. You could execute the following operations in a single transaction:
1. Lock the repo you want to modify with a SELECT ... FOR UPDATE. This puts a row-level lock on it and prevents other transactions that want to add a new issue for that repo from proceeding concurrently;
2. Find the latest issue number for that repo in some way (for instance you could have a latest_issue column in repo, as in one of the answers, or you could perform a query to find it);
3. Insert the new issue with the correct issue number;
4. Commit the transaction, which releases the lock and allows other transactions waiting on that repo to continue.
So you could define a stored procedure in this way, and call it every time you want to insert a new issue. Under the hypothesis that there are not too many concurrent transactions trying to insert new issues for the same repository, this would prevent race conditions and still operate with a reasonable efficiency.
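The locking approach described above could be sketched in PL/pgSQL roughly as follows. The schema, the column names (number, title) and the function name add_issue_locked are assumptions for illustration, not part of the original answer. The sketch uses the "perform a query to find it" variant rather than a latest_issue column, and assumes the default READ COMMITTED isolation level.

```sql
-- Hypothetical schema for the sketch.
CREATE TABLE repos (
    id   bigserial PRIMARY KEY,
    name text NOT NULL
);

CREATE TABLE issues (
    id      bigserial PRIMARY KEY,
    repo_id bigint  NOT NULL REFERENCES repos (id),
    number  integer NOT NULL,
    title   text    NOT NULL,
    -- Safety net: even if some code path skips the lock,
    -- duplicates are rejected rather than stored.
    UNIQUE (repo_id, number)
);

CREATE FUNCTION add_issue_locked(p_repo_id bigint, p_title text)
RETURNS integer
LANGUAGE plpgsql
AS $$
DECLARE
    v_number integer;
BEGIN
    -- Lock the repo row. Concurrent callers for the same repo
    -- block here until this transaction commits or rolls back.
    PERFORM 1 FROM repos WHERE id = p_repo_id FOR UPDATE;
    IF NOT FOUND THEN
        RAISE EXCEPTION 'repo % does not exist', p_repo_id;
    END IF;

    -- Find the next number for this repo (1 for the first issue).
    SELECT coalesce(max(number), 0) + 1
    INTO v_number
    FROM issues
    WHERE repo_id = p_repo_id;

    -- Insert the issue with that number.
    INSERT INTO issues (repo_id, number, title)
    VALUES (p_repo_id, v_number, p_title);

    -- The lock is released when the calling transaction ends.
    RETURN v_number;
END;
$$;

-- Usage
INSERT INTO repos (name) VALUES ('demo');
SELECT add_issue_locked(1, 'First issue');
```

Because the lock is per repo row, inserts into different repos do not wait on each other; only issues created at the same moment in the same repo are serialized.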