
How can I auto-generate complex identifiers and ensure uniqueness?

Tags:

sql

I've been handed a very specific business requirement, and I'm having trouble translating it to work within our database.

We have multiple departments, who all deal with what we call files. Each file has a unique identifier, but how this identifier is assigned depends on the department that the file is associated with.

Departments ABC, DEF and GHI need the system to provide them with the next identifier following a pattern... while departments JKL and MNO need to be able to specify their own identifiers.

This is further compounded by the fact that the generated file numbers are semi-complex. Their file numbers follow the pattern:

[a-z]{3,4}\d{2}-\d{4} 

(A 3- or 4-letter prefix that corresponds to the department, followed by two digits for the year, a dash, and a four-digit generated number, e.g. ABC13-0001.)
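As a rough illustration of that format, here is a minimal Python sketch that checks and builds such file numbers. The names `is_generated_format` and `format_file_number` are hypothetical, and the match is assumed to be case-insensitive because the example is upper-case while the pattern is written with `[a-z]`.

```python
import re

# Assumption: case-insensitive match, since the example (ABC13-0001) is
# upper-case while the pattern in the question is written with [a-z].
FILE_NUMBER_RE = re.compile(r"[a-z]{3,4}\d{2}-\d{4}", re.IGNORECASE)


def is_generated_format(file_number: str) -> bool:
    """Return True if file_number has the generated shape, e.g. ABC13-0001."""
    return FILE_NUMBER_RE.fullmatch(file_number) is not None


def format_file_number(prefix: str, year: int, seq: int) -> str:
    """Build prefix + two-digit year + dash + zero-padded four-digit sequence."""
    if not 1 <= seq <= 9999:
        raise ValueError("sequence must fit in four digits")
    return f"{prefix}{year % 100:02d}-{seq:04d}"
```

The range check makes the 9999-per-year ceiling explicit, which is the point where a four-digit suffix would otherwise silently overflow.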

The part before the dash is easy to generate... the file table has a mandatory foreign key reference to a department table that has the prefix column, and I can grab the year just as easily. It's the part after the dash that I can't seem to work out.

It's a four digit identifier. Each department needs the next generated ID to be as sequential to their department as possible (I say as sequential as possible, as I'm aware that even identity specifications can leave gaps). On top of that, we need to reset it back to 0001 each year. All of that, while ensuring that there are no duplicates, which is the most important part of all of this.

So, we only have one file table that is used by all of these departments. As such, to be able to handle JKL and MNO, I have the FileNumber field set to varchar(12) with a unique constraint. They can type in whatever they need, so long as it's unique. That just leaves me with how to generate the unique file numbers for the other departments.

My first instinct is to give the file table a surrogate identity primary key so that, even if something goes wrong with the generated file number, each record is guaranteed a unique identifier.

Then I would create a single row table that has two columns per department, one for a number and one for a date. The number would be the last used number for the 4 digit identifier suffix for a given department (as an int), and the date would be when it was assigned. Storing the date would let me check if the year has changed since the last id was pulled so that I can assign 1, instead of lastid + 1.

Then an insert trigger on the file table would generate the file id using:

  • The foreign key to the department table to get the department prefix
  • CONVERT(VARCHAR, YEAR( GETDATE() ) % 100) to pull the current two digit year
  • A select to the above-described utility table to get the last id + 1 (or resetting to 1 if the current year differs from the year of the last updated date).

The trigger would finally update the utility table with the last used id and date.
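The counter logic described above can be sketched outside the database to make the year reset concrete. This is only an illustration in Python with an in-memory SQLite database, not the SQL Server trigger itself: the `file_counter` table, its columns, and `next_suffix` are hypothetical names, it keeps one row per department rather than one wide row, and it stores the year instead of a full date. It also ignores concurrent writers, which a real trigger would have to handle.

```python
import sqlite3


def next_suffix(conn: sqlite3.Connection, dept: str, year: int) -> int:
    """Return the next suffix for dept, restarting at 1 when the year changes."""
    with conn:  # commits on success, rolls back if an exception is raised
        row = conn.execute(
            "SELECT last_number, last_year FROM file_counter WHERE dept = ?",
            (dept,),
        ).fetchone()
        if row is None or row[1] != year:
            number = 1  # first file for this department, or a new year
        else:
            number = row[0] + 1
        # Record the number just handed out, together with its year
        conn.execute(
            "INSERT OR REPLACE INTO file_counter (dept, last_number, last_year) "
            "VALUES (?, ?, ?)",
            (dept, number, year),
        )
    return number


# Hypothetical utility table: one row per department
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE file_counter ("
    " dept TEXT PRIMARY KEY,"
    " last_number INTEGER NOT NULL,"
    " last_year INTEGER NOT NULL)"
)
```

Calling `next_suffix(conn, "ABC", 2013)` twice yields 1 and then 2, and the first call with 2014 yields 1 again, because the stored year no longer matches.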

In theory, I think that would work... and the unique constraint on the file id column would prevent an insert where the generated file id already exists. But it feels so brittle, and I can foresee that unique constraint being a double-edged sword that could prevent a department from creating a new file should the trigger fail to update the utility table. After all, if it doesn't, then the next time a file number is generated, it would try to use the same generated id, and fail. There must be a better way.

My other thought was to have a table per department with just an identity integer column and non-null date field with a default of (getdate())... and have the trigger on the file table insert a new row, and use that id. It would also be responsible for deleting all rows in the given department id table and reset the identity come the new year. It feels more secure to me, but then I have 5 utility tables (1 per department that auto-generate ids) with up to 9999 records, just to generate an id.

Am I missing something simple? Am I going about this the right way? I just can't help but think that there must be an easier way. Am I right to try to make the SQL server responsible for this, or should I try doing this in the desktop application that I will build on top of this database?

Chronicide asked Nov 12 '22 19:11


1 Answer

So, I think it's possible you're overcomplicating this. It's also possible I'm misunderstanding the requirements. I don't believe you need separate tables or anything, you just need to check the existing IDs and increment them. SQLFiddle doesn't seem to be working right now, but here's a method I came up with in my local DB:

create table dept_ids (
  id varchar(12) primary key
  )

insert into dept_ids values('ABC13-0001')
insert into dept_ids values('DEF13-0001')
insert into dept_ids values('GHI13-0001')
insert into dept_ids values('JKL13-0001')
insert into dept_ids values('ABC13-0002')

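declare 
    @dept varchar(4)
    , @year varchar(2)
    , @prefix varchar(8)
    , @new_seq int

-- Build the prefix for the requested department and the current year, e.g. 'ABC13-' in 2013
set @dept = 'ABC'
select @year = right(cast(DATEPART(yy, getdate()) as varchar(4)), 2)
set @prefix = @dept + @year + '-'

-- Highest suffix already used for this prefix, plus one; ISNULL yields 0 + 1 = 1
-- when no row matches, which also restarts the sequence in a new year
select @new_seq = isnull(cast(right(max(id), 4) as int), 0) + 1 from dept_ids where id like @prefix + '%'

-- Left-pad the sequence to four digits and append it to the prefix
select new_id = @prefix + right(replicate('0', 4) + cast(@new_seq as varchar(4)), 4)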

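This handles the different departments because the LIKE filter only matches IDs that start with that department's prefix. It handles the year scenario because the prefix includes the current year from getdate(): the first lookup in a new year finds no matching rows, so the ISNULL check starts the sequence at 1. For the departments that can enter their own sequence values, you can just add a null check for that parameter and skip generating this value if it's present.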
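One thing the query alone does not cover is two sessions computing the same next value at the same moment. The primary key would reject the second insert, so the caller needs to recompute and try again. Here is a minimal sketch of that retry loop, written in Python against an in-memory SQLite database rather than SQL Server. `insert_with_generated_id` and `max_attempts` are hypothetical names, and the sample rows mirror the ones above.

```python
import sqlite3


def insert_with_generated_id(conn, dept, year, max_attempts=5):
    """Insert a row with the next free id for dept/year and return that id."""
    prefix = f"{dept}{year % 100:02d}-"
    for _ in range(max_attempts):
        # Same idea as the MAX(id) lookup: highest id with this prefix, if any
        row = conn.execute(
            "SELECT MAX(id) FROM dept_ids WHERE id LIKE ? || '%'", (prefix,)
        ).fetchone()
        seq = int(row[0][-4:]) + 1 if row[0] is not None else 1
        new_id = f"{prefix}{seq:04d}"
        try:
            with conn:  # commit the insert, or roll back on failure
                conn.execute("INSERT INTO dept_ids (id) VALUES (?)", (new_id,))
            return new_id
        except sqlite3.IntegrityError:
            continue  # the id was taken in the meantime; recompute and retry
    raise RuntimeError("could not allocate a file number")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dept_ids (id TEXT PRIMARY KEY)")
conn.executemany(
    "INSERT INTO dept_ids (id) VALUES (?)",
    [("ABC13-0001",), ("DEF13-0001",), ("GHI13-0001",),
     ("JKL13-0001",), ("ABC13-0002",)],
)
conn.commit()
```

With this approach the unique constraint acts as the final arbiter of uniqueness, and a collision costs one extra round trip instead of a failed file creation.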

If I'm oversimplifying, feel free to let me know and I'll adjust.

Mac answered Dec 01 '22 00:12