We are building a web application in Java on top of PostgreSQL. It's quite large and successful, and it should be able to run for a few more years at least.
Unfortunately, we (well, I) made a grave mistake in the early stages of the design process: all the database IDs are integers, handed out from one shared sequence.
Java's max int is 2^31-1, so approximately 2.1 billion. The same goes for PostgreSQL's integer type. The system currently eats up ~10k IDs every day, and the rate is going up as we gain new users.
One day, the IDs will run out and overflow.
We are looking for ways to fix the situation. Let's get the obvious one out of the way immediately: switching to Java's long and Postgres' bigint is a clean solution, but it's a ton of work. We need to postpone it as much as possible.
Some ideas we've had so far:
- Compound keys: (customer, timestamp) is a perfectly valid primary key.
- long / bigint with a new sequence.
Under these constraints, what other approaches will delay the ID depletion?
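To make the failure mode concrete: Java ints don't raise an error at the limit, they wrap. A minimal illustration:

```java
public class IdOverflow {
    public static void main(String[] args) {
        // Same ceiling as PostgreSQL's integer type: 2^31 - 1.
        int id = Integer.MAX_VALUE;
        System.out.println(id);      // 2147483647
        // No exception on overflow - the next "ID" silently wraps negative.
        System.out.println(id + 1);  // -2147483648
    }
}
```

On the database side the symptom is different: inserting a value past 2^31-1 into an integer column fails with "integer out of range", so the database would stop you before Java's silent wraparound ever matters.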
Switching to long is far from a clean solution. There's only one sensible option if you grow too large: UUIDs (yes, PostgreSQL comes with a uuid data type).
At 128 bits, a UUID is the size of 4 integers, but you don't want to go through the whole application in a few years and do all this again, do you? UUIDs will keep working when you grow so large that you need to shard your data. You won't be able to have a shared sequence then, and that's why UUIDs make sense.
As a bonus, you can even keep the uniqueness property on each row.
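On the Java side this maps directly onto java.util.UUID; random (version 4) UUIDs need no central sequence, which is exactly what makes them shard-friendly. A minimal sketch:

```java
import java.util.UUID;

public class UuidIds {
    public static void main(String[] args) {
        // Each node can mint IDs independently - no shared sequence needed.
        UUID id = UUID.randomUUID();
        System.out.println(id);           // e.g. 0f8fad5b-d9cb-469f-a165-70867728950e
        System.out.println(id.version()); // 4 = randomly generated
    }
}
```

On the PostgreSQL side the matching column type is uuid, and gen_random_uuid() (built into core since PostgreSQL 13, via pgcrypto before that) produces the same kind of value server-side.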
Migration isn't that hard: adding a nullable column in PostgreSQL is cheap, so you can first add the column and then migrate online in batches, updating a few thousand records at a time, so you don't have downtime.
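The batched backfill can be sketched as follows. The table and column names are placeholders and the real UPDATEs would go through JDBC; the part worth getting right is the range arithmetic, so no batch is skipped or processed twice:

```java
import java.util.ArrayList;
import java.util.List;

public class BatchPlanner {
    // Split [1, maxId] into half-open ranges of `size` ids, so that each
    // UPDATE ... WHERE id >= lo AND id < hi touches a bounded number of rows.
    static List<long[]> batches(long maxId, long size) {
        List<long[]> out = new ArrayList<>();
        for (long lo = 1; lo <= maxId; lo += size) {
            out.add(new long[]{lo, Math.min(lo + size, maxId + 1)});
        }
        return out;
    }

    public static void main(String[] args) {
        // Placeholder table/column names; in reality each statement would be
        // executed (and committed) separately to keep transactions short.
        for (long[] b : batches(10_000, 3_000)) {
            System.out.printf(
                "UPDATE my_table SET new_id = old_id WHERE old_id >= %d AND old_id < %d;%n",
                b[0], b[1]);
        }
    }
}
```

Committing after every batch keeps each transaction small, which is what avoids long lock hold times on a busy table.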
Then you can test the same code with both foreign keys. Does Java have anything similar to Laboratory or Scientist?
Will it be a ton of work? Yes, but that's obviously a good sign, if you have an application that's so popular.
I also hope you've learned a lesson about using the same sequence for all the tables. Honestly, I don't really see the added value in that. If you want to know where an object fits, you can also name the primary keys differently (e.g. room_id, reservation_id, etc.).
Since asking this question, I have found a nice way to fix half of the problem - the database side. So, for future generations, here is the way to do it.
Find all the DB columns of type integer or integer[]. Check the results manually and remove false positives - the 'ARRAY' data type matches every array column, so, for example, text[] columns will show up too.
SELECT *
FROM information_schema.columns cls
JOIN information_schema.tables tbl
  ON cls.table_schema = tbl.table_schema
 AND cls.table_name = tbl.table_name
WHERE
    cls.table_schema = '<my schema>'
    AND (cls.data_type = 'integer' OR cls.data_type = 'ARRAY')
    AND tbl.table_type = 'BASE TABLE';
Prepare a data type change DDL for each of those columns:
ALTER TABLE <one of the tables found>
ALTER COLUMN <one of its integral columns>
TYPE bigint;
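Since the query above returns many columns, it saves typing to generate the statements mechanically. A tiny sketch - the table and column names here are illustrative, and real identifiers coming out of information_schema should be quoted or validated before being pasted into DDL:

```java
public class AlterDdl {
    // Build one ALTER statement per (table, column) pair found by the
    // information_schema query; 'ARRAY' columns widen to bigint[] instead.
    static String widen(String table, String column, boolean isArray) {
        String type = isArray ? "bigint[]" : "bigint";
        return "ALTER TABLE " + table + " ALTER COLUMN " + column + " TYPE " + type + ";";
    }

    public static void main(String[] args) {
        System.out.println(widen("reservation", "room_id", false));
        System.out.println(widen("room", "tag_ids", true));
    }
}
```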
This works beautifully, except for the VIEWs: they don't like me changing their return types, so I need to re-create all of them. The sequence is: drop the views, run the ALTERs, re-create the views (see "How to export/backup VIEWs only?" for getting their definitions out), then VACUUM and ANALYZE all the tables. One caveat: bigint[] cannot be cast to integer[], so the array columns are a one-way change.