PostgreSQL gapless sequences

Tags: postgresql, ruby-on-rails, sequence

I'm moving from MySQL to Postgres, and I noticed that when you delete rows in MySQL, the unique ids of those rows are re-used when you create new ones. With Postgres, if you create rows and then delete them, the unique ids are not used again.

Is there a reason for this behaviour in Postgres? Can I make it act more like MySQL in this case?

846

asked Apr 02 '12 21:04

fatfrog

1 Answer

Sequences have gaps to permit concurrent inserts. Attempting to avoid gaps or to re-use deleted IDs creates horrible performance problems. See the PostgreSQL wiki FAQ.

PostgreSQL SEQUENCEs are used to allocate IDs. These only ever increase, and they're exempt from the usual transaction rollback rules so that multiple transactions can grab new IDs at the same time. This means that if a transaction rolls back, the IDs it obtained are "thrown away"; there's no list of "free" IDs kept, just the current ID counter. Sequences also usually skip ahead if the database shuts down uncleanly.
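A minimal sketch of the rollback behaviour, assuming a hypothetical scratch table named seq_demo that no other session touches: the rolled-back INSERT still consumes a sequence value, so the next committed row skips it.

```sql
-- Hypothetical scratch table; "serial" creates and attaches a sequence.
CREATE TABLE seq_demo (id serial PRIMARY KEY, note text);

BEGIN;
INSERT INTO seq_demo (note) VALUES ('rolled back');  -- draws id 1 from the sequence
ROLLBACK;                                            -- the row is gone; the id is not handed back

INSERT INTO seq_demo (note) VALUES ('committed');

-- On a fresh table this returns a single row with id 2, not 1.
SELECT id, note FROM seq_demo;
```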

Synthetic keys (IDs) are meaningless anyway. Their order is not significant; their only significant property is uniqueness. You can't meaningfully measure how "far apart" two IDs are, nor can you meaningfully say if one is greater or less than another. All you can do is say "equal" or "not equal". Anything else is unsafe. You shouldn't care about gaps.

If you need a gapless sequence that re-uses deleted IDs, you can have one, but you have to give up a huge amount of performance for it. In particular, you cannot have any concurrency on INSERTs at all, because you have to scan the table for the lowest free ID while holding a table lock that blocks writes, so that no other transaction can claim the same ID. Try searching for "postgresql gapless sequence".
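As a rough sketch of that lock-and-scan approach, assuming a table dummy(id integer, blah integer) and an arbitrary payload value of 42 (both are illustrative assumptions, and this is not something I'd recommend running on a busy table):

```sql
BEGIN;

-- EXCLUSIVE mode blocks every other writer (plain reads still work)
-- until COMMIT or ROLLBACK, which is the concurrency cost described above.
LOCK TABLE dummy IN EXCLUSIVE MODE;

INSERT INTO dummy (id, blah)
SELECT c.candidate, 42
FROM (
    SELECT 1 AS candidate          -- id 1 may itself be free
    UNION
    SELECT id + 1 FROM dummy       -- the slot right after each existing id
) AS c
WHERE NOT EXISTS (SELECT 1 FROM dummy d WHERE d.id = c.candidate)
ORDER BY c.candidate               -- lowest free id first
LIMIT 1;

COMMIT;
```

The candidate list always contains at least one unused value (either 1 or the current maximum plus one), so the INSERT always finds an id; the cost is that every insert scans the table and serializes behind the lock.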

The simplest approach is to use a counter table and a function that gets the next ID. Here's a generalized version that uses a counter table to generate consecutive gapless IDs; it doesn't re-use IDs, though.

    CREATE TABLE thetable_id_counter ( last_id integer NOT NULL );
    INSERT INTO thetable_id_counter VALUES (0);

    CREATE OR REPLACE FUNCTION get_next_id(countertable regclass, countercolumn text)
    RETURNS integer AS $$
    DECLARE
        next_value integer;
    BEGIN
        -- The UPDATE row-locks the counter row until the calling transaction ends.
        EXECUTE format('UPDATE %s SET %I = %I + 1 RETURNING %I',
                       countertable, countercolumn, countercolumn, countercolumn)
            INTO next_value;
        RETURN next_value;
    END;
    $$ LANGUAGE plpgsql;

    COMMENT ON FUNCTION get_next_id(countertable regclass, countercolumn text)
        IS 'Increment and return value from integer column $2 in table $1';

Usage:

    INSERT INTO dummy(id, blah)
    VALUES ( get_next_id('thetable_id_counter','last_id'), 42 );

Note that once one open transaction has obtained an ID, all other transactions that call get_next_id will block until the first transaction commits or rolls back. This is unavoidable for gapless IDs and is by design.
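To make the blocking concrete, here is a sketch of two psql sessions, assuming the counter table and get_next_id function defined above and the default READ COMMITTED isolation level:

```sql
-- Session A
BEGIN;
SELECT get_next_id('thetable_id_counter', 'last_id');  -- returns immediately
-- ... transaction stays open, holding the row lock on the counter row ...

-- Session B (a second connection)
BEGIN;
SELECT get_next_id('thetable_id_counter', 'last_id');  -- waits here

-- Session A
COMMIT;  -- session B's call now returns the following value

-- Had session A issued ROLLBACK instead, session B would have received
-- the same value session A was given, so no gap would appear.
```

This is why transactions that call the function should be kept as short as possible.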

If you want to store multiple counters for different purposes in a table, just add a parameter to the above function, add a column to the counter table, and add a WHERE clause to the UPDATE that matches the parameter to the added column. That way you can have multiple independently-locked counter rows. Do not just add extra columns for new counters.
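A minimal sketch of the multi-counter variant. The table name named_counters, the column counter_name, and the function get_next_named_id are hypothetical, and for brevity this version hard-codes the table instead of taking regclass/text parameters like the function above:

```sql
-- One row per counter; each row is locked independently.
CREATE TABLE named_counters (
    counter_name text PRIMARY KEY,
    last_id      integer NOT NULL DEFAULT 0
);
INSERT INTO named_counters (counter_name) VALUES ('invoices'), ('orders');

CREATE OR REPLACE FUNCTION get_next_named_id(p_counter_name text)
RETURNS integer AS $$
DECLARE
    next_value integer;
BEGIN
    -- The WHERE clause restricts the update (and its row lock) to one counter.
    UPDATE named_counters
       SET last_id = last_id + 1
     WHERE counter_name = p_counter_name
    RETURNING last_id INTO next_value;

    IF NOT FOUND THEN
        RAISE EXCEPTION 'no counter named %', p_counter_name;
    END IF;

    RETURN next_value;
END;
$$ LANGUAGE plpgsql;

SELECT get_next_named_id('invoices');
```

A transaction holding the 'invoices' row does not block one asking for 'orders'. With extra columns in a single row instead, every counter would share one row lock.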

This function does not re-use deleted IDs, it just avoids introducing gaps.

To re-use IDs I advise ... not re-using IDs.

If you really must, you can do so by adding an AFTER INSERT OR UPDATE OR DELETE trigger on the table of interest that adds deleted IDs to a free-list side table and removes them from the free-list table when they're INSERTed. Treat an UPDATE as a DELETE followed by an INSERT. Now modify the ID generation function above so that it does a SELECT free_id INTO next_value FROM free_ids FOR UPDATE LIMIT 1 and, if a row is found, DELETEs that row. If no free ID is found, it gets a new ID from the generator table as normal. Here's an untested extension of the prior function to support re-use:

    CREATE OR REPLACE FUNCTION get_next_id_reuse(countertable regclass, countercolumn text,
                                                 freelisttable regclass, freelistcolumn text)
    RETURNS integer AS $$
    DECLARE
        next_value integer;
    BEGIN
        -- Try to claim (and lock) any one ID from the free list first.
        EXECUTE format('SELECT %I FROM %s LIMIT 1 FOR UPDATE', freelistcolumn, freelisttable)
            INTO next_value;
        IF next_value IS NOT NULL THEN
            -- Found a free ID: remove it from the free list so nobody else takes it.
            EXECUTE format('DELETE FROM %s WHERE %I = %L', freelisttable, freelistcolumn, next_value);
        ELSE
            -- Free list is empty: fall back to the counter table as before.
            EXECUTE format('UPDATE %s SET %I = %I + 1 RETURNING %I',
                           countertable, countercolumn, countercolumn, countercolumn)
                INTO next_value;
        END IF;
        RETURN next_value;
    END;
    $$ LANGUAGE plpgsql;
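The free-list trigger described above might look roughly like this. It is as untested as the function, and it assumes the table of interest is dummy with an integer id column and a side table free_ids(free_id); the trigger and function names are made up for illustration:

```sql
-- Hypothetical free-list side table.
CREATE TABLE free_ids (free_id integer PRIMARY KEY);

CREATE OR REPLACE FUNCTION dummy_track_free_ids() RETURNS trigger AS $$
BEGIN
    -- An UPDATE that leaves the id unchanged frees nothing and claims nothing.
    IF TG_OP = 'UPDATE' THEN
        IF NEW.id = OLD.id THEN
            RETURN NULL;
        END IF;
    END IF;

    -- DELETE, or the "delete half" of an id-changing UPDATE: the old id becomes free.
    IF TG_OP IN ('DELETE', 'UPDATE') THEN
        INSERT INTO free_ids (free_id) VALUES (OLD.id);
    END IF;

    -- INSERT, or the "insert half" of an UPDATE: the new id is no longer free.
    IF TG_OP IN ('INSERT', 'UPDATE') THEN
        DELETE FROM free_ids WHERE free_id = NEW.id;
    END IF;

    RETURN NULL;  -- the return value is ignored for AFTER row triggers
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER dummy_free_ids
AFTER INSERT OR UPDATE OR DELETE ON dummy
FOR EACH ROW EXECUTE PROCEDURE dummy_track_free_ids();

-- Usage together with the function above:
INSERT INTO dummy(id, blah)
VALUES ( get_next_id_reuse('thetable_id_counter', 'last_id', 'free_ids', 'free_id'), 42 );
```

Since get_next_id_reuse already deletes the ID it hands out, the DELETE in the trigger mostly matters for rows inserted with an explicitly chosen id.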

176

answered Sep 19 '22 07:09

Craig Ringer
