Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

PostgreSQL gapless sequences

I'm moving from MySql to Postgres, and I noticed that when you delete rows from MySql, the unique ids for those rows are re-used when you make new ones. With Postgres, if you create rows, and delete them, the unique ids are not used again.

Is there a reason for this behaviour in Postgres? Can I make it act more like MySql in this case?

like image 846
fatfrog Avatar asked Apr 02 '12 21:04

fatfrog


People also ask

What is Nextval in Postgres?

NEXTVAL is a function to get the next value from a sequence. Sequence is an object which returns ever-increasing numbers, different for each call, regardless of transactions etc. Each time you call NEXTVAL , you get a different number. This is mainly used to generate surrogate primary keys for you tables.

How do I use Currval in PostgreSQL?

currval(' sequence_name ') Returns the most recently returned value from nextval(' sequence_name '). This value is associated with a PostgreSQL session, and if the nextval() function has not yet been called in the connected session on the specified sequence sequence_name , there will be no "current" value returned.

What is the max value of sequence in Postgres?

The defaults are 1 and -2147483647 for ascending and descending sequences, respectively. Use the optional clause MAXVALUE maxvalue to determine the maximum value for the sequence. The defaults are 2147483647 and -1 for ascending and descending sequences, respectively.

How does sequence work in PostgreSQL?

A sequence in PostgreSQL is a user-defined schema-bound object that yields a sequence of integers based on a specified specification. The CREATE SEQUENCE statement is used to create sequences in PostgreSQL.


1 Answers

Sequences have gaps to permit concurrent inserts. Attempting to avoid gaps or to re-use deleted IDs creates horrible performance problems. See the PostgreSQL wiki FAQ.

PostgreSQL SEQUENCEs are used to allocate IDs. These only ever increase, and they're exempt from the usual transaction rollback rules to permit multiple transactions to grab new IDs at the same time. This means that if a transaction rolls back, those IDs are "thrown away"; there's no list of "free" IDs kept, just the current ID counter. Sequences are also usually incremented if the database shuts down uncleanly.

Synthetic keys (IDs) are meaningless anyway. Their order is not significant, their only property of significance is uniqueness. You can't meaningfully measure how "far apart" two IDs are, nor can you meaningfully say if one is greater or less than another. All you can do is say "equal" or "not equal". Anything else is unsafe. You shouldn't care about gaps.

If you need a gapless sequence that re-uses deleted IDs, you can have one, you just have to give up a huge amount of performance for it - in particular, you cannot have any concurrency on INSERTs at all, because you have to scan the table for the lowest free ID, locking the table for write so no other transaction can claim the same ID. Try searching for "postgresql gapless sequence".

The simplest approach is to use a counter table and a function that gets the next ID. Here's a generalized version that uses a counter table to generate consecutive gapless IDs; it doesn't re-use IDs, though.

CREATE TABLE thetable_id_counter ( last_id integer not null ); INSERT INTO thetable_id_counter VALUES (0);  CREATE OR REPLACE FUNCTION get_next_id(countertable regclass, countercolumn text) RETURNS integer AS $$ DECLARE     next_value integer; BEGIN     EXECUTE format('UPDATE %s SET %I = %I + 1 RETURNING %I', countertable, countercolumn, countercolumn, countercolumn) INTO next_value;     RETURN next_value; END; $$ LANGUAGE plpgsql;  COMMENT ON get_next_id(countername regclass) IS 'Increment and return value from integer column $2 in table $1'; 

Usage:

INSERT INTO dummy(id, blah)  VALUES ( get_next_id('thetable_id_counter','last_id'), 42 ); 

Note that when one open transaction has obtained an ID, all other transactions that try to call get_next_id will block until the 1st transaction commits or rolls back. This is unavoidable and for gapless IDs and is by design.

If you want to store multiple counters for different purposes in a table, just add a parameter to the above function, add a column to the counter table, and add a WHERE clause to the UPDATE that matches the parameter to the added column. That way you can have multiple independently-locked counter rows. Do not just add extra columns for new counters.

This function does not re-use deleted IDs, it just avoids introducing gaps.

To re-use IDs I advise ... not re-using IDs.

If you really must, you can do so by adding an ON INSERT OR UPDATE OR DELETE trigger on the table of interest that adds deleted IDs to a free-list side table, and removes them from the free-list table when they're INSERTed. Treat an UPDATE as a DELETE followed by an INSERT. Now modify the ID generation function above so that it does a SELECT free_id INTO next_value FROM free_ids FOR UPDATE LIMIT 1 and if found, DELETEs that row. IF NOT FOUND gets a new ID from the generator table as normal. Here's an untested extension of the prior function to support re-use:

CREATE OR REPLACE FUNCTION get_next_id_reuse(countertable regclass, countercolumn text, freelisttable regclass, freelistcolumn text) RETURNS integer AS $$ DECLARE     next_value integer; BEGIN     EXECUTE format('SELECT %I FROM %s FOR UPDATE LIMIT 1', freelistcolumn, freelisttable) INTO next_value;     IF next_value IS NOT NULL THEN         EXECUTE format('DELETE FROM %s WHERE %I = %L', freelisttable, freelistcolumn, next_value);     ELSE         EXECUTE format('UPDATE %s SET %I = %I + 1 RETURNING %I', countertable, countercolumn, countercolumn, countercolumn) INTO next_value;     END IF;     RETURN next_value; END; $$ LANGUAGE plpgsql; 
like image 176
Craig Ringer Avatar answered Sep 19 '22 07:09

Craig Ringer