Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why are sequences not updated when COPY is performed in PostgreSQL?

I'm inserting bulk records using COPY statement in PostgreSQL. What I realize is, the sequence IDs are not getting updated and when I try to insert a record later, it throws duplicate sequence ID. Should I manually update the sequence number to get the number of records after performing COPY? Isn't there a solution while performing COPY, just increment the sequence variable, that is, the primary key field of the table? Please clarify me on this. Thanks in advance!

For instance, if I insert 200 records, COPY does good and my table shows all the records. When I manually insert a record later, it says duplicate sequence ID error. It very well implies that it didn’t increment the sequence ids during COPYing as work fine during normal INSERTing. Instead of instructing the sequence id to set the max number of records, won’t there be any mechanism to educate the COPY command to increment the sequence IDs during its bulk COPYing option?

like image 798
siva Avatar asked Feb 01 '12 06:02

siva


People also ask

How do you refresh a sequence in PostgreSQL?

pg_get_serial_sequence can be used to avoid any incorrect assumptions about the sequence name. This resets the sequence in one shot: SELECT pg_catalog. setval(pg_get_serial_sequence('table_name', 'id'), (SELECT MAX(id) FROM table_name)+1);

What does the COPY command in PostgreSQL do?

COPY moves data between PostgreSQL tables and standard file-system files. COPY TO copies the contents of a table to a file, while COPY FROM copies data from a file to a table (appending the data to whatever is in the table already). COPY TO can also copy the results of a SELECT query.

Does Postgres COPY overwrite?

If you COPY data into a table already containing data, the new data will be appended. If you COPY TO a file already containing data, the existing data will be overwritten.

How does sequence work in PostgreSQL?

A sequence in PostgreSQL is a user-defined schema-bound object that yields a sequence of integers based on a specified specification. The CREATE SEQUENCE statement is used to create sequences in PostgreSQL. Now let's analyze the above syntax: First, set the name of the sequence after the CREATE SEQUENCE clause.


2 Answers

You ask:

Should I manually update the sequence number to get the number of records after performing COPY?

Yes, you should, as documented here:

Update the sequence value after a COPY FROM:

| BEGIN;
| COPY distributors FROM 'input_file';
| SELECT setval('serial', max(id)) FROM distributors;
| END;

You write:

it didn’t increment the sequence ids during COPYing as work fine during normal INSERTing

But that is not so! :) When you perform a normal INSERT, typically you do not specify an explicit value for the SEQUENCE-backed primary key. If you did, you would run in to the same problems as you are having now:

postgres=> create table uh_oh (id serial not null primary key, data char(1));
NOTICE:  CREATE TABLE will create implicit sequence "uh_oh_id_seq" for serial column "uh_oh.id"
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "uh_oh_pkey" for table "uh_oh"
CREATE TABLE
postgres=> insert into uh_oh (id, data) values (1, 'x');
INSERT 0 1
postgres=> insert into uh_oh (data) values ('a');
ERROR:  duplicate key value violates unique constraint "uh_oh_pkey"
DETAIL:  Key (id)=(1) already exists.

Your COPY command, of course, is supplying an explicit id value, just like the example INSERT above.

like image 104
pilcrow Avatar answered Oct 30 '22 05:10

pilcrow


I realize that this is a bit old but maybe someone might still be looking for the answer.

As other said COPY works in a similar way as INSERT, so for inserting into a table that has a sequence, you simply don't mention the sequence field at all and it is taken care of for you. For COPY it works in the same exact way. But doesn't it COPY require ALL fields in the table to be present in the text file? The correct answer is NO, it doesn't, but it is the default behavior.

To COPY and leave the sequence out do the following:

COPY $YOURSCHEMA.$YOURTABLE(col1,col2,col3,col4) FROM '$your_input_file' DELIMITER ',' CSV HEADER;

No need to manually update the schema afterwards, it works as intended and in my testing is just about as fast.

like image 24
Phobos Avatar answered Oct 30 '22 04:10

Phobos