I'm inserting bulk records using COPY
statement in PostgreSQL. What I realize is, the sequence IDs are not getting updated and when I try to insert a record later, it throws duplicate sequence ID. Should I manually update the sequence number to get the number of records after performing COPY
? Isn't there a solution while performing COPY
, just increment the sequence variable, that is, the primary key field of the table? Please clarify me on this. Thanks in advance!
For instance, if I insert 200 records, COPY
does good and my table shows all the records. When I manually insert a record later, it says duplicate sequence ID error
. It very well implies that it didn’t increment the sequence ids during COPYing as work fine during normal INSERTing. Instead of instructing the sequence id to set the max number of records, won’t there be any mechanism to educate the COPY
command to increment the sequence IDs during its bulk COPYing option?
pg_get_serial_sequence can be used to avoid any incorrect assumptions about the sequence name. This resets the sequence in one shot: SELECT pg_catalog. setval(pg_get_serial_sequence('table_name', 'id'), (SELECT MAX(id) FROM table_name)+1);
COPY moves data between PostgreSQL tables and standard file-system files. COPY TO copies the contents of a table to a file, while COPY FROM copies data from a file to a table (appending the data to whatever is in the table already). COPY TO can also copy the results of a SELECT query.
If you COPY data into a table already containing data, the new data will be appended. If you COPY TO a file already containing data, the existing data will be overwritten.
A sequence in PostgreSQL is a user-defined schema-bound object that yields a sequence of integers based on a specified specification. The CREATE SEQUENCE statement is used to create sequences in PostgreSQL. Now let's analyze the above syntax: First, set the name of the sequence after the CREATE SEQUENCE clause.
You ask:
Should I manually update the sequence number to get the number of records after performing COPY?
Yes, you should, as documented here:
Update the sequence value after a COPY FROM:
| BEGIN; | COPY distributors FROM 'input_file'; | SELECT setval('serial', max(id)) FROM distributors; | END;
You write:
it didn’t increment the sequence ids during COPYing as work fine during normal INSERTing
But that is not so! :) When you perform a normal INSERT, typically you do not specify an explicit value for the SEQUENCE-backed primary key. If you did, you would run in to the same problems as you are having now:
postgres=> create table uh_oh (id serial not null primary key, data char(1));
NOTICE: CREATE TABLE will create implicit sequence "uh_oh_id_seq" for serial column "uh_oh.id"
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "uh_oh_pkey" for table "uh_oh"
CREATE TABLE
postgres=> insert into uh_oh (id, data) values (1, 'x');
INSERT 0 1
postgres=> insert into uh_oh (data) values ('a');
ERROR: duplicate key value violates unique constraint "uh_oh_pkey"
DETAIL: Key (id)=(1) already exists.
Your COPY command, of course, is supplying an explicit id
value, just like the example INSERT above.
I realize that this is a bit old but maybe someone might still be looking for the answer.
As other said COPY works in a similar way as INSERT, so for inserting into a table that has a sequence, you simply don't mention the sequence field at all and it is taken care of for you. For COPY it works in the same exact way. But doesn't it COPY require ALL fields in the table to be present in the text file? The correct answer is NO, it doesn't, but it is the default behavior.
To COPY and leave the sequence out do the following:
COPY $YOURSCHEMA.$YOURTABLE(col1,col2,col3,col4) FROM '$your_input_file' DELIMITER ',' CSV HEADER;
No need to manually update the schema afterwards, it works as intended and in my testing is just about as fast.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With