Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

copy csv postgres ignore rows that violate constraints

I have a .csv file with ~300,000 rows, some of which violate certain constraints I set in my postgres database. Is there a way to copy my .csv file into the database and have postgres filter out the rows that violate the constraints? I do not want these rows to show up in the database.

If this is not possible, is there any other way to solve this problem?

what I'm doing right now is

COPY blocksequences from '/tmp/blocksequences.csv CSV HEADER;

And I get

'ERROR:  new row for relation "blocksequences" violates check constraint "blocksequences_partid3_check"
DETAIL:  Failing row contains (M001-M049-S186, M001, null, M049, S186).
CONTEXT:  COPY blocksequences, line 680: "M001-M049-S186,M001,,M049,S186"

reason for the error: column that contains M049 is not allowed to have that string entered. Many other rows have violations like this.

I read a little about exception when check violation --do nothing am I on the right track here? seems like it's only a mysql thing maybe

like image 956
Beeba Avatar asked Apr 20 '18 19:04

Beeba


1 Answers

Usually this is done in this way:

  • create a temporary table with the same structure as the destination one but without constraints,
  • copy data to the temporary table with COPY command,
  • copy rows that do fulfill constraints from temp table to the destination one, using INSERT command with conditions in the WHERE clause based on the table constraint,
  • drop the temporary table.

When dealing with really large CSV files or very limited server resources, use the extension file_fdw instead of temporary tables. It's much more efficient way but it requires server access to a CSV file (while copying to a temporary table can be done over the network).

In Postgres 12 you can use the WHERE clause in COPY FROM.

like image 196
klin Avatar answered Nov 16 '22 22:11

klin