
Ignore row if duplicate at CSV import

Tags: csv, postgresql

Is it possible to skip rows that fail during a CSV import? If a row cannot be imported for some reason (for example a duplicate primary key or a wrong input type), can it be ignored so that the import moves on to the next row?

I'm getting this error:

ERROR:  duplicate key value violates unique constraint "team_pkey"
DETAIL:  Key (team)=(DEN) already exists.
CONTEXT:  COPY team, line 23: "DEN,Denver,Rockets,A"

There are a lot of mistakes in the file and it's a pretty big one, so is it possible to ignore the rows that can't be inserted?

Asked by LefterisL, Oct 20 '22 16:10


1 Answer

A solution that handles the duplicate key issue is described in "To ignore duplicate keys during 'copy from' in postgresql". In short: COPY the file into an unconstrained temp table, then select distinct on (uniquefield) from that temp table into the destination table.
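The staging-table idea can be sketched in a few lines. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in so that it runs without a PostgreSQL server, the column names city, name and division are hypothetical (only the team key appears in the error message), and the sample rows are invented. SQLite has no DISTINCT ON, so the sketch keeps the first staged row per key via MIN(rowid) instead; the PostgreSQL form is shown in a comment.

```python
import csv
import io
import sqlite3

# Invented sample data; the DEN key is repeated on purpose.
CSV_TEXT = """DEN,Denver,Nuggets,A
LAL,Los Angeles,Lakers,B
DEN,Denver,Rockets,A
"""


def load_ignoring_duplicates(conn, csv_text):
    """Load CSV rows through an unconstrained staging table.

    Only the first row seen for each key reaches the constrained table.
    """
    cur = conn.cursor()
    # Destination table with the primary key that rejects duplicates.
    # Column names other than "team" are hypothetical.
    cur.execute(
        "CREATE TABLE team ("
        "team TEXT PRIMARY KEY, city TEXT, name TEXT, division TEXT)"
    )
    # Staging table: same columns, no constraints, so every row loads.
    cur.execute(
        "CREATE TEMP TABLE team_staging ("
        "team TEXT, city TEXT, name TEXT, division TEXT)"
    )
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    cur.executemany("INSERT INTO team_staging VALUES (?, ?, ?, ?)", rows)

    # In PostgreSQL this step would be roughly:
    #   INSERT INTO team
    #   SELECT DISTINCT ON (team) * FROM team_staging ORDER BY team;
    # SQLite stand-in: keep the row with the lowest rowid per key.
    cur.execute(
        "INSERT INTO team "
        "SELECT team, city, name, division FROM team_staging "
        "WHERE rowid IN (SELECT MIN(rowid) FROM team_staging GROUP BY team)"
    )
    conn.commit()
    return cur.execute("SELECT team, name FROM team ORDER BY team").fetchall()


if __name__ == "__main__":
    print(load_ignoring_duplicates(sqlite3.connect(":memory:"), CSV_TEXT))
```

Note that this only handles duplicate keys within the file. Rows with the wrong type would still need to be cast or filtered when copying from the staging table into the destination table.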

Another way would involve using pgLoader. Unfortunately the documentation seems to have disappeared from the website, but there are several tutorial articles on the author's site. It has rich functionality to help you read data with issues, and can do things like store rejected lines in a separate file, transform fields and so on.
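To make the "rejected lines in a separate file" idea concrete, here is a rough Python sketch of the same principle applied as a pre-processing step before the import. It is not pgLoader and does not use its configuration format; the function name split_rows, its parameters and the sample rows are all assumptions made for illustration.

```python
import csv
import io


def split_rows(lines, key_index=0, expected_fields=4):
    """Split CSV rows into accepted rows and rejected rows.

    A row is rejected when it has the wrong number of fields or when
    its key column repeats a key that was already accepted.
    Rejected entries are (line_number, row, reason) tuples.
    """
    seen = set()
    accepted, rejected = [], []
    for line_no, row in enumerate(csv.reader(lines), start=1):
        if len(row) != expected_fields:
            rejected.append((line_no, row, "wrong field count"))
        elif row[key_index] in seen:
            rejected.append((line_no, row, "duplicate key"))
        else:
            seen.add(row[key_index])
            accepted.append(row)
    return accepted, rejected


if __name__ == "__main__":
    # Invented sample: line 2 is too short, line 3 repeats the DEN key.
    sample = io.StringIO(
        "DEN,Denver,Nuggets,A\nLAL,Los Angeles\nDEN,Denver,Rockets,A\n"
    )
    good, bad = split_rows(sample)
    print("accepted:", good)
    print("rejected:", bad)
```

The accepted rows could then be written to a clean CSV for COPY, and the rejected ones to a separate file for later inspection.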

Something that may not be immediately obvious: pgLoader version 2 is written in Python, while version 3 is written in Lisp. Both can be obtained from the GitHub page.

Answered by fvu, Oct 27 '22 01:10