
postgresql COPY and CSV data w/ double quotes

Example CSV line:

"2012","Test User","ABC","First","71.0","","","0","0","3","3","0","0","","0","","","","","0.1","","4.0","0.1","4.2","80.8","847" 

All values after "First" are numeric columns. Many of them are NULL, which the file represents as quoted empty strings ("").

Attempt at COPY:

copy mytable from 'myfile.csv' with csv header quote '"'; 

NOPE: ERROR: invalid input syntax for type numeric: ""

Well, yeah. It's a null value. Attempt 2 at COPY:

copy mytable from 'myfile.csv' with csv header quote '"' null '""'; 

NOPE: ERROR: CSV quote character must not appear in the NULL specification

What's a fella to do? Strip out all double quotes from the file before running COPY? Can do that, but I figured there's a proper solution to what must be an incredibly common problem.

Wells asked Apr 17 '12 17:04



1 Answer

While some database products treat an empty string as a NULL value, the SQL standard says that they are distinct, and PostgreSQL treats them as distinct.

It would be best if you could generate your CSV file with an unambiguous representation of NULL. You could use sed or a similar tool to filter the file into a suitable format; the other option is to COPY the data into a staging table whose columns are text, so they accept the empty strings, and then populate the target table from it. The NULLIF function may help with that: http://www.postgresql.org/docs/9.1/interactive/functions-conditional.html#FUNCTIONS-NULLIF. It returns NULL if both arguments are equal and the first argument otherwise. So something like NULLIF(txtcol, '')::numeric might work for you.
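
If filtering the file is the route you take, here is a minimal sketch of that step in Python rather than sed. It relies on COPY's CSV-mode default, where an unquoted empty field is read as NULL, so it rewrites the file with quotes only where they are needed. The function name unquote_empty_fields is made up for illustration, and note the trade-off: any text column that is genuinely an empty string will also load as NULL.

```python
import csv
import io


def unquote_empty_fields(src, dst):
    """Copy CSV rows from src to dst, quoting only fields that need it.

    Empty fields are written with no quotes, which COPY ... WITH CSV
    reads as NULL by default. (Hypothetical helper, not part of PostgreSQL.)
    """
    reader = csv.reader(src)
    # QUOTE_MINIMAL quotes a field only if it contains the delimiter,
    # the quote character, or a line break.
    writer = csv.writer(dst, quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    for row in reader:
        writer.writerow(row)


# Small in-memory demonstration using part of the line from the question.
sample = '"2012","Test User","ABC","First","71.0","","","0"\n'
out = io.StringIO()
unquote_empty_fields(io.StringIO(sample), out)
print(out.getvalue(), end="")  # 2012,Test User,ABC,First,71.0,,,0
```

For real files, open both with newline="" (as the csv module documentation recommends) and pass the file objects in; the original COPY command with csv header should then accept the rewritten file without a NULL clause.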

kgrittn answered Oct 20 '22 10:10
