Example CSV line:
"2012","Test User","ABC","First","71.0","","","0","0","3","3","0","0","","0","","","","","0.1","","4.0","0.1","4.2","80.8","847"
All values after "First" are numeric columns. Lots of the NULL values are written as quoted empty strings ("").
Attempt at COPY:
copy mytable from 'myfile.csv' with csv header quote '"';
NOPE: ERROR: invalid input syntax for type numeric: ""
Well, yeah. It's a null value. Attempt 2 at COPY:
copy mytable from 'myfile.csv' with csv header quote '"' null '""';
NOPE: ERROR: CSV quote character must not appear in the NULL specification
What's a fella to do? Strip out all double quotes from the file before running COPY? Can do that, but I figured there's a proper solution to what must be an incredibly common problem.
There are 2 accepted ways of escaping double quotes in a CSV file. One is to use 2 consecutive double quotes to denote 1 literal double quote in the data. The alternative is to put a backslash before a single double quote.
In PostgreSQL, double quotes (like "a red dog") are always used to denote delimited identifiers. In this context, an identifier is the name of an object within PostgreSQL, such as a table name or a column name. Delimited identifiers are identifiers that have a specifically marked beginning and end.
Quotes and double quotes should be escaped using \.
Specifies the ASCII quotation character in CSV mode. The default is double-quote.
While some database products treat an empty string as a NULL value, the standard says that they are distinct, and PostgreSQL treats them as distinct.
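To make that distinction concrete, here is a minimal sketch that can be run in psql. It uses only literals, so no table is assumed:

```sql
-- An empty string is a value, not NULL.
SELECT '' IS NULL AS empty_is_null;              -- false

-- NULLIF turns the empty string into a real NULL.
SELECT NULLIF('', '') IS NULL AS nullif_is_null; -- true

-- Casting the empty string directly fails with the same error COPY reports:
--   SELECT ''::numeric;
--   ERROR: invalid input syntax for type numeric: ""
```

This is why the first COPY attempt fails: the quoted "" is read as an empty string, and an empty string cannot be parsed as numeric.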
It would be best if you could generate your CSV file with an unambiguous representation of NULL. While you could use sed or a similar tool to filter the file into a suitable format, the other option would be to COPY the data into a staging table whose text columns can accept the empty strings, and then populate the target table from it. The NULLIF function may help with that: http://www.postgresql.org/docs/9.1/interactive/functions-conditional.html#FUNCTIONS-NULLIF. It returns NULL if both arguments are equal and the first argument otherwise. So, something like NULLIF(txtcol, '')::numeric might work for you.
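The staging-table approach could look roughly like the sketch below. The table mytable_staging and all column names are hypothetical, since the real layout of mytable is not shown in the question, and the column list is shortened to five for illustration. A real staging table needs one text column per CSV field, in file order, or COPY will reject the rows.

```sql
-- Hypothetical staging table: every column is text, so "" loads as an empty string.
CREATE TEMP TABLE mytable_staging (
    yr    text,
    name  text,
    code  text,
    label text,
    score text
);

-- Same COPY options as in the question; the quoted empty fields now load without error.
COPY mytable_staging FROM 'myfile.csv' WITH CSV HEADER QUOTE '"';

-- Convert empty strings to NULL before casting to the numeric target column.
-- The column names of mytable are assumed here, not taken from the question.
INSERT INTO mytable (yr, name, code, label, score)
SELECT yr, name, code, label, NULLIF(score, '')::numeric
FROM mytable_staging;
```

On PostgreSQL 9.4 and later, the FORCE_NULL option of COPY in CSV format may make the staging step unnecessary: it matches the listed columns against the null string even when the value is quoted, so with the default empty null string a quoted "" is loaded as NULL.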