Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is it possible to turn off quote processing in the Postgres COPY command with CSV format?

I have CSV files, tab-separated, fields not wrapped in quotes, where field data can contain characters like single quotes, double quotes, pipes and backslashes.

Sample data can look like this:

1       2       "ba$aR\eR\       18 

I want to import this data into Postgres using the COPY statement.

When I try to import this using

COPY <tablename> FROM  <filename> NULL AS ''; 

I get an error psql:-:1: ERROR: missing data for column because Postgres is treating the backslash + tab as an "escaped tab" instead of a backslash followed by the field separator.

So I switched to using the "CSV format" of the COPY operator, like so:

COPY <tablename> FROM <filename> WITH CSV DELIMITER E'\t' NULL AS ''; 

Now there's a new error psql:-:1: ERROR: value too long for type character varying(254)

Apparently because it's interpreting the double-quote at the start of field 3 as the field wrapping character.

How can I specify that my data is NOT quoted at all?

like image 561
Tom De Leu Avatar asked Dec 05 '13 14:12

Tom De Leu


People also ask

How do I escape a quote from a CSV file?

By default, the escape character is a " (double quote) for CSV-formatted files. If you want to use a different escape character, use the ESCAPE clause of COPY , CREATE EXTERNAL TABLE or gpload to declare a different escape character.

Can CSV files have quotes?

You can put quotes, dashes and spaces in the CSV file. Fields that contain a special character (comma, newline, or double quote), must be enclosed in double quotes. If a field's value contains a double quote character it is escaped by placing another double quote character next to it.

How do you escape quotes in PostgreSQL?

Normally single and double quotes are commonly used with any text data in PostgreSQL. To ignore or escape the single quote is a common requirement of all database developers. By using double quotes and backslash we can avoid the complexity of single quotes as well as it is easy to read and maintain.

Why does my CSV file have quotation marks?

Quotation marks appear in CSV files as text qualifiers. This means, they function to wrap together text that should be kept as one value, versus what are distinct values that should be separated out.


1 Answers

Workaround (thanks to this comment!)

COPY <tablename> FROM <filename> WITH CSV DELIMITER E'\t' QUOTE E'\b' NULL AS ''; 

So basically specifying a quote character that should never be in the text, but that's pretty ugly.

I'd much prefer it if there was in fact a way to turn off quote processing altogether.

like image 79
Tom De Leu Avatar answered Sep 28 '22 06:09

Tom De Leu