
PostgreSQL COPY: empty string as NULL does not work

I have a CSV file with an integer column whose missing values are saved as "" (empty string).

I want to COPY them into a table as NULL values.

In Java, I have tried these:

String sql = "COPY " + tableName + " FROM STDIN (FORMAT csv,DELIMITER ',',  HEADER true)";
String sql = "COPY " + tableName + " FROM STDIN (FORMAT csv, DELIMITER ',', NULL '', HEADER true)";

I get: PSQLException: ERROR: invalid input syntax for type numeric: ""

String sql = "COPY " + tableName + " FROM STDIN (FORMAT csv, DELIMITER ',', NULL '\"\"', HEADER true)";

I get: PSQLException: ERROR: CSV quote character must not appear in the NULL specification

Has anyone done this before?

Hieudien asked Aug 26 '17 06:08



2 Answers

I assume you are aware that numeric data types have no concept of "empty string" (''). It's either a number or NULL (or 'NaN' for numeric, but not for integer et al.).

It looks like you exported from a string data type like text and had some actual empty strings in there, which are now represented as "" (" being the default QUOTE character in CSV format).

NULL would be represented by nothing, not even quotes. The manual:

NULL

Specifies the string that represents a null value. The default is \N (backslash-N) in text format, and an unquoted empty string in CSV format.

You cannot define "" to generally represent NULL, since that already represents an empty string. It would be ambiguous.

To fix, I see two options:

  1. Edit the CSV file / stream before feeding it to COPY and replace "" with nothing. This might be tricky if you have actual empty strings in there as well, or "" escaping a literal " inside strings.

  2. (What I would do.) Import to an auxiliary temporary table with identical structure except for the integer column converted to text. Then INSERT (or UPSERT?) to the target table from there, converting the integer value properly on the fly:

-- empty temp table with identical structure
CREATE TEMP TABLE tbl_tmp AS TABLE tbl LIMIT 0;

-- ... except for the int / text column
ALTER TABLE tbl_tmp ALTER col_int TYPE text;

COPY tbl_tmp ...;

INSERT INTO tbl  -- identical number and names of columns guaranteed
SELECT col1, col2, NULLIF(col_int, '')::int  -- list all columns in order here
FROM   tbl_tmp;

Temporary tables are dropped at the end of the session automatically. If you run this multiple times in the same session, either just truncate the existing temp table or drop it after each transaction.
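
Option 1 above (editing the CSV stream before COPY) could look roughly like the following in Java, since the question uses Java. This is only a minimal sketch: the class and method names are made up for illustration, and it assumes the default CSV QUOTE character (") and that every quoted empty field in the data should become NULL. It walks the text character by character, so "" used as an escaped quote inside a quoted string is left alone and only a field consisting of exactly "" is blanked.

```java
// Hypothetical helper, not part of any library: rewrites fields that are
// exactly "" (quoted empty string) to an unquoted empty field, which COPY
// in CSV format reads as NULL by default.
final class CsvNullPrep {

    static String blankQuotedEmptyFields(String csv, char delimiter) {
        StringBuilder out = new StringBuilder(csv.length());
        boolean inQuotes = false;      // currently inside a quoted field
        boolean atFieldStart = true;   // next character starts a new field
        int i = 0;
        int n = csv.length();
        while (i < n) {
            char c = csv.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < n && csv.charAt(i + 1) == '"') {
                        out.append("\"\"");   // escaped quote inside a string: keep
                        i += 2;
                        continue;
                    }
                    inQuotes = false;         // closing quote of the field
                }
                out.append(c);
                i++;
            } else if (atFieldStart && c == '"'
                    && i + 1 < n && csv.charAt(i + 1) == '"'
                    && isFieldEnd(csv, i + 2, delimiter)) {
                i += 2;                       // whole field is "": drop both quotes
                atFieldStart = false;
            } else {
                if (c == '"') {
                    inQuotes = true;          // opening quote of a non-empty field
                }
                atFieldStart = (c == delimiter || c == '\n' || c == '\r');
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }

    // True if position pos is the end of input, a delimiter, or a line break.
    private static boolean isFieldEnd(String s, int pos, char delimiter) {
        if (pos >= s.length()) {
            return true;
        }
        char c = s.charAt(pos);
        return c == delimiter || c == '\n' || c == '\r';
    }

    public static void main(String[] args) {
        String input = "1,\"\",3\n\"\",\"a,\"\"b\"\"\",\"\"\n";
        System.out.print(blankQuotedEmptyFields(input, ','));
    }
}
```

Note that this blanks quoted empty fields in every column, so genuine empty strings in text columns would be loaded as NULL as well. If that matters, the temporary-table approach is the safer choice.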

Related:

  • How to update selected rows with values from a CSV file in Postgres?
  • Rails Migrations: tried to change the type of column from string to integer
  • postgresql thread safety for temporary tables
Erwin Brandstetter answered Nov 03 '22 05:11


Since Postgres 9.4 you can use FORCE_NULL. This causes a quoted empty string in the listed columns to be converted into NULL. Very handy, especially with CSV files (actually this is only allowed when using CSV format).

The syntax is as follows:

COPY table FROM '/path/to/file.csv' 
WITH (FORMAT CSV, DELIMITER ';', FORCE_NULL (columnname));

Further details are explained in the documentation: https://www.postgresql.org/docs/current/sql-copy.html
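
Applied to the Java statement from the question, the option could be added to the COPY ... FROM STDIN string along these lines. This is a hedged sketch only: the class name, method name, and the column name col_int are placeholders, and the table and column names are concatenated unescaped, exactly as in the question, so they must come from trusted input.

```java
import java.util.List;

// Hypothetical helper that only builds the SQL text; it does not talk to a database.
final class CopySql {

    // Builds a COPY ... FROM STDIN statement in CSV format where quoted empty
    // strings in the given columns are read as NULL (requires Postgres 9.4+).
    static String copyFromStdin(String tableName, List<String> forceNullColumns) {
        String columns = String.join(", ", forceNullColumns);
        return "COPY " + tableName
                + " FROM STDIN (FORMAT csv, DELIMITER ',', HEADER true,"
                + " FORCE_NULL (" + columns + "))";
    }

    public static void main(String[] args) {
        // "tbl" and "col_int" are placeholder names.
        System.out.println(copyFromStdin("tbl", List.of("col_int")));
    }
}
```

Assuming the PostgreSQL JDBC driver is in use (the PSQLException in the question suggests so), the resulting string would then be passed to something like CopyManager.copyIn(sql, reader) together with a Reader over the CSV data.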

moojen answered Nov 03 '22 05:11