I have a CSV file with some integer columns whose missing values are saved as "" (a quoted empty string).
I want to COPY them into a table as NULL values.
In Java, I have tried these:
String sql = "COPY " + tableName + " FROM STDIN (FORMAT csv,DELIMITER ',', HEADER true)";
String sql = "COPY " + tableName + " FROM STDIN (FORMAT csv, DELIMITER ',', NULL '', HEADER true)";
I get: PSQLException: ERROR: invalid input syntax for type numeric: ""
String sql = "COPY " + tableName + " FROM STDIN (FORMAT csv, DELIMITER ',', NULL '\"\"', HEADER true)";
I get: PSQLException: ERROR: CSV quote character must not appear in the NULL specification
Has anyone done this before?
Oracle reads empty strings as NULLs, while PostgreSQL treats them as empty. Concatenating NULL values with non-NULL characters results in that character in Oracle, but NULL in PostgreSQL.
Example with an INSERT statement:
INSERT INTO contacts (first_name, last_name)
SELECT first_name, last_name
FROM employees
WHERE employee_number IS NULL;
This PostgreSQL IS NULL example will insert records into the contacts table where the employee_number contains a NULL value.
The Java programming language distinguishes between null and empty strings. An empty string is a string instance of zero length, whereas a null string has no value at all.
PostgreSQL's COPY handles this by quoting. A NULL is output as the NULL string and is not quoted, while a data value matching the NULL string is quoted. Therefore, using the default settings, a NULL is written as an unquoted empty string, while an empty string is written with double quotes (""). Reading values follows similar rules.
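To make the reading rule concrete, here is a minimal Java sketch of how a single raw CSV field would be interpreted under the default settings described above. The class and method names are hypothetical, and it only models the default NULL string (unquoted empty) and the default QUOTE character ("); it is not how the server is implemented.

```java
import java.util.Optional;

/** Illustrative model of COPY's default CSV rules for NULL vs. empty string. */
final class CsvNullRules {

    /**
     * Interprets one raw CSV field exactly as it appears between delimiters.
     * Returns Optional.empty() for NULL, otherwise the decoded text value.
     */
    static Optional<String> decodeField(String raw) {
        if (raw.isEmpty()) {
            // Nothing between the delimiters: matches the default NULL string.
            return Optional.empty();
        }
        if (raw.length() >= 2 && raw.startsWith("\"") && raw.endsWith("\"")) {
            // Quoted value: strip the outer quotes and unescape doubled quotes.
            String inner = raw.substring(1, raw.length() - 1);
            return Optional.of(inner.replace("\"\"", "\""));
        }
        // Unquoted, non-empty value is taken literally.
        return Optional.of(raw);
    }
}
```

Under this model a field written as "" decodes to an empty string rather than NULL, which is why it is rejected for an integer or numeric column.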
COPY also has an option called FORCE_NOT_NULL: values in the listed columns are not matched against the null string, so with the default settings an unquoted empty value is read as an empty string rather than NULL. Here is an example of how it works:
=# CREATE TABLE aa (a text);
CREATE TABLE
=# \COPY aa FROM STDIN WITH (FORMAT csv, FORCE_NOT_NULL (a));
Enter data to be copied followed by a newline.
I assume you are aware that numeric data types have no concept of "empty string" (''). It's either a number or NULL (or 'NaN' for numeric - but not for integer et al.)
Looks like you exported from a string data type like text and had some actual empty string in there - which are now represented as "" - " being the default QUOTE character in CSV format.
NULL would be represented by nothing, not even quotes. The manual:
NULL: Specifies the string that represents a null value. The default is \N (backslash-N) in text format, and an unquoted empty string in CSV format.
You cannot define "" to generally represent NULL since that already represents an empty string. Would be ambiguous.
To fix, I see two options:
1. Edit the CSV file / stream before feeding to COPY and replace "" with nothing. Might be tricky if you have actual empty string in there as well - or "" escaping literal " inside strings.
2. (What I would do.) Import to an auxiliary temporary table with identical structure except for the integer column converted to text. Then INSERT (or UPSERT?) to the target table from there, converting the integer value properly on the fly:
-- empty temp table with identical structure
CREATE TEMP TABLE tbl_tmp AS TABLE tbl LIMIT 0;
-- ... except for the int / text column
ALTER TABLE tbl_tmp ALTER col_int TYPE text;
COPY tbl_tmp ...;
INSERT INTO tbl -- identical number and names of columns guaranteed
SELECT col1, col2, NULLIF(col_int, '')::int -- list all columns in order here
FROM tbl_tmp;
Temporary tables are dropped at the end of the session automatically. If you run this multiple times in the same session, either just truncate the existing temp table or drop it after each transaction.
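For the first option (editing the stream before COPY), a rough Java sketch might look like the following. The class and method names are hypothetical. It assumes the default QUOTE character (") and that every field consisting of exactly "" should become NULL, which also affects text columns that hold genuine empty strings. It tracks quoting state so that "" used as an escaped quote inside a quoted value is left alone.

```java
/** Hypothetical pre-processing step: turn fields that are exactly "" into nothing. */
final class EmptyQuotedFieldStripper {

    static String stripEmptyQuotedFields(String csv, char delimiter) {
        StringBuilder out = new StringBuilder(csv.length());
        boolean inQuotes = false;
        int fieldStart = 0; // index in 'out' where the current field begins

        for (int i = 0; i < csv.length(); i++) {
            char c = csv.charAt(i);
            if (c == '"') {
                if (inQuotes && i + 1 < csv.length() && csv.charAt(i + 1) == '"') {
                    // Doubled quote inside a quoted value: an escaped literal quote.
                    out.append("\"\"");
                    i++;
                    continue;
                }
                inQuotes = !inQuotes;
                out.append(c);
            } else if (!inQuotes && (c == delimiter || c == '\n' || c == '\r')) {
                // End of a field: drop it if it was exactly "".
                dropIfEmptyQuoted(out, fieldStart);
                out.append(c);
                fieldStart = out.length();
            } else {
                out.append(c);
            }
        }
        // Handle the last field when the input has no trailing newline.
        dropIfEmptyQuoted(out, fieldStart);
        return out.toString();
    }

    private static void dropIfEmptyQuoted(StringBuilder out, int fieldStart) {
        if (out.length() - fieldStart == 2
                && out.charAt(fieldStart) == '"'
                && out.charAt(fieldStart + 1) == '"') {
            out.setLength(fieldStart);
        }
    }
}
```

The cleaned text can then be handed to COPY ... FROM STDIN in place of the original stream. For large files a streaming variant would be preferable to building the whole result in memory.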
Since Postgres 9.4 you have the ability to use FORCE_NULL. It matches the values of the listed columns against the null string even if they are quoted, so with the default null string a quoted empty string ("") is converted into NULL. Very handy, especially with CSV files (actually this is only allowed when using CSV format).
The syntax is as follows:
COPY table FROM '/path/to/file.csv'
WITH (FORMAT CSV, DELIMITER ';', FORCE_NULL (columnname));
Further details are explained in the documentation: https://www.postgresql.org/docs/current/sql-copy.html
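Applied to the Java code from the question, the statement string could be built along these lines. This is only a sketch: the class and method names are hypothetical, and like the original code it concatenates the table and column names directly, so they must come from a trusted source.

```java
import java.util.List;

/** Hypothetical helper that adds FORCE_NULL to the COPY statement from the question. */
final class CopyStatements {

    static String copyFromStdinForcingNull(String tableName, List<String> forceNullColumns) {
        // FORCE_NULL takes a parenthesised, comma-separated column list.
        return "COPY " + tableName
                + " FROM STDIN (FORMAT csv, DELIMITER ',', HEADER true, FORCE_NULL ("
                + String.join(", ", forceNullColumns) + "))";
    }
}
```

For example, copyFromStdinForcingNull("tbl", List.of("col_int")) yields a statement that reads "" in col_int as NULL, assuming a server version of 9.4 or later.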