
Redshift DEFAULT GETDATE() working on INSERT but not COPY

I have a column with a default constraint in my Redshift table so that the current timestamp will be populated for it.

CREATE TABLE test_table(
    ...
    etl_date_time timestamp DEFAULT GETDATE(),
    ...
);

This works as expected on INSERT, but I still get NULL values when I COPY a JSON file from S3 that has no key for this column:

COPY test_table FROM 's3://bucket/test_file.json' 
CREDENTIALS '...' FORMAT AS JSON 'auto';

-- There shouldn't be any NULLs here, but there are
select count(*) from test_table where etl_date_time is null;

I have also tried setting the key to null in the source JSON, but that produced NULL values in the table as well:

{
    ...
    "etl_date_time": null,
    ...
}
asked Jul 27 '16 17:07 by csab



1 Answer

If the field is always NULL, consider omitting it from the files in S3 altogether. COPY lets you specify the columns you intend to load and populates the remaining ones with their DEFAULT values.

So for the file data.json:

{"col1":"r1_val1", "col3":"r1_val2"}
{"col1":"r2_val1", "col3":"r2_val2"}

And the table definition:

create table _test (
    col1 varchar(20)
  , col2 timestamp default getdate()
  , col3 varchar(20)
);

Specific column names

A COPY command with explicit column names:

copy _test(col1,col3) from 's3://bucket/data.json' format as json 'auto';

yields the following result, with col2 filled in from its DEFAULT:

db=# select * from _test;
  col1   |        col2         |  col3
---------+---------------------+---------
 r1_val1 | 2016-07-27 18:27:08 | r1_val2
 r2_val1 | 2016-07-27 18:27:08 | r2_val2
(2 rows)

Omitted column names

If the column names are omitted:

copy _test from 's3://bucket/data.json' format as json 'auto';

the DEFAULT is not applied and NULL is inserted instead:

db=# select * from _test;
  col1   |        col2         |  col3
---------+---------------------+---------
 r1_val1 |                     | r1_val2
 r2_val1 |                     | r2_val2
(2 rows)
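
For rows that were already loaded with NULL, one possible cleanup is to backfill the column afterwards. This is only a sketch against the _test table above and was not part of the original test. Note that it stamps the time of the UPDATE, not the time of the original COPY.

```sql
-- Count the rows that COPY loaded without a timestamp.
select count(*) from _test where col2 is null;

-- Backfill them with the current timestamp
-- (the time of this UPDATE, not of the earlier COPY).
update _test set col2 = getdate() where col2 is null;
```

If the load time itself matters, re-running the COPY with an explicit column list, as in the first variant, is likely the better fit.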
answered Sep 22 '22 23:09 by moertel