 

Unloading from Redshift to S3 with headers

I already know how to unload data from Redshift into S3 as a single file. I need to know how to unload it with the column headers included. Can anyone please help or give me a clue?

I don't want to have to do it manually in shell or Python.

asked Jul 10 '14 by Tokunbo Hiamang



2 Answers

As of cluster version 1.0.3945, Redshift now supports unloading data to S3 with a header row in each file:

UNLOAD('select column1, column2 from mytable;') TO 's3://bucket/prefix/' IAM_ROLE '<role arn>' HEADER; 

Note: you can't use the HEADER option in conjunction with FIXEDWIDTH.

https://docs.aws.amazon.com/redshift/latest/dg/r_UNLOAD.html
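
For example, a minimal sketch of a single-file CSV unload with headers (the bucket name, prefix, and role ARN below are placeholders; PARALLEL OFF is only needed if you want one output file, which caps it at 6.2 GB):

    -- Sketch: single CSV file with a header row; all names are placeholders
    UNLOAD ('select column1, column2 from mytable')
    TO 's3://my-bucket/my-prefix/'
    IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftRole'
    HEADER
    CSV
    PARALLEL OFF;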

answered Sep 24 '22 by fez


If any of your columns are non-character, you need to explicitly cast them to char or varchar, because the UNION with the character header row forces every column to a common type.

Here is an example of the full statement that will create a file in S3 with the headers in the first row.

The output will be a single gzipped CSV file with quoted fields.

This example assumes numeric values in column_1. With ORDER BY 1 DESC, the header literal 'column_1' sorts above the stringified numeric values (letters sort after digits), so the header lands in row 1 of the S3 file. If your first column holds different data, you will need to adjust the ORDER BY clause to a column that sorts the header row first.

    /* Redshift export to S3 CSV single file with headers - limit 6.2GB */
    UNLOAD ('
        SELECT \'column_1\', \'column_2\'
        UNION
        SELECT
            CAST(column_1 AS varchar(255)) AS column_1,
            CAST(column_2 AS varchar(255)) AS column_2
        FROM source_table_for_export_to_s3
        ORDER BY 1 DESC;
    ')
    TO 's3://bucket/path/file_name_for_table_export_in_s3_'
    credentials 'aws_access_key_id=<key_with_no_<>_brackets>;aws_secret_access_key=<secret_access_key_with_no_<>_brackets>'
    PARALLEL OFF
    ESCAPE
    ADDQUOTES
    DELIMITER ','
    ALLOWOVERWRITE
    GZIP;
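
If the first column of your data isn't numeric, a variant of the same trick (a sketch only, reusing the hypothetical table and column names above; ord is a helper column introduced here) adds an explicit sort key in a subquery so the header row always sorts first without appearing in the output:

    /* Variant sketch: explicit sort key guarantees the header is row 1 */
    UNLOAD ('
        SELECT column_1, column_2
        FROM (
            SELECT 1 AS ord, \'column_1\' AS column_1, \'column_2\' AS column_2
            UNION ALL
            SELECT 2 AS ord,
                   CAST(column_1 AS varchar(255)),
                   CAST(column_2 AS varchar(255))
            FROM source_table_for_export_to_s3
        ) AS t
        ORDER BY ord
    ')
    TO 's3://bucket/path/file_name_for_table_export_in_s3_'
    IAM_ROLE '<role arn>'
    PARALLEL OFF
    ESCAPE
    ADDQUOTES
    DELIMITER ','
    ALLOWOVERWRITE
    GZIP;

Because ORDER BY may reference a column of the subquery that isn't in the select list, ord orders the rows but is not written to the file.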
answered Sep 23 '22 by Douglas Hackney