 

Unloading from Redshift to S3 with headers

I already know how to unload data from Redshift into S3 as a single file. I need to know how to unload it with the column headers included. Can anyone please help or give me a clue?

I don't want to have to do it manually in shell or Python.

asked Jul 10 '14 by Tokunbo Hiamang



2 Answers

As of cluster version 1.0.3945, Redshift now supports unloading data to S3 with a header row in each file:

UNLOAD('select column1, column2 from mytable;') TO 's3://bucket/prefix/' IAM_ROLE '<role arn>' HEADER; 

Note: you can't use the HEADER option in conjunction with FIXEDWIDTH.

https://docs.aws.amazon.com/redshift/latest/dg/r_UNLOAD.html
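
For example, a minimal sketch of a single-file CSV unload with headers (the bucket name, prefix, and role ARN below are placeholders; PARALLEL OFF is only needed if you want one output file, which caps it at 6.2 GB):

    -- Sketch: single CSV file with a header row; all names are placeholders
    UNLOAD ('select column1, column2 from mytable')
    TO 's3://my-bucket/my-prefix/'
    IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftRole'
    HEADER
    CSV
    PARALLEL OFF;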

answered Sep 24 '22 by fez


If any of your columns are non-character, you need to explicitly cast them to char or varchar, because the UNION with the character header row forces every column to a common type.

Here is an example of the full statement that will create a file in S3 with the headers in the first row.

The output will be a single gzipped CSV file with quoted fields.

This example assumes numeric values in column_1. With ORDER BY 1 DESC, the header literal 'column_1' sorts above the stringified numeric values (letters sort after digits), so the header lands in row 1 of the S3 file. If your first column holds different data, you will need to adjust the ORDER BY clause to a column that sorts the header row first.

    /* Redshift export to S3 CSV single file with headers - limit 6.2GB */
    UNLOAD ('
        SELECT \'column_1\', \'column_2\'
        UNION
        SELECT
            CAST(column_1 AS varchar(255)) AS column_1,
            CAST(column_2 AS varchar(255)) AS column_2
        FROM source_table_for_export_to_s3
        ORDER BY 1 DESC;
    ')
    TO 's3://bucket/path/file_name_for_table_export_in_s3_'
    credentials 'aws_access_key_id=<key_with_no_<>_brackets>;aws_secret_access_key=<secret_access_key_with_no_<>_brackets>'
    PARALLEL OFF
    ESCAPE
    ADDQUOTES
    DELIMITER ','
    ALLOWOVERWRITE
    GZIP;
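
If the first column of your data isn't numeric, a variant of the same trick (a sketch only, reusing the hypothetical table and column names above; ord is a helper column introduced here) adds an explicit sort key in a subquery so the header row always sorts first without appearing in the output:

    /* Variant sketch: explicit sort key guarantees the header is row 1 */
    UNLOAD ('
        SELECT column_1, column_2
        FROM (
            SELECT 1 AS ord, \'column_1\' AS column_1, \'column_2\' AS column_2
            UNION ALL
            SELECT 2 AS ord,
                   CAST(column_1 AS varchar(255)),
                   CAST(column_2 AS varchar(255))
            FROM source_table_for_export_to_s3
        ) AS t
        ORDER BY ord
    ')
    TO 's3://bucket/path/file_name_for_table_export_in_s3_'
    IAM_ROLE '<role arn>'
    PARALLEL OFF
    ESCAPE
    ADDQUOTES
    DELIMITER ','
    ALLOWOVERWRITE
    GZIP;

Because ORDER BY may reference a column of the subquery that isn't in the select list, ord orders the rows but is not written to the file.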
answered Sep 23 '22 by Douglas Hackney