Is there a way to do a SQL dump from Amazon Redshift?
Could you use the SQL Workbench/J client?
The first method of extracting data from AWS Redshift through SQL involves transfers to Amazon S3 files, part of Amazon Web Services. You run the process by unloading AWS data into S3 buckets and then using SSIS (SQL Server Integration Services) to copy the data into SQL Server.
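As a minimal sketch of that first step, the UNLOAD statement you'd run from any SQL client looks something like this (the table name, bucket path, and KEY/SECRET credentials are all placeholders to substitute):
unload ('select * from my_schema.my_table')
to 's3://my-bucket/exports/my_table_'
credentials 'aws_access_key_id=KEY;aws_secret_access_key=SECRET'
delimiter as ','
gzip
allowoverwrite;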
Amazon Redshift is built around industry-standard SQL, with added functionality to manage very large datasets and to support high-performance analysis and reporting of that data.
Amazon Redshift supports SQL client tools connecting through Java Database Connectivity (JDBC) and Open Database Connectivity (ODBC). Amazon Redshift doesn't provide or install any SQL client tools or libraries, so you must install them on your client computer or Amazon EC2 instance to use them.
pg_dump of schemas may not have worked in the past, but it does now (-C adds the CREATE DATABASE statement, -s dumps the schema only, no data):
pg_dump -Cs -h my.redshift.server.com -p 5439 database_name > database_name.sql
CAVEAT EMPTOR: pg_dump still produces some Postgres-specific syntax, and also neglects the Redshift SORTKEY and DISTSTYLE definitions for your tables.
Another decent option is to use the published AWS admin script views for generating your DDL. They handle SORTKEY/DISTSTYLE, but I've found them buggy when it comes to capturing all FOREIGN KEYs, and they don't handle table permissions/owners. Your mileage may vary.
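For example, a minimal sketch using the v_generate_tbl_ddl view from the AWS amazon-redshift-utils repository (this assumes you've created the view in an admin schema as that repo suggests; the schema and table names are placeholders):
select ddl
from admin.v_generate_tbl_ddl
where schemaname = 'my_schema'
  and tablename = 'my_table'
order by seq;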
To get a dump of the data itself, you still need to run the UNLOAD command on each table, unfortunately.
Here's a way to generate those statements. Be aware that select * syntax will fail if your destination table does not have the same column order as your source table:
select
    ist.table_schema,
    ist.table_name,
    'unload (''select col1,col2,etc from "' || ist.table_schema || '"."' || ist.table_name || '"'')
to ''s3://SOME/FOLDER/STRUCTURE/' || ist.table_schema || '.' || ist.table_name || '__''
credentials ''aws_access_key_id=KEY;aws_secret_access_key=SECRET''
delimiter as '',''
gzip
escape
addquotes
null as ''''
--encrypted
--parallel off
--allowoverwrite
;'
from information_schema.tables ist
-- skip the system schemas; everything else gets an UNLOAD statement
where ist.table_schema not in ('pg_catalog', 'information_schema')
order by ist.table_schema, ist.table_name
;
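For a hypothetical table public.users, the third column of each result row would come out looking roughly like this, ready to run once you substitute the real column list for col1,col2,etc:
unload ('select col1,col2,etc from "public"."users"')
to 's3://SOME/FOLDER/STRUCTURE/public.users__'
credentials 'aws_access_key_id=KEY;aws_secret_access_key=SECRET'
delimiter as ','
gzip
escape
addquotes
null as ''
--encrypted
--parallel off
--allowoverwrite
;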
We are currently using Workbench/J successfully with Redshift.
Regarding dumps, there is no schema export tool available in Redshift at the moment (pg_dump doesn't work), although data can always be extracted via queries.
Hope it helps.
EDIT: Remember that things like sort and distribution keys are not reflected in the code generated by Workbench/J. Take a look at the system table pg_table_def to see info on every field; it states whether a field is a sortkey or distkey, and similar info. Documentation on that table:
http://docs.aws.amazon.com/redshift/latest/dg/r_PG_TABLE_DEF.html
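As a quick sketch, a query like this pulls that info (the table name is a placeholder; note that pg_table_def only shows tables in schemas on your search_path, and that column is a reserved word and needs quoting):
select "column", type, encoding, distkey, sortkey
from pg_table_def
where tablename = 'my_table';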
Yes, you can do this in several ways.
UNLOAD() to an S3 bucket - that's the best option. You can get your data onto almost any other machine from there. (More info here: http://docs.aws.amazon.com/redshift/latest/dg/r_UNLOAD.html)
Pipe the contents of your table to a data file using the Linux instance you have. So, running:
$> psql -t -A -F 'your_delimiter' -h 'hostname' -p 5439 -d 'database' -U 'user' -c "select * from myTable" >> /home/userA/tableDataFile
will do the trick for you. (Note that Redshift listens on port 5439 by default, while psql assumes 5432.)