We have a set of tables in Redshift with IDENTITY columns used for sequence generation. During the testing phase there is a requirement to take a backup and restore it, and this is a repetitive activity for each cycle of testing. We followed the processes below to take a backup and then restore, and faced the issues below:
It would be very helpful if you could suggest the best possible way to back up and restore in this scenario, or the best practices organizations follow.
Backing up your Redshift clusters
Automated backups are enabled by default when you create your Redshift cluster. Amazon will always attempt to maintain at least three copies of the data: the original and a replica on the compute nodes, and a backup in Amazon S3 (Simple Storage Service).
Data Durability: Amazon Redshift replicates your data within your data warehouse cluster and continuously backs up your data to Amazon S3, which is designed for eleven nines of durability. Amazon Redshift mirrors each drive's data to other nodes within your cluster.
Amazon Redshift is a fast, scalable, secure, and fully managed cloud data warehouse that makes it simple and cost-effective to analyze all your data using standard SQL and your existing ETL (extract, transform, and load), business intelligence (BI), and reporting tools.
The default retention period is one day, but you can modify it by using the Amazon Redshift console or programmatically by using the Amazon Redshift API or CLI. To disable automated snapshots, set the retention period to zero.
I would like to answer point by point, so this will be a bit long; please excuse me for that ;). In my opinion, the best option is UNLOAD to S3 and COPY back into the table from S3. Here, S3 could also be replaced with an EC2 host.
There are some problems listed in the question, but most of them are not real or can be avoided by supplying proper export/import parameters. I would like to provide all the necessary steps, with data, to show that there are no issues with dates and timestamps during load and unload.
Here I'm covering most of the relevant data types to prove my point.
create table sales(
    salesid integer not null identity(1,1),
    commission decimal(8,2),
    saledate date,
    description varchar(255),
    created_at timestamp default sysdate,
    updated_at timestamp);
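As a side note, here is a minimal sketch (not part of the backup/restore flow itself) showing what the IDENTITY column and the sysdate default do on a plain insert; the values below are illustrative only, not output from my cluster:
-- salesid and created_at are omitted on purpose: the IDENTITY column
-- and the sysdate default fill them in automatically.
insert into sales(commission, saledate, description, updated_at)
values (9.99, '2018-03-01', 'Identity check', '2018-05-17 23:54:51');
-- Inspect the generated values.
select salesid, created_at from sales order by salesid;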
Content of sales-example.txt (the data rows are pipe-delimited, which is the COPY default; the header row is skipped with IGNOREHEADER):
salesid,commission,saledate,description,created_at,updated_at
1|3.55|2018-12-10|Test description|2018-05-17 23:54:51|2018-05-17 23:54:51
2|6.55|2018-01-01|Test description|2018-05-17 23:54:51|2018-05-17 23:54:51
4|7.55|2018-02-10|Test description|2018-05-17 23:54:51|2018-05-17 23:54:51
5|3.55||Test description|2018-05-17 23:54:51|2018-05-17 23:54:51
7|3.50|2018-10-10|Test description|2018-05-17 23:54:51|2018-05-17 23:54:51
The COPY command below imports the dates and timestamps, as well as the explicit IDs.
copy sales(salesid,commission,saledate,description,created_at,updated_at)
from 's3://****/de***/sales-example.txt'
credentials 'aws_access_key_id=************;aws_secret_access_key=***********'
IGNOREHEADER 1 EXPLICIT_IDS;
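Right after the COPY, a quick sanity check can confirm the load; this is just a sketch, and the expected numbers come from the sample file above:
-- Confirm how many rows landed and that the explicit IDs were preserved.
select count(*) from sales;
select min(salesid), max(salesid) from sales;
-- Expected for the sample file: 5 rows, salesid 1 through 7 (with gaps).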
This will copy 5 records. For the unload, I'm using PARALLEL OFF to get the data in a single file purely to prove the point; it is not required and should normally be avoided.
unload ('select salesid,commission,saledate,description,created_at,updated_at from sales')
to 's3://assortdw/development/sales-example-2.txt'
credentials 'aws_access_key_id=***********;aws_secret_access_key=***********'
parallel off;
And below is the unloaded content, which is exactly the same as the imported data. This means that if I run the COPY command against any other environment, say dev or QA, I will get exactly the same records as in the original Redshift cluster (a reload sketch follows the data below).
5|3.55||Test description|2018-05-17 23:54:51|2018-05-17 23:54:51
1|3.55|2018-12-10|Test description|2018-05-17 23:54:51|2018-05-17 23:54:51
7|3.50|2018-10-10|Test description|2018-05-17 23:54:51|2018-05-17 23:54:51
2|6.55|2018-01-01|Test description|2018-05-17 23:54:51|2018-05-17 23:54:51
4|7.55|2018-02-10|Test description|2018-05-17 23:54:51|2018-05-17 23:54:51
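To reuse this file as a restore into another environment (dev or QA), here is a sketch of the reload, assuming the target cluster already has the same sales table and that the single unloaded file carries the usual slice suffix (000); the bucket path and credentials are the placeholders from above:
-- Start from an empty table in the target environment.
truncate table sales;
-- Reload the unloaded file; EXPLICIT_IDS keeps the original IDENTITY values,
-- and no IGNOREHEADER is needed because UNLOAD does not write a header row.
copy sales(salesid,commission,saledate,description,created_at,updated_at)
from 's3://assortdw/development/sales-example-2.txt000'
credentials 'aws_access_key_id=***********;aws_secret_access_key=***********'
EXPLICIT_IDS;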
I hope this answers the question and provides a starting point to discuss, summarize, and conclude. Everyone is most welcome to add their points.