
Exporting an AWS Postgres RDS Table to AWS S3

I want to use AWS Data Pipeline to move data from a Postgres RDS instance to AWS S3. Does anybody know how this is done?

More precisely, I want to export a Postgres table to S3 using Data Pipeline. I am using Data Pipeline because I want to automate the process: the export will run once a week.

Other suggestions are also welcome.

error2007s asked Oct 06 '16 14:10




2 Answers

You can now do this from within Postgres itself, using the aws_s3.query_export_to_s3 function: https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/postgresql-s3-export.html
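As a rough sketch of what that call can look like from a script, the Python below builds the SQL statement and its parameters for a DB-API driver. The table, bucket, path and region are hypothetical placeholders, and the helper name build_s3_export is made up for illustration. It assumes the aws_s3 extension is installed in the database and that the instance has an IAM role allowing writes to the bucket.

```python
def build_s3_export(query, bucket, file_path, region, options="format csv"):
    """Return (sql, params) suitable for a DB-API cursor.execute() call.

    aws_commons.create_s3_uri describes the target object, and
    aws_s3.query_export_to_s3 runs the query and writes the result there.
    """
    sql = (
        "SELECT * FROM aws_s3.query_export_to_s3("
        "%s, aws_commons.create_s3_uri(%s, %s, %s), options := %s)"
    )
    return sql, (query, bucket, file_path, region, options)


# Hypothetical values: replace with your own table, bucket, key and region.
sql, params = build_s3_export(
    "SELECT * FROM my_table",
    "my-export-bucket",
    "exports/my_table.csv",
    "eu-west-1",
)

# With a driver such as psycopg2 (not imported here), this would be run as:
#     cursor.execute(sql, params)
print(sql)
print(params)
```

Passing the values as parameters rather than formatting them into the string leaves quoting to the driver. To get the weekly run, the script still has to be triggered by some scheduler of your choice.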

user433342 answered Sep 17 '22 12:09


You can define a CopyActivity in the Data Pipeline interface to extract data from a Postgres RDS instance into S3.

  1. Create a data node of type SqlDataNode. Specify the table name and a select query.
  2. Set up the database connection by specifying the RDS instance ID (the first part of your instance endpoint, e.g. your-instance-id.xxxxx.eu-west-1.rds.amazonaws.com) along with the username, password and database name.
  3. Create a data node of type S3DataNode.
  4. Create a CopyActivity and set the SqlDataNode as its input and the S3DataNode as its output.
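The four steps above could also be written as a pipeline definition instead of being clicked together in the console. The Python below is a minimal sketch that assembles such a definition as a dictionary and prints it as JSON. All ids and values are hypothetical, the weekly Schedule and the Ec2Resource are additions not mentioned in the steps, and IAM roles and other defaults are left out, so treat it as a fragment and check the field names against the Data Pipeline object reference before using it.

```python
import json


def build_pipeline_definition(instance_id, db_name, username, password, table, s3_dir):
    """Assemble a Data Pipeline definition that copies one table to S3 weekly."""
    schedule = {"ref": "WeeklySchedule"}
    objects = [
        # Assumption: run once a week, starting when the pipeline is activated.
        {"id": "WeeklySchedule", "type": "Schedule",
         "period": "1 weeks", "startAt": "FIRST_ACTIVATION_DATE_TIME"},
        # Step 2: the database connection.
        {"id": "SourceDatabase", "type": "RdsDatabase",
         "rdsInstanceId": instance_id, "databaseName": db_name,
         "username": username, "*password": password},
        # Step 1: the table and select query to read from.
        {"id": "SourceTable", "type": "SqlDataNode",
         "database": {"ref": "SourceDatabase"}, "table": table,
         "selectQuery": "select * from #{table}", "schedule": schedule},
        # Step 3: where the export is written.
        {"id": "ExportTarget", "type": "S3DataNode",
         "directoryPath": s3_dir, "schedule": schedule},
        # Assumption: the copy runs on an EC2 instance managed by the pipeline.
        {"id": "ExportInstance", "type": "Ec2Resource", "schedule": schedule},
        # Step 4: wire input to output.
        {"id": "ExportActivity", "type": "CopyActivity",
         "input": {"ref": "SourceTable"}, "output": {"ref": "ExportTarget"},
         "runsOn": {"ref": "ExportInstance"}, "schedule": schedule},
    ]
    return {"objects": objects}


# Hypothetical values for illustration only.
definition = build_pipeline_definition(
    "your-instance-id", "mydb", "export_user", "change-me",
    "my_table", "s3://my-export-bucket/exports/",
)
print(json.dumps(definition, indent=2))
```

Keeping the definition in a file like this makes the weekly export reproducible, since the same JSON can be kept under version control and re-imported when the pipeline has to be recreated.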

Another option is to use an external tool such as Alooma. Alooma can replicate tables from a PostgreSQL database hosted on Amazon RDS to Amazon S3 (https://www.alooma.com/integrations/postgresql/s3). The process can be automated to run once a week.

Dina Kaiser answered Sep 20 '22 12:09