I want to use AWS Data Pipeline to pipe data from a Postgres RDS instance to AWS S3. Does anybody know how this is done?
More precisely, I want to export a Postgres table to AWS S3 using Data Pipeline. The reason I am using Data Pipeline is that I want to automate this process, and the export is going to run once every week.
Any other suggestions will also work.
You can also use Amazon RDS stored procedures to list and delete files on the RDS instance, although note that this S3 integration applies to RDS for SQL Server rather than Postgres. The files that you download from and upload to S3 are stored in the D:\S3 folder, which is the only folder you can use to access your files.
To upload folders and files to an S3 bucket: sign in to the AWS Management Console and open the Amazon S3 console at https://console.aws.amazon.com/s3/. In the Buckets list, choose the name of the bucket that you want to upload your folders or files to, then choose Upload.
Choose the bucket, open its Object overview page, and then choose Properties. Make a note of the bucket name, path, AWS Region, and file type. You need the Amazon Resource Name (ARN) later to set up access to Amazon S3 through an IAM role. For more information, see Setting up access to an Amazon S3 bucket.
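If you want to script that upload instead of clicking through the console, a minimal boto3 sketch would be the following; the bucket name, object key, and local file path are placeholders:

```python
import boto3

s3 = boto3.client("s3")

# Upload a local export file to the bucket; all three values are placeholders.
s3.upload_file(
    Filename="/tmp/my_table.csv",   # local file to upload
    Bucket="my-export-bucket",      # destination bucket
    Key="exports/my_table.csv",     # object key inside the bucket
)
```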
You can now do this with the aws_s3.query_export_to_s3 function within Postgres itself: https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/postgresql-s3-export.html
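For a scheduled export, a minimal sketch driven through psycopg2 could look like this; the connection details, table, bucket, key, and region are placeholders, and it assumes the aws_s3 extension is installed on the instance and an IAM role attached to it allows writing to the bucket:

```python
import psycopg2

# Hypothetical connection details for the RDS Postgres instance.
conn = psycopg2.connect(
    host="mydb.xxxxxxxxxxxx.us-east-1.rds.amazonaws.com",
    dbname="mydb",
    user="export_user",
    password="export_password",
)

EXPORT_SQL = """
SELECT *
FROM aws_s3.query_export_to_s3(
    'SELECT * FROM my_table',              -- query whose result set is exported
    aws_commons.create_s3_uri(
        'my-export-bucket',                -- destination bucket
        'exports/my_table.csv',            -- object key
        'us-east-1'                        -- bucket region
    ),
    'format csv, header'                   -- standard COPY options
);
"""

with conn:
    with conn.cursor() as cur:
        # Requires: CREATE EXTENSION IF NOT EXISTS aws_s3 CASCADE;
        # and an IAM role on the instance with s3:PutObject on the bucket.
        cur.execute(EXPORT_SQL)
        # The function reports (rows_uploaded, files_uploaded, bytes_uploaded).
        print(cur.fetchone())
conn.close()
```

Putting that script on a weekly cron or EventBridge schedule gives you the same automation without Data Pipeline at all.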
You can define a CopyActivity in the Data Pipeline interface to extract data from a Postgres RDS instance into S3.
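If you prefer to define that copy activity in code rather than the console, here is a rough boto3 sketch of what the pipeline definition could look like. Treat it as an outline under assumptions: the object and field names (CopyActivity, SqlDataNode, S3DataNode, Ec2Resource, Schedule, JdbcDatabase) come from the Data Pipeline object reference, but the connection string, driver jar location, IAM roles, bucket, and table names are all placeholders, and RDS Postgres typically has to be reached as a JdbcDatabase with a driver jar you stage in S3 yourself:

```python
import boto3

dp = boto3.client("datapipeline", region_name="us-east-1")


def pipeline_object(obj_id, name, fields):
    """Convert a plain dict of fields into the put_pipeline_definition wire format."""
    formatted = []
    for key, value in fields.items():
        if isinstance(value, dict) and "ref" in value:
            formatted.append({"key": key, "refValue": value["ref"]})
        else:
            formatted.append({"key": key, "stringValue": value})
    return {"id": obj_id, "name": name, "fields": formatted}


objects = [
    # Defaults inherited by every object, including the weekly schedule.
    pipeline_object("Default", "Default", {
        "scheduleType": "cron",
        "schedule": {"ref": "WeeklySchedule"},
        "failureAndRerunMode": "CASCADE",
        "role": "DataPipelineDefaultRole",
        "resourceRole": "DataPipelineDefaultResourceRole",
        "pipelineLogUri": "s3://my-export-bucket/logs/",
    }),
    pipeline_object("WeeklySchedule", "WeeklySchedule", {
        "type": "Schedule",
        "period": "7 days",  # run once a week
        "startAt": "FIRST_ACTIVATION_DATE_TIME",
    }),
    # Postgres is reached over JDBC; the driver jar has to be staged in S3.
    pipeline_object("PostgresDatabase", "PostgresDatabase", {
        "type": "JdbcDatabase",
        "connectionString": "jdbc:postgresql://mydb.xxxxxxxxxxxx.us-east-1.rds.amazonaws.com:5432/mydb",
        "jdbcDriverClass": "org.postgresql.Driver",
        "jdbcDriverJarUri": "s3://my-export-bucket/drivers/postgresql-42.7.3.jar",
        "username": "export_user",
        "*password": "export_password",
    }),
    pipeline_object("SourceTable", "SourceTable", {
        "type": "SqlDataNode",
        "database": {"ref": "PostgresDatabase"},
        "table": "my_table",
        "selectQuery": "SELECT * FROM my_table",
    }),
    pipeline_object("ExportFolder", "ExportFolder", {
        "type": "S3DataNode",
        "directoryPath": "s3://my-export-bucket/exports/#{format(@scheduledStartTime, 'YYYY-MM-dd')}/",
    }),
    pipeline_object("ExportResource", "ExportResource", {
        "type": "Ec2Resource",
        "instanceType": "t2.micro",
        "terminateAfter": "2 Hours",
    }),
    # The actual copy step: SqlDataNode in, S3DataNode out, run on the EC2 resource.
    pipeline_object("CopyTableToS3", "CopyTableToS3", {
        "type": "CopyActivity",
        "input": {"ref": "SourceTable"},
        "output": {"ref": "ExportFolder"},
        "runsOn": {"ref": "ExportResource"},
    }),
]

# Create the pipeline shell, attach the definition, and activate it.
pipeline_id = dp.create_pipeline(
    name="weekly-postgres-export", uniqueId="weekly-postgres-export"
)["pipelineId"]
dp.put_pipeline_definition(pipelineId=pipeline_id, pipelineObjects=objects)
dp.activate_pipeline(pipelineId=pipeline_id)
```

Activating the pipeline puts it on the weekly schedule; building the same copy activity by hand in the console produces roughly this set of objects.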
Another option is to use an external tool like Alooma. Alooma can replicate tables from a PostgreSQL database hosted on Amazon RDS to Amazon S3 (https://www.alooma.com/integrations/postgresql/s3). The process can be automated, and you can run it once a week.