
How to import data files from S3 into PostgreSQL on RDS

I am very new to AWS and PostgreSQL.

  1. I have created a PostgreSQL database (using RDS on AWS)
  2. I have uploaded several documents to multiple S3 buckets
  3. I have an EC2 instance (Amazon Linux 64 bit) running

I tried to use Data Pipeline, but no template seems to be available for Postgres. I also can't figure out how to connect to my RDS instance and import/export data from Postgres.

Since no Data Pipeline template is available, I assumed I could use the EC2 instance to fetch the files from my S3 bucket and import them into Postgres. If that is possible, I have no idea how to do it. Please advise.

user3044239 asked Nov 28 '13 03:11



2 Answers

A direct S3 -> RDS load is now possible for Aurora PostgreSQL and RDS PostgreSQL >= 11.1 through the aws_s3 extension.

  • Amazon Aurora with PostgreSQL Compatibility Supports Data Import from Amazon S3
  • Amazon RDS for PostgreSQL Now Supports Data Import from Amazon S3

The parameters are similar to those of the PostgreSQL COPY command:

psql=> SELECT aws_s3.table_import_from_s3(
         'table_name', '', '(format csv)',
         'BUCKET_NAME', 'path/to/object', 'us-east-2'
       );

Be warned that this feature is not available on older engine versions.
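
As far as I understand the AWS documentation, the call above only works once the extension has been created in the target database and the instance has an IAM role that allows reading the bucket. Here is a minimal sketch of the SQL side, wrapped in a shell function so it can be piped into psql. The function name build_import_sql is made up for illustration, and the arguments are placeholders.

```shell
#!/usr/bin/env bash
# Sketch only: prints the SQL for a one-off aws_s3 import.
# build_import_sql is a hypothetical helper, not part of any AWS tooling.
# Arguments are interpolated as-is, so they must not contain single quotes.

build_import_sql() {
  local table="$1" bucket="$2" key="$3" region="$4"
  cat <<SQL
CREATE EXTENSION IF NOT EXISTS aws_s3 CASCADE;
SELECT aws_s3.table_import_from_s3(
  '${table}', '', '(format csv)',
  '${bucket}', '${key}', '${region}'
);
SQL
}
```

It could then be used as: build_import_sql table_name BUCKET_NAME path/to/object us-east-2 | psql -h your_rds.amazonaws.com -U username -d dbname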

quiver answered Sep 24 '22 13:09


I wish AWS would extend the COPY command in RDS PostgreSQL as they did in Redshift. For now they haven't, so we have to do it ourselves.

  1. Install awscli on your EC2 box (it might already be installed by default)
  2. Configure awscli with your credentials
  3. Use the aws s3 sync or aws s3 cp commands to download from S3 to a local directory
  4. Use psql's \copy meta-command to load the files into your RDS database (the backslash form is required because the files live on the client, not on the database server)

Example:

aws s3 cp s3://bucket/file.csv /mydirectory/file.csv
psql -h your_rds.amazonaws.com -U username -d dbname -c "\copy table_name FROM '/mydirectory/file.csv' CSV HEADER"
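
Since the question mentions several files, the same steps can be sketched as a small script that syncs a bucket prefix and loads every CSV it finds into one table. The function names copy_command and load_from_s3 are made up for illustration. The sketch assumes the connection settings come from the standard PGHOST, PGUSER and PGDATABASE environment variables, that all files share the table's column layout, and that file names contain no single quotes.

```shell
#!/usr/bin/env bash
# Sketch only: download CSV files from S3 and \copy each one into a table.
# copy_command and load_from_s3 are hypothetical helper names.

# Print the psql meta-command that loads one client-side CSV file.
copy_command() {
  local table="$1" file="$2"
  printf '%s\n' "\\copy ${table} FROM '${file}' CSV HEADER"
}

# Sync only the *.csv objects under an S3 URI into a local directory,
# then load each downloaded file with psql.
load_from_s3() {
  local s3_uri="$1" dir="$2" table="$3" file
  aws s3 sync "$s3_uri" "$dir" --exclude '*' --include '*.csv' || return 1
  for file in "$dir"/*.csv; do
    [ -e "$file" ] || continue   # the glob matched nothing
    psql -c "$(copy_command "$table" "$file")" || return 1
  done
}
```

For example: PGHOST=your_rds.amazonaws.com PGUSER=username PGDATABASE=dbname load_from_s3 s3://bucket/ /mydirectory table_name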

jcz answered Sep 22 '22 13:09