
Copy DynamoDB table to another AWS account without S3

I would like to copy all the DynamoDB tables to another AWS account without using S3 to stage the data. I have seen solutions that copy a table with Data Pipeline, but they all use S3 to hold the data. I would like to skip the S3 step because the tables contain a large amount of data, so the S3 write and read steps would take time. I need to copy the tables directly from one account to the other.

Asked by Akhil N on Apr 25 '17

4 Answers

If you don't mind using Python and adding the boto3 library (sudo python -m pip install boto3), then I'd do it like this (I assume you know how to fill in the keys, regions and table names in the code):

import boto3

# Client for the source account/table.
dynamoclient = boto3.client('dynamodb', region_name='eu-west-1',
    aws_access_key_id='ACCESS_KEY_SOURCE',
    aws_secret_access_key='SECRET_KEY_SOURCE')

# Client for the target account/table.
dynamotargetclient = boto3.client('dynamodb', region_name='us-west-1',
    aws_access_key_id='ACCESS_KEY_TARGET',
    aws_secret_access_key='SECRET_KEY_TARGET')

# Paginate through the whole source table and copy every item across.
dynamopaginator = dynamoclient.get_paginator('scan')
tabname = 'SOURCE_TABLE_NAME'
targettabname = 'TARGET_TABLE_NAME'
dynamoresponse = dynamopaginator.paginate(
    TableName=tabname,
    Select='ALL_ATTRIBUTES',
    ReturnConsumedCapacity='NONE',
    ConsistentRead=True
)
for page in dynamoresponse:
    for item in page['Items']:
        dynamotargetclient.put_item(
            TableName=targettabname,
            Item=item
        )
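On a big table the one-at-a-time put_item calls above become the slow part. Here is a minimal, hedged variation of the same write loop that groups items into BatchWriteItem requests of up to 25 items; it reuses the dynamoresponse, dynamotargetclient and targettabname names from the code above and retries anything DynamoDB reports back as unprocessed:

for page in dynamoresponse:
    items = page['Items']
    # BatchWriteItem accepts at most 25 items per request.
    for i in range(0, len(items), 25):
        batch = {
            targettabname: [{'PutRequest': {'Item': item}} for item in items[i:i + 25]]
        }
        # Throttled items come back as UnprocessedItems; retry until the dict is empty
        # (in production you would add a backoff between retries).
        while batch:
            resp = dynamotargetclient.batch_write_item(RequestItems=batch)
            batch = resp.get('UnprocessedItems')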
Answered by Adam Owczarczyk


Try this Node.js module:

npm i copy-dynamodb-table 
Answered by Ezzat


Simple backup and restore for Amazon DynamoDB using boto

https://github.com/bchew/dynamodump

which can do the following (typical invocations are sketched after the list):

  • Single table backup/restore
  • Multiple table backup/restore
  • Multiple table backup/restore but between different environments (e.g. production-* tables to development-* tables)
  • Backup all tables and restore only data (will not delete and recreate schema)
  • Dump all table schemas and create the schemas (e.g. creating blank tables in a different AWS account)
  • Backup all tables based on AWS tag key=value
  • Backup all tables based on AWS tag, compress and store in specified S3 bucket.
  • Restore from S3 bucket to specified destination table
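For the cross-account case, a typical pair of invocations backs a table up to the local dump directory using source-account credentials and then restores it using destination-account credentials. The profile and table names below are placeholders, and the flag spellings are from memory of the project's README, so verify them with python dynamodump.py --help:

python dynamodump.py -m backup  -r eu-west-1 -p source_profile      -s my-table
python dynamodump.py -m restore -r us-west-1 -p destination_profile -s my-table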
Answered by RNA


Reading from and writing to S3 is not going to be your bottleneck.

While scanning from DynamoDB will be very fast, writing the items to the destination table will be slow. You can only write up to 1000 items per second per partition, so I wouldn't worry about the intermediate S3 storage.

However, Data Pipeline is not the most efficient way of copying one table to another either.

If you need speedy transfers, your best bet is to implement your own solution. Provision the destination table for the transfer throughput you want (but be careful about undesired partition splits), then write a parallel scan using multiple threads that also writes to the destination table.
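A minimal Python sketch of that approach with boto3, using the Scan API's Segment/TotalSegments parameters to split the table across threads; the regions, table names and segment count are placeholders for your own setup, and in practice you would combine this with batched writes like the ones shown under the first answer:

import boto3
from concurrent.futures import ThreadPoolExecutor

TOTAL_SEGMENTS = 8  # pick something close to the parallelism your write capacity can absorb

# Configure each client with the credentials of its own account.
source = boto3.client('dynamodb', region_name='eu-west-1')
target = boto3.client('dynamodb', region_name='us-west-1')

def copy_segment(segment):
    """Scan one segment of the source table and write its items to the target table."""
    paginator = source.get_paginator('scan')
    pages = paginator.paginate(
        TableName='SOURCE_TABLE_NAME',
        Segment=segment,
        TotalSegments=TOTAL_SEGMENTS,
        ConsistentRead=True
    )
    for page in pages:
        for item in page['Items']:
            target.put_item(TableName='TARGET_TABLE_NAME', Item=item)

with ThreadPoolExecutor(max_workers=TOTAL_SEGMENTS) as pool:
    # list(...) drains the map so any exception raised in a worker is re-raised here.
    list(pool.map(copy_segment, range(TOTAL_SEGMENTS)))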

There is an open-source implementation in Java in the AWS Labs repository that you can use as a starting point:

https://github.com/awslabs/dynamodb-cross-region-library

Answered by Mike Dinescu