I would like to copy all my DynamoDB tables to another AWS account without using S3 to stage the data. I have seen solutions that copy a table with Data Pipeline, but they all go through S3. I want to skip the S3 step because the tables contain a large amount of data, so the S3 write and S3 read steps would take time. I need to copy the tables directly from one account to the other.
If you don't mind using Python and adding the boto3 library (sudo python -m pip install boto3), then I'd do it like this (I assume you know how to fill in the keys, regions and table names):

    import boto3

    dynamoclient = boto3.client('dynamodb', region_name='eu-west-1',
        aws_access_key_id='ACCESS_KEY_SOURCE',
        aws_secret_access_key='SECRET_KEY_SOURCE')

    dynamotargetclient = boto3.client('dynamodb', region_name='us-west-1',
        aws_access_key_id='ACCESS_KEY_TARGET',
        aws_secret_access_key='SECRET_KEY_TARGET')

    dynamopaginator = dynamoclient.get_paginator('scan')
    tabname = 'SOURCE_TABLE_NAME'
    targettabname = 'TARGET_TABLE_NAME'

    dynamoresponse = dynamopaginator.paginate(
        TableName=tabname,
        Select='ALL_ATTRIBUTES',
        ReturnConsumedCapacity='NONE',
        ConsistentRead=True
    )

    for page in dynamoresponse:
        for item in page['Items']:
            dynamotargetclient.put_item(
                TableName=targettabname,
                Item=item
            )
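Since BatchWriteItem accepts up to 25 items per call, a variant of the loop above that batches the writes cuts the number of network round trips considerably. This is only a sketch: it assumes the same two boto3 clients as above (passed in as arguments), the `chunk` helper and function names are mine, and unprocessed items are retried as DynamoDB recommends.

```python
BATCH_SIZE = 25  # hard limit for a single BatchWriteItem call


def chunk(items, size=BATCH_SIZE):
    """Split a list into sublists of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def copy_table(source_client, target_client, source_table, target_table):
    """Scan `source_table` and batch-write every item into `target_table`."""
    paginator = source_client.get_paginator('scan')
    pages = paginator.paginate(TableName=source_table,
                               Select='ALL_ATTRIBUTES',
                               ConsistentRead=True)
    for page in pages:
        for batch in chunk(page['Items']):
            request = {target_table: [{'PutRequest': {'Item': item}}
                                      for item in batch]}
            # Retry whatever DynamoDB left unprocessed (e.g. on throttling)
            while request:
                response = target_client.batch_write_item(RequestItems=request)
                request = response.get('UnprocessedItems') or None
```

You would call it with the two clients from the snippet above, e.g. `copy_table(dynamoclient, dynamotargetclient, tabname, targettabname)`.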
Try this Node.js module:

npm i copy-dynamodb-table
dynamodump provides simple backup and restore for Amazon DynamoDB using boto:

https://github.com/bchew/dynamodump

It can back up a table's schema and data to local files and restore them to another table, account or region, so no S3 staging is involved.
The reading and writing to S3 is not going to be your bottleneck.
While scanning from DynamoDB is going to be very fast, writing the items to the destination table is going to be slow: you can only write up to 1,000 items per second to any single partition. So I wouldn't worry about the intermediate S3 storage.
However, Data Pipeline is not the most efficient way of copying one table to another, either.
If you need speedy transfers, your best bet is to implement your own solution. Provision the destination table for the transfer throughput you want (but be careful about undesired partition splits), then write a parallel scan using multiple threads, each of which also writes to the destination table.
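The parallel scan described above can be sketched like this: each worker scans one segment of the source table via Scan's Segment/TotalSegments parameters and writes its items straight to the destination. A minimal sketch, assuming you already have two boto3 DynamoDB clients; the function names, segment count and table names are placeholders, and for brevity it uses put_item rather than batching the writes.

```python
from concurrent.futures import ThreadPoolExecutor

TOTAL_SEGMENTS = 8  # tune to the capacity you provisioned on both tables


def copy_segment(source, target, src_table, dst_table, segment):
    """Scan one segment of the source table and write its items across."""
    paginator = source.get_paginator('scan')
    pages = paginator.paginate(TableName=src_table,
                               Segment=segment,
                               TotalSegments=TOTAL_SEGMENTS)
    copied = 0
    for page in pages:
        for item in page['Items']:
            target.put_item(TableName=dst_table, Item=item)
            copied += 1
    return copied


def parallel_copy(source, target, src_table, dst_table):
    """Run one worker thread per scan segment; return total items copied."""
    with ThreadPoolExecutor(max_workers=TOTAL_SEGMENTS) as pool:
        counts = list(pool.map(
            lambda seg: copy_segment(source, target, src_table, dst_table, seg),
            range(TOTAL_SEGMENTS)))
    return sum(counts)
```

Raising TOTAL_SEGMENTS increases parallelism, but only helps up to the throughput the destination table (and its partitions) can absorb.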
There is an open source implementation in Java that you can use as a starting point in the AWS labs repository.
https://github.com/awslabs/dynamodb-cross-region-library