How to move data from Glue to DynamoDB

We are designing a big data solution for one of our dashboard applications and are seriously considering Glue for our initial ETL. Glue currently supports JDBC and S3 as targets, but our downstream services and components will work better with DynamoDB. We are wondering what the best approach is for eventually moving the records from Glue to DynamoDB.

Should we write to S3 first and then run Lambdas to insert the data into DynamoDB? Is that the best practice? Or should we use a third-party JDBC wrapper for DynamoDB and have Glue write to DynamoDB directly (not sure this is possible, and it sounds a bit scary)? Or should we do something else?

Any help is greatly appreciated. Thanks!

Robby asked Mar 02 '18 05:03


People also ask

Can AWS Glue connect to DynamoDB?

You can now crawl your Amazon DynamoDB tables, extract associated metadata, and add it to the AWS Glue Data Catalog.

Can Glue write to DynamoDB?

AWS Glue supports writing data into another AWS account's DynamoDB table.


2 Answers

You can add the following lines to your Glue ETL script:

    glueContext.write_dynamic_frame.from_options(
        frame=DynamicFrame.fromDF(df, glueContext, "final_df"),
        connection_type="dynamodb",
        connection_options={"tableName": "pceg_ae_test"},
    )

Here df should be a Spark DataFrame; DynamicFrame.fromDF converts it to the DynamicFrame that the writer expects.
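
If the same write is needed in several jobs, the call above could be wrapped in a small helper. This is only a sketch: `write_frame_to_dynamodb` and `write_percent` are hypothetical names, and the option keys `dynamodb.output.tableName` and `dynamodb.throughput.write.percent` are the ones listed in the AWS Glue documentation for the DynamoDB sink, which differ from the `tableName` key used in the answer. Check which key your Glue version accepts before relying on it.

```python
def write_frame_to_dynamodb(glue_context, dynamic_frame, table_name, write_percent="0.5"):
    """Write a Glue DynamicFrame to a DynamoDB table through the Glue DynamoDB sink."""
    connection_options = {
        # Target table; key name as listed in the AWS Glue docs (an assumption here,
        # the answer above uses "tableName" instead)
        "dynamodb.output.tableName": table_name,
        # Fraction of the table's write capacity the job is allowed to consume
        "dynamodb.throughput.write.percent": write_percent,
    }
    return glue_context.write_dynamic_frame.from_options(
        frame=dynamic_frame,
        connection_type="dynamodb",
        connection_options=connection_options,
    )
```

Inside a job it might be called as `write_frame_to_dynamodb(glueContext, DynamicFrame.fromDF(df, glueContext, "final_df"), "pceg_ae_test")`.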

Bishal Regmi answered Oct 26 '22 22:10


I was able to write using boto3. It's definitely not the best way to load the data, but it works. :)

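    import boto3

    # Connect to the target table
    dynamodb = boto3.resource('dynamodb', region_name='us-east-1')
    table = dynamodb.Table('BULK_DELIVERY')

    print("Start testing")

    # collect() pulls every row to the driver, so this only suits small DataFrames
    for row in df1.rdd.collect():
        var1 = row.sourceCid
        print(var1)
        table.put_item(Item={'SOURCECID': "{}".format(var1)})

    print("End testing")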
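
Calling put_item once per row gets slow as the data grows, so a batched variant may be worth trying. The sketch below assumes a boto3 Table resource and uses its `batch_writer()` context manager, which buffers puts and sends them in batches. `write_rows`, `source_field`, and `key_attr` are hypothetical names for illustration; the column and attribute names come from the answer above, and `overwrite_by_pkeys` assumes `SOURCECID` is the table's only key attribute.

```python
def write_rows(table, rows, source_field="sourceCid", key_attr="SOURCECID"):
    """Write one item per row through the table's batch writer; return the row count."""
    written = 0
    # batch_writer() buffers put_item calls and flushes them in batches.
    # overwrite_by_pkeys de-duplicates buffered items that share the same key,
    # since a single batch request cannot contain duplicate keys.
    with table.batch_writer(overwrite_by_pkeys=[key_attr]) as batch:
        for row in rows:
            # Works for Spark Row objects and plain dicts alike
            value = row[source_field]
            batch.put_item(Item={key_attr: str(value)})
            written += 1
    return written
```

In the Glue job this might be called as `write_rows(table, df1.rdd.toLocalIterator())`, with `table` created as in the answer. `toLocalIterator()` streams rows to the driver one partition at a time instead of collecting them all at once.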

Vinay Agarwal answered Oct 26 '22 23:10