
R language support for AWS DynamoDB [duplicate]

This is a follow up / updated question to this:

AWS dynamodb support for "R" programming language

I am looking for examples or documentation on how to read in a table from DynamoDB into R.

This question pointed me in the right direction:

R + httr and EC2 api authentication issues

(answered by the great @hadley himself!).

It's OK if I have to use httr and then parse a JSON response, but I can't even figure out how to format the POST request.

Thanks!

JayCo asked Sep 21 '13 18:09



1 Answer

Repeating my answer from here since someone sent me this page asking a similar question.

Here's a simplified version of what I'm using to read data from DynamoDB into R. It relies on the fact that R and Python can exchange data, and a Python library called boto3 makes it really easy to get data out of DynamoDB. It would be neat if this were all an R package, but I won't complain given the 25GB of free storage you can get from Amazon.

First, you need a Python script, named query_dynamo.py, like this:

import boto3

# Connect to DynamoDB. Replace the placeholders with your own credentials.
dynamodb = boto3.resource('dynamodb',
                          aws_access_key_id='<GET ME FROM AWS>',
                          aws_secret_access_key='<ALSO GET ME FROM AWS CONSOLE>',
                          region_name='us-east-1')

table = dynamodb.Table('comment')  # Your DynamoDB table name here

# A single scan returns at most one page of results.
response = table.scan()
data = response['Items']

# Keep scanning from the last evaluated key until no pages remain.
while 'LastEvaluatedKey' in response:
    response = table.scan(ExclusiveStartKey=response['LastEvaluatedKey'])
    data.extend(response['Items'])
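
As a rough sketch of how the pagination loop above could be factored into a reusable helper, here is one possible variant. The names scan_all, to_plain, and FakeTable are hypothetical and not part of the original answer or of boto3; FakeTable only imitates the 'Items' / 'LastEvaluatedKey' shape of a scan response so the loop can be tried without AWS access. The Decimal conversion is an assumption about what you may need: boto3 returns DynamoDB numbers as decimal.Decimal, which Python's json module does not serialize by default, so converting them first may help if the hand-off to R goes through JSON.

```python
from decimal import Decimal


def scan_all(table, **scan_kwargs):
    """Collect every item from a table by following LastEvaluatedKey."""
    response = table.scan(**scan_kwargs)
    items = list(response["Items"])
    while "LastEvaluatedKey" in response:
        response = table.scan(
            ExclusiveStartKey=response["LastEvaluatedKey"], **scan_kwargs
        )
        items.extend(response["Items"])
    return items


def to_plain(value):
    """Recursively replace Decimal values with int or float."""
    if isinstance(value, Decimal):
        return int(value) if value == value.to_integral_value() else float(value)
    if isinstance(value, dict):
        return {k: to_plain(v) for k, v in value.items()}
    if isinstance(value, (list, tuple, set)):
        return [to_plain(v) for v in value]
    return value


class FakeTable:
    """Hypothetical stand-in for a boto3 Table that serves fixed pages."""

    def __init__(self, pages):
        self._pages = pages

    def scan(self, ExclusiveStartKey=None, **_):
        index = 0 if ExclusiveStartKey is None else ExclusiveStartKey["page"]
        page = {"Items": self._pages[index]}
        if index + 1 < len(self._pages):
            # Signal that another page exists, as DynamoDB does.
            page["LastEvaluatedKey"] = {"page": index + 1}
        return page


# Two pages of one item each; with a real table, pass dynamodb.Table(...) instead.
fake = FakeTable([[{"id": Decimal("1")}], [{"id": Decimal("2.5")}]])
data = to_plain(scan_all(fake))
```

If you adopt something like this in query_dynamo.py, keep the result in a variable named data so the R side can pick it up unchanged.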

Then, in R, run the following. If you're on Windows, you may want to try rPython-win instead. I did all this on Ubuntu Linux 16.04 LTS.

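library(rPython)

# Run the Python script, then pull its 'data' variable into R
python.load("query_dynamo.py")
temp = as.data.frame(python.get('data'))
# Transpose so that each DynamoDB item becomes a row
df = as.data.frame(t(temp))
rm(temp)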

Now you'll have a data frame called "df" with the contents of whatever you put in DynamoDB.

CalZ answered Oct 04 '22 07:10