
AWS DynamoDB support for the "R" programming language

Has anyone been able to successfully CRUD records in Amazon DynamoDB using the R programming language? I found this list of supported language bindings:

http://aws.typepad.com/aws/2012/04/amazon-dynamodb-libraries-mappers-and-mock-implementations-galore.html

Alas, no R. We are considering using DynamoDB for a large-scale data project, but our main analyst is most comfortable in R, so we are exploring our options.

feathj asked Jan 08 '13 22:01


2 Answers

Here's a simplified version of what I'm using for reading data from DynamoDB into R. It relies on the fact that R and Python can exchange data, and a Python library called boto3 makes it really easy to get data from DynamoDB. It would be neat if this were all an R package, but I won't complain given the 25 GB of free storage you can get from Amazon.

First, you need a Python script like this, named query_dynamo.py:

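import boto3

# Connect to DynamoDB. Replace the placeholders with your own credentials
# and use the region your table lives in.
dynamodb = boto3.resource('dynamodb',
                          aws_access_key_id='<GET ME FROM AWS>',
                          aws_secret_access_key='<ALSO GET ME FROM AWS CONSOLE>',
                          region_name='us-east-1')

table = dynamodb.Table('comment')  # Your table name in DynamoDB here

# A single scan returns at most one page of results.
response = table.scan()
data = response['Items']

# Keep scanning from where the last page ended until there are no more pages.
while 'LastEvaluatedKey' in response:
    response = table.scan(ExclusiveStartKey=response['LastEvaluatedKey'])
    data.extend(response['Items'])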

Then, in R, run the following. I did all this on Ubuntu Linux 16.04 LTS; if you're on Windows, you may want to try rPython-win instead.

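library(rPython)

# Run the Python script, then pull its 'data' variable into R.
python.load("query_dynamo.py")
temp = as.data.frame(python.get('data'))
# Transpose to get the final data frame, then drop the intermediate copy.
df = as.data.frame(t(temp))
rm(temp)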

Now you'll have a dataframe called "df" with the contents of whatever you put in DynamoDB.
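
One caveat, offered as a sketch rather than a tested fix: boto3 returns DynamoDB numbers as decimal.Decimal and set attributes as Python sets, and rPython, as far as I know, passes data to R as JSON, which cannot serialize either type. Assuming that is the failure you hit, a small helper like the hypothetical to_plain below could be added to query_dynamo.py and applied to the scanned items before R reads data. The sample items here are made up for illustration; they are not output from a real table.

```python
import json
from decimal import Decimal


def to_plain(value):
    """Recursively replace types that json.dumps cannot handle."""
    if isinstance(value, Decimal):
        # Whole numbers become int; everything else becomes float,
        # which may lose precision for very long decimals.
        return int(value) if value == value.to_integral_value() else float(value)
    if isinstance(value, (set, frozenset)):
        # DynamoDB set types arrive as Python sets; sort for a stable order.
        return sorted(to_plain(v) for v in value)
    if isinstance(value, dict):
        return {k: to_plain(v) for k, v in value.items()}
    if isinstance(value, list):
        return [to_plain(v) for v in value]
    return value


# Hypothetical items shaped like what table.scan() returns.
items = [{"id": Decimal("1"), "score": Decimal("4.5"), "tags": {"a", "b"}}]
data = [to_plain(item) for item in items]

# json.dumps now succeeds, which is the property the R side depends on.
print(json.dumps(data))
```

In the real script, the last step would be data = [to_plain(item) for item in data] after the scan loop finishes.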

CalZ answered Oct 20 '22 16:10


For anyone who comes across this, there is now the Paws package, an AWS SDK for R. You can install it with install.packages("paws").

Disclaimer: I am a maintainer of the Paws package.

For example:

# Create a client object.
svc <- paws::dynamodb()

# This example retrieves an item from the Music table. The table has a
# partition key and a sort key (Artist and SongTitle), so you must specify
# both of these attributes.
item <- svc$get_item(
  Key = list(
    Artist = list(
      S = "Acme Band"
    ),
    SongTitle = list(
      S = "Happy Day"
    )
  ),
  TableName = "Music"
)

# This example adds a new item to the Music table.
svc$put_item(
  Item = list(
    AlbumTitle = list(
      S = "Somewhat Famous"
    ),
    Artist = list(
      S = "No One You Know"
    ),
    SongTitle = list(
      S = "Call Me Today"
    )
  ),
  ReturnConsumedCapacity = "TOTAL",
  TableName = "Music"
)
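
The nested list(S = "...") structures in the calls above are DynamoDB's typed attribute-value format, where every value is wrapped with a type tag such as S (string) or N (number, sent as a string). To make that shape easier to see, here is a minimal Python sketch that builds the same structure from plain values. The helper name to_attribute_value is hypothetical and is not part of Paws or boto3; it covers only a few of the available types.

```python
def to_attribute_value(value):
    """Wrap a plain Python value in DynamoDB's typed attribute-value form."""
    if value is None:
        return {"NULL": True}
    if isinstance(value, bool):  # check before int: bool is a subclass of int
        return {"BOOL": value}
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, (int, float)):
        return {"N": str(value)}  # numbers travel as strings
    if isinstance(value, list):
        return {"L": [to_attribute_value(v) for v in value]}
    if isinstance(value, dict):
        return {"M": {k: to_attribute_value(v) for k, v in value.items()}}
    raise TypeError(f"unsupported type: {type(value).__name__}")


# The same key used in the get_item example, built from plain strings.
plain_key = {"Artist": "Acme Band", "SongTitle": "Happy Day"}
key = {name: to_attribute_value(v) for name, v in plain_key.items()}
print(key)
```

The resulting dictionary has the same nesting as the Key argument in the R call: one entry per attribute, each holding a single type tag and its value.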

David Kretch answered Oct 20 '22 17:10