
Converting a Pandas DataFrame to R dataframe using Rpy2

I have a pandas DataFrame that I convert to an R dataframe using the convert_to_r_dataframe method from pandas.rpy.common. It is set up like this:

self.event = pd.read_csv('C://' + self.event_var.get() + '.csv')
final_products = pd.DataFrame({'Product': self.event.Product, 'Size': self.event.Size, 'Order': self.event.Order})
r.assign('final_products', com.convert_to_r_dataframe(final_products))
r.assign('EventName', self.event_var.get())
r.assign('EventTime', self.eventtime_var.get())
r.source('application.r')

where self.event_var.get() retrieves a user input in the GUI (I am creating an application using Tkinter). Product, Size, and Order are columns from the CSV file.

Since Rpy2 sets up the R environment within Python, I would expect the final_products R dataframe to be visible to the R environment. Unfortunately, while the R script does run, it does not give the correct results: the script creates graphs, but they are empty when the program terminates. The EventName and EventTime variables, however, do work. Is there something I am missing here? Any ideas as to why the R dataframe assigned from Python is not being interpreted correctly by the R environment?
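
One way to rule out the Python side before the frame reaches R, as a minimal sketch only: the helper name `select_columns` and the sample data are hypothetical, the column names are the ones from the code above, and the f-string assumes Python 3 rather than the Python 2.7 shown in the traceback. It selects the three columns explicitly and fails loudly if a column is missing or the CSV has no rows.

```python
import pandas as pd

# Column names taken from the question's code.
REQUIRED_COLUMNS = ["Product", "Size", "Order"]


def select_columns(event: pd.DataFrame) -> pd.DataFrame:
    """Return a copy holding only the required columns (hypothetical helper)."""
    missing = [c for c in REQUIRED_COLUMNS if c not in event.columns]
    if missing:
        raise KeyError(f"CSV is missing columns: {missing}")
    if event.empty:
        raise ValueError("CSV contains no rows")
    return event[REQUIRED_COLUMNS].copy()


# Made-up sample data standing in for the CSV contents.
sample = pd.DataFrame({
    "Product": ["A", "B"],
    "Size": [1, 2],
    "Order": [10, 20],
    "Extra": ["x", "y"],
})
final_products = select_columns(sample)

# Inspect the dtypes that will be handed to the R conversion.
print(final_products.dtypes)
```

If this check passes and the graphs are still empty, the problem is more likely in the conversion step or in the R script itself.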

The error obtained:

Exception in Tkinter callback
Traceback (most recent call last):
File "C:\Python27\lib\lib-tk\Tkinter.py", line 1470, in __call__
    return self.func(*args)
File "G:\Development\workspace\GUI\GUI.py", line 126, in evaluate
    r.source('application.r')
File "C:\Python27\lib\site-packages\rpy2\robjects\functions.py", line 86, in __call__
    return super(SignatureTranslatedFunction, self).__call__(*args, **kwargs)
File "C:\Python27\lib\site-packages\rpy2\robjects\functions.py", line 35, in __call__
    res = super(Function, self).__call__(*new_args, **new_kwargs)
asked Mar 11 '14 12:03 by KidSudi


1 Answer

Great answer, @Mittenchops. Since convert_to_r_dataframe is deprecated, here is the above example updated to use the rpy2 interface:

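from datetime import datetime

import numpy as np
import pandas as pd
from rpy2.robjects import pandas2ri

# Enable automatic pandas <-> R conversion globally.
pandas2ri.activate()

# Build a small example frame: n timestamps and n random values in [-1, 1).
n = 10
df = pd.DataFrame({
    "timestamp": [datetime.now() for _ in range(n)],
    "value": np.random.uniform(-1, 1, n)
})

# rpy2 2.x API; in rpy2 3.x this function was renamed to py2rpy.
r_dataframe = pandas2ri.py2ri(df)
print(r_dataframe)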
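
For newer rpy2 releases, where `py2ri` was renamed `py2rpy`, the same conversion might look like the sketch below. This is not a definitive implementation: it assumes rpy2 3.x with a working R installation, and the helper names `to_r_dataframe` and `assign_in_r` are hypothetical. It scopes the pandas conversion rules to a `localconverter` block instead of calling `pandas2ri.activate()` globally.

```python
import pandas as pd


def to_r_dataframe(df: pd.DataFrame):
    """Convert a pandas DataFrame to an R data.frame (assumes rpy2 3.x and R)."""
    # Imported lazily so the module still imports where R is unavailable.
    import rpy2.robjects as ro
    from rpy2.robjects import pandas2ri
    from rpy2.robjects.conversion import localconverter

    # Combine the default rules with the pandas-specific ones.
    converter = ro.default_converter + pandas2ri.converter

    # Make the combined rules active only inside this block, so the
    # per-column conversions triggered by the frame also use them.
    with localconverter(converter):
        return converter.py2rpy(df)


def assign_in_r(name: str, df: pd.DataFrame) -> None:
    """Bind the converted frame to `name` in R's global environment."""
    import rpy2.robjects as ro

    ro.globalenv[name] = to_r_dataframe(df)
```

Calling `assign_in_r("final_products", final_products)` before `r.source('application.r')` should make the frame visible to the script under that name, in the same way as the `r.assign` call in the question.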
answered Sep 21 '22 05:09 by Pramit