
Selecting a specific row from an rpy2 DataFrame

Tags:

rpy2

My data frame holds survey data loaded from a .csv file. One of the columns is age, and I want to remove all respondents under 18 years of age. I'll then need to split the age groups (18-24, 25-35, etc.) into their own data frames so that I can compute frequency distributions for each.

The R code is simple enough:

x.sub <- subset(x.df, y > 2)

But I can't figure out how to use the r() function to get my data frame variable from Python into an R statement. It feels as though there ought to be a .subset() method on the rpy2 DataFrame class, but if it exists, I can't find it.

asked Dec 04 '10 20:12 by forestfanjoe


1 Answer

Using rpy2 2.2.0-dev (this should work the same with 2.1.x):

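from rpy2.robjects.vectors import DataFrame

# Load the survey data into an R data.frame
dataf = DataFrame.from_csvfile("my/file.csv")

# Keep every column (True) for the rows where age >= 18
dataf_subset = dataf.rx(dataf.rx2("age").ro >= 18, True)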

That exact example is not in the documentation (and maybe it should be), but its constituent elements are: extracting elements (.rx and .rx2) and R operators on vectors (.ro).
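For the age groups mentioned in the question, the same two building blocks can probably be combined with the R "&" operator, which is also reachable through .ro. The following is only a sketch: split_by_age, bracket_label and load_and_split are hypothetical helper names, the bounds are treated as inclusive, and the column is assumed to be called "age".

```python
def bracket_label(low, high):
    """Build a label such as '18-24' for an inclusive age bracket."""
    return f"{low}-{high}"


def split_by_age(dataf, brackets, column="age"):
    """Return a dict mapping each bracket label to its own data frame.

    `dataf` is expected to be an rpy2 DataFrame; only its .rx2() and
    .rx() methods and the .ro operator delegator are used here.
    """
    age = dataf.rx2(column)  # the age column as an R vector
    groups = {}
    for low, high in brackets:
        # R-side element-wise test: low <= age & age <= high
        in_bracket = (age.ro >= low).ro & (age.ro <= high)
        # Keep all columns (True) for the rows inside the bracket
        groups[bracket_label(low, high)] = dataf.rx(in_bracket, True)
    return groups


def load_and_split(csv_path, brackets):
    """Hypothetical end-to-end helper: read the CSV, then split it."""
    # Imported here so the helpers above can be reused without R installed
    from rpy2.robjects.vectors import DataFrame

    dataf = DataFrame.from_csvfile(csv_path)
    return split_by_age(dataf, brackets)
```

Calling load_and_split("my/file.csv", [(18, 24), (25, 35)]) should give one data frame per bracket. Respondents under 18 are left out simply because no bracket covers them.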

answered Sep 22 '22 05:09 by lgautier