Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Same function over multiple data frames in R

I am new to R, and this is a very simple question. I've found a lot of similar things to what I want but not exactly it. Basically I have multiple data frames and I simply want to run the same function across all of them. A for-loop could work but I'm not sure how to set it up properly to call data frames. It also seems most prefer the lapply approach with R. I've played with the get function as well to no avail. I apologize if this is a duplicated question. Any help would be greatly appreciated!

Here's my over simplified example: 2 data frames: df1, df2

df1
start stop ID
0     10   x
10    20   y
20    30   z

df2
start stop ID
0     10   a
10    20   b
20    30   c

what I want is a 4th column with the average of start and stop for both dfs

df1
start stop ID  Avg
0     10   x    5 
10    20   y    15
20    30   z    25

I can do this one data frame at a time with:

df1$Avg <- rowMeans(subset(df1, select = c(start, stop)), na.rm = TRUE)

but I want to run it on all of the dataframes.

like image 314
user3272284 Avatar asked Feb 25 '14 01:02

user3272284


1 Answers

Make a list of data frames then use lapply to apply the function to them all.

df.list <- list(df1,df2,...)
res <- lapply(df.list, function(x) rowMeans(subset(x, select = c(start, stop)), na.rm = TRUE))
# to keep the original data.frame also
res <- lapply(df.list, function(x) cbind(x,"rowmean"=rowMeans(subset(x, select = c(start, stop)), na.rm = TRUE)))

The lapply will then feed in each data frame as x sequentially.

like image 159
JeremyS Avatar answered Sep 19 '22 18:09

JeremyS