Get and process entire row in ddply in a function

Tags:

r

plyr

It's easy to grab one or more columns in ddply to process, but is there a way to grab the entire current row and pass it on to a function? Or to grab a set of columns determined at runtime?

Let me illustrate:

Given a dataframe like

df = data.frame(a=seq(1,20), b=seq(1,5), c= seq(5,1))
df
    a b c
1   1 1 5
2   2 2 4
3   3 3 3

I could write a function to sum named columns along a row of a data frame like this:

selectiveSummer = function(row,colsToSum) {
   return(sum(row[,colsToSum])) 
}

It works when I call it for a row like this:

> selectiveSummer(df[1,],c('a','c'))
[1] 6

So I'd like to wrap that in an anonymous function and use it in ddply to apply it to every row in the table, something like the example below

f = function(x) { selectiveSummer(x,c('a','c')) }
#this doesn't work!
ddply(df,.(a,b,c), transform, foo=f(row))

I'd like to find a solution where the set of columns to manipulate can be determined at runtime, so if there's some way just to splat that from ddply's args and pass it into a function that takes any number of args, that works too.

Edit: To be clear, the real application driving this isn't a sum, but a sum made for an easier explanation.

asked Nov 26 '25 20:11 by jkebinger


1 Answer

You can only select single rows with ddply if each row can be uniquely identified by one or more variables. If there are identical rows, ddply will iterate over data frames of multiple rows even if you group by all columns (as in ddply(df, names(df), f)).
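One way to guarantee uniqueness is to add a row identifier and group by it. The following is only a sketch, assuming the df and selectiveSummer from the question; the id column and the cols variable are hypothetical names introduced for illustration.

```r
library(plyr)

# Data and helper as given in the question
df <- data.frame(a = seq(1, 20), b = seq(1, 5), c = seq(5, 1))
selectiveSummer <- function(row, colsToSum) {
  sum(row[, colsToSum])
}

# Hypothetical helper column that makes every row unique
df$id <- seq_len(nrow(df))

# Hypothetical variable: column names chosen at runtime
cols <- c("a", "c")

# Grouping by id means each piece passed to the function is a one-row data frame
res <- ddply(df, .(id), function(piece) {
  data.frame(foo = selectiveSummer(piece, cols))
})
```

Here res has one row per id, with the per-row result in the foo column.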

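Why not use apply instead? apply does iterate over individual rows. It passes each row as a plain vector, so the row has to be converted back to a one-row data frame before f is called.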

apply(df, 1, function(x) f(as.data.frame(t(x))))

result:

[1]  6  6  6  6  6 11 11 11 11 11 16 16 16 16 16 21 21 21 21 21
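Note that apply first converts the data frame to a matrix, so a data frame with mixed column types would be coerced to a single type. If that matters, a base R alternative is to index the rows directly. This is a minimal sketch, assuming the df and selectiveSummer from the question; cols is a hypothetical variable holding the column names chosen at runtime.

```r
# Data and helper as given in the question
df <- data.frame(a = seq(1, 20), b = seq(1, 5), c = seq(5, 1))
selectiveSummer <- function(row, colsToSum) {
  sum(row[, colsToSum])
}

# Hypothetical variable: column names chosen at runtime
cols <- c("a", "c")

# Index rows one at a time; drop = FALSE keeps each row a one-row data frame
result <- sapply(
  seq_len(nrow(df)),
  function(i) selectiveSummer(df[i, , drop = FALSE], cols)
)
```

Because each row stays a data frame, the original column types are preserved when the function is called.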
answered Nov 28 '25 16:11 by GaBorgulya