
How can I pass values in ddply based on a column?

Tags: dataframe, r, plyr

I want to pass two sets of values (the PRE rows and the POST rows) to a function, separately for each value of the column Category. Is there a way to do this using ddply from the plyr package?

I want to do something like this:

ddply(idata.frame(data), .(Category), wilcox.test, data[Type=="PRE",], data[Type=="POST",])

wilcox.test is the following function:

Description

Performs one- and two-sample Wilcoxon tests on vectors of data; the latter is also known as ‘Mann-Whitney’ test.

Usage

wilcox.test(x, ...)

Arguments

x   
numeric vector of data values. Non-finite (e.g. infinite or missing) values will be omitted.

y   
an optional numeric vector of data values: as with x non-finite values will be omitted.

.... rest of the arguments snipped ....

I have the following output from dput:

structure(list(Category = c("A", "C", 
"B", "C", "D", "E", 
"C", "A", "F", "B", 
"E", "C", "C", "A", 
"C", "A", "B", "H", 
"I", "A"), Type = c("POST", "POST", 
"POST", "POST", "PRE", "POST", "POST", "PRE", "POST", 
"POST", "POST", "POST", "POST", "PRE", "PRE", "POST", 
"POST", "POST", "POST", "POST"), Value = c(1560638113, 
1283621, 561329742, 2727503, 938032, 4233577690, 0, 4209749646, 
111467236, 174667894, 1071501854, 720499, 2195611, 1117814707, 
1181525, 1493315101, 253416809, 327012982, 538595522, 3023339026
)), .Names = c("Category", "Type", "Value"), row.names = c(21406L, 
123351L, 59875L, 45186L, 126720L, 94153L, 48067L, 159371L, 54303L, 
63318L, 104100L, 58162L, 41945L, 159794L, 57757L, 178622L, 83812L, 
130655L, 30860L, 24513L), class = "data.frame")

Any suggestions?

Legend asked Dec 12 '25 14:12



1 Answer

What I always do is use an anonymous function:

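# Inside the function, x is the subset of rows for one Category.
# wilcox.test needs numeric vectors, so pick out Value for each Type.
ddply(idata.frame(data), .(Category), 
    function(x) wilcox.test(x$Value[x$Type == "PRE"], x$Value[x$Type == "POST"]))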

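wilcox.test returns an object of class "htest" (a list), which ddply cannot combine into a data.frame as-is, so you'll have to extract the pieces you want yourself (for example the p-value). Note also that the test will fail for any Category that has no PRE rows or no POST rows. Alternatively, use dlply to end up with a list of wilcox.test output, one element per Category.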
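As a rough sketch of that tweaking, the anonymous function can be swapped for a small helper that returns a one-row data.frame per Category. The helper name `pre_post_wilcox`, the output column names, and the `toy` data below are made up for illustration, and returning NA for categories that lack one of the two types is just one possible choice:

```r
library(plyr)

# Made-up toy data with the same columns as in the question.
toy <- data.frame(
  Category = c("A", "A", "A", "A", "A", "B", "B", "C", "C", "C"),
  Type     = c("PRE", "PRE", "POST", "POST", "POST",
               "POST", "POST", "PRE", "POST", "POST"),
  Value    = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
  stringsAsFactors = FALSE
)

# Hypothetical helper: run the test on one Category's rows and return
# a one-row data.frame that ddply can stack.
pre_post_wilcox <- function(df) {
  pre  <- df$Value[df$Type == "PRE"]
  post <- df$Value[df$Type == "POST"]

  # wilcox.test stops with an error if either sample is empty,
  # so return NA for categories missing one of the two types.
  if (length(pre) == 0 || length(post) == 0) {
    return(data.frame(n_pre = length(pre), n_post = length(post),
                      statistic = NA_real_, p.value = NA_real_))
  }

  res <- wilcox.test(pre, post)
  data.frame(n_pre = length(pre), n_post = length(post),
             statistic = unname(res$statistic), p.value = res$p.value)
}

# One row per Category: Category, n_pre, n_post, statistic, p.value.
result <- ddply(toy, .(Category), pre_post_wilcox)
print(result)
```

With your own data, replace `toy` with `data`. If you would rather keep the full test objects, the same helper idea should work with dlply, as long as the function returns the wilcox.test result (or NULL for incomplete categories) instead of a data.frame.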

Paul Hiemstra answered Dec 14 '25 05:12
