
R programming: plyr how to count values from a column with ddply [duplicate]

Tags: r, plyr

I would like to summarize the pass/fail status for my data as below. In other words, I would like to tell the number of pass and fail cases for each product/type.

library(ggplot2)
library(plyr)
product=c("p1","p1","p1","p1","p1","p1","p1","p1","p1","p1","p1","p1","p2","p2","p2","p2","p2","p2","p2","p2","p2","p2","p2","p2")
type=c("t1","t1","t1","t1","t1","t1","t2","t2","t2","t2","t2","t2","t1","t1","t1","t1","t1","t1","t2","t2","t2","t2","t2","t2")
skew=c("s1","s1","s1","s2","s2","s2","s1","s1","s1","s2","s2","s2","s1","s1","s1","s2","s2","s2","s1","s1","s1","s2","s2","s2")
color=c("c1","c2","c3","c1","c2","c3","c1","c2","c3","c1","c2","c3","c1","c2","c3","c1","c2","c3","c1","c2","c3","c1","c2","c3")
result=c("pass","pass","fail","pass","pass","pass","fail","pass","fail","pass","fail","pass","fail","pass","fail","pass","pass","pass","pass","fail","fail","pass","pass","fail")
df = data.frame(product, type, skew, color, result)

The following command returns the total number of pass + fail cases, but I want separate columns for pass and fail:

dfSummary <- ddply(df, c("product", "type"), summarise, N=length(result))

Result is:

        product type N
 1      p1      t1   6
 2      p1      t2   6
 3      p2      t1   6
 4      p2      t2   6

The desired result would be:

         product type Pass Fail
 1       p1      t1   5    1
 2       p1      t2   3    3
 3       p2      t1   4    2
 4       p2      t2   3    3

I have attempted something like:

 dfSummary <- ddply(df, c("product", "type"), summarise, Pass=length(df$product[df$result=="pass"]), Fail=length(df$product[df$result=="fail"]) )

but obviously it's wrong, since the results are the grand totals for pass and fail rather than the counts per product/type.

Thanks in advance for your advice ! Regards, Riad.

Riad asked Nov 20 '13 17:11




2 Answers

Try:

dfSummary <- ddply(df, c("product", "type"), summarise, 
                   Pass=sum(result=="pass"), Fail=sum(result=="fail") )

Which gives me the result:

  product type Pass Fail
1      p1   t1    5    1
2      p1   t2    3    3
3      p2   t1    4    2
4      p2   t2    3    3

Explanation:

  1. You give the data set, df, to the ddply function.
  2. ddply splits it on the variables "product" and "type".
    • This results in at most length(unique(product)) * length(unique(type)) pieces (here 2 * 2 = 4), i.e. subsets of df, one for each combination of the two variables that occurs in the data.
  3. To each of the pieces, ddply applies the function that you provide. In this case, you count how many rows have result=="pass" and how many have result=="fail". Inside summarise, result refers to the column of the current piece, whereas df$result in your attempt always refers to the whole data set, which is why you got the grand totals.
  4. Now ddply is left with some results for each piece, namely the variables you split on (product and type) and the results you requested (Pass and Fail).
  5. It combines the results for all of the pieces and returns them as a single data frame.
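
The steps above can be sketched by hand in base R. This is only an illustration of the split-apply-combine idea, not how ddply is implemented internally; the names pieces, summaries and manual are made up for the example, and the data is rebuilt with only the columns needed here.

```r
# Rebuild the relevant columns of the example data from the question.
product <- rep(c("p1", "p2"), each = 12)
type    <- rep(rep(c("t1", "t2"), each = 6), times = 2)
result  <- c("pass", "pass", "fail", "pass", "pass", "pass",
             "fail", "pass", "fail", "pass", "fail", "pass",
             "fail", "pass", "fail", "pass", "pass", "pass",
             "pass", "fail", "fail", "pass", "pass", "fail")
df <- data.frame(product, type, result)

# Step 2: split df into one piece per product/type combination.
pieces <- split(df, list(df$product, df$type), drop = TRUE)

# Steps 3-4: summarise each piece on its own.
# Note the use of piece$result (the subset), not df$result (everything).
summaries <- lapply(pieces, function(piece) {
  data.frame(product = piece$product[1],
             type    = piece$type[1],
             Pass    = sum(piece$result == "pass"),
             Fail    = sum(piece$result == "fail"))
})

# Step 5: combine the per-piece results into a single data frame.
manual <- do.call(rbind, summaries)
manual <- manual[order(manual$product, manual$type), ]
rownames(manual) <- NULL
print(manual)
```

Replacing piece$result with df$result in the sketch reproduces the problem from the question: every row would then show the grand totals (15 pass, 9 fail).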
ialm answered Sep 21 '22 03:09



You could also use reshape2::dcast.

library(reshape2)
dcast(df, product + type ~ result, fun.aggregate = length, value.var = "result")
##   product type fail pass
## 1      p1   t1    1    5
## 2      p1   t2    3    3
## 3      p2   t1    2    4
## 4      p2   t2    3    3
mnel answered Sep 18 '22 03:09
