
How to calculate the mean with conditions? [duplicate]

Below is the script to generate a reproducible dataframe:

id <- c(1:20)
a <- as.numeric(round(runif(20,-40,40),2))
b <- as.numeric(round(a*1.4+60,2))
df <- as.data.frame(cbind(id, a, b))

I would like to calculate the mean of "b" under different conditions on "a". For example, what is the mean of "b" when -40 <= a < 0, and what is the mean of "b" when 0 <= a <= 40?

Thank you very much!

cyrusjan asked Feb 11 '23 22:02


2 Answers

Here's a quick data.table solution:

library(data.table)
setDT(df)[, .(MeanASmall = mean(b[-40 <= a & a < 0]),
              MeanABig = mean(b[0 <= a & a <= 40]))]
#    MeanASmall MeanABig
# 1:   33.96727    89.46

If the range of a is limited to [-40, 40], splitting on a >= 0 is enough, so you could do this quickly with base R too:

sapply(split(df, df$a >= 0), function(x) mean(x$b))
#     FALSE     TRUE 
#  33.96727 89.46000 
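
If more than two ranges are needed, the same split-by-group idea can be sketched with cut() and tapply(). This is only a minimal base R sketch, not part of the original answer: the seed, the bin labels "neg" and "nonneg", and the names bin and means are assumptions made for illustration. With right = FALSE the bins are closed on the left, and include.lowest = TRUE closes the last bin at 40, which matches -40 <= a < 0 and 0 <= a <= 40.

```r
set.seed(42)  # assumed seed, only so this example is repeatable
id <- 1:20
a <- round(runif(20, -40, 40), 2)
b <- round(a * 1.4 + 60, 2)
df <- data.frame(id, a, b)

# Bins are [-40, 0) and [0, 40]; include.lowest = TRUE closes the last bin at 40
bin <- cut(df$a, breaks = c(-40, 0, 40), right = FALSE,
           include.lowest = TRUE, labels = c("neg", "nonneg"))

# Mean of b within each bin
means <- tapply(df$b, bin, mean)
means
```

Adding further break points to breaks (with one label per interval) extends this to any number of ranges.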
David Arenburg answered Feb 13 '23 12:02


The following solutions would do it:

Subset

# The first range excludes 0 (-40 <= a < 0), so a value of 0 is counted only once
ndf1 <- subset(df, a >= -40 & a < 0)
ndf2 <- subset(df, a >= 0 & a <= 40)

mean(ndf1$b)
mean(ndf2$b)

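Or, more simply, with logical indexing

# Refer to the columns through df so the result does not depend on a global vector a
mean(df$b[df$a >= -40 & df$a < 0])
mean(df$b[df$a >= 0 & df$a <= 40])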

Using ddply

library(plyr)
# Each call returns one row per group (condition FALSE and condition TRUE)
ddply(df, .(a >= -40 & a < 0), summarize, mean = mean(b))
ddply(df, .(a >= 0 & a <= 40), summarize, mean = mean(b))
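
Since each ddply call returns one row per TRUE/FALSE group, both means can also be obtained in a single grouped call. The following is a hedged base R sketch using aggregate() rather than plyr; the seed, the helper column range and its two labels are assumptions introduced for illustration, and it relies on every a lying in [-40, 40].

```r
set.seed(42)  # assumed seed, only so this example is repeatable
a <- round(runif(20, -40, 40), 2)
df <- data.frame(id = 1:20, a = a, b = round(a * 1.4 + 60, 2))

# Label each row with the range its a value falls in (hypothetical helper column)
df$range <- ifelse(df$a < 0, "-40 <= a < 0", "0 <= a <= 40")

# One row per range, with the mean of b in column b
res <- aggregate(b ~ range, data = df, FUN = mean)
res
```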
Ruthger Righart answered Feb 13 '23 10:02