I'm trying to calculate the weighted mean of every numeric variable, grouped by X1. Here is some example data:
set.seed(123)
X1=rep(c("A", "B", "C"), each = 4)
Y1=as.numeric(seq(1,12,by=1))
Y2=sample(1:5,12,TRUE)
Y3=sample(10:20,12,TRUE)
wgt <- abs(rnorm(12)*10)
df <- data.frame(X1,Y1,Y2,Y3,wgt)
This is the code I've been using to calculate regular (unweighted) means by X1:
aggregate( df[, sapply(df, is.numeric)] , by=list(df$X1) , FUN=mean, na.rm=TRUE)
I want to calculate the weighted mean instead, using wgt as the weight variable. I tried both of the calls below and neither works. I've tried numerous other variations with no success.
aggregate( df[, sapply(df, is.numeric)] , by=list(df$X1) , FUN=weighted.mean(x, w=df$wgt), na.rm = TRUE)
aggregate( df[, sapply(df, is.numeric)] , by=list(df$X1) , FUN=weighted.mean, w=df$wgt, na.rm = TRUE)
I'm unable to adapt the weighted.mean function. Can anyone tell me where I'm going wrong? Can this function even be used in this situation? Any help is greatly appreciated. Thanks
Here is a way to compute the weighted means with aggregate, called from within by().
res <- by(df, df$X1, function(DF){
  # DF holds one group's rows, so DF[['wgt']] has the same length as each y
  aggregate(cbind(Y1, Y2, Y3) ~ X1, DF, function(y)
    weighted.mean(y, w = DF[['wgt']], na.rm = TRUE))
})
do.call(rbind, res)
# X1 Y1 Y2 Y3
#A A 2.152503 2.633935 18.93457
#B B 6.677851 3.589251 16.90102
#C C 10.194695 2.638378 16.70958
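The by() wrapper matters because aggregate hands FUN only one group's values at a time, while w = df$wgt in the question's second attempt still has all 12 weights. As far as I can tell, that length mismatch is what makes weighted.mean stop. A minimal sketch with made-up toy vectors (x_group and w_full are illustrative names, not from the question):

```r
# Toy stand-ins: four values for one group, twelve weights for the whole data set
x_group <- c(1, 2, 3, 4)
w_full  <- rep(1, 12)

# weighted.mean() requires x and w to have the same length, so this call fails
res_bad <- try(weighted.mean(x_group, w = w_full), silent = TRUE)
inherits(res_bad, "try-error")

# With weights subset to the same four rows, the call succeeds
res_ok <- weighted.mean(x_group, w = w_full[1:4])
```

Subsetting the data frame per group first, as by() does, keeps the values and the weights aligned.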
You could use outer to apply weighted.mean crosswise.
gr <- c("A", "B", "C"); ys <- c("Y1", "Y2", "Y3")
WF <- Vectorize(function(x, y) with(df[df$X1 %in% x, ], weighted.mean(get(y), wgt)))
res <- `dimnames<-`(outer(gr, ys, WF), list(gr, ys))
res
# Y1 Y2 Y3
# A 2.152503 2.633935 18.93457
# B 6.677851 3.589251 16.90102
# C 10.194695 2.638378 16.70958
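Another option in base R, offered only as a sketch, is to split the data frame by group and apply weighted.mean to each numeric column with that group's own weights. It rebuilds the question's example data so it runs on its own; num_cols and res_split are names I made up, and I'm assuming wgt should be left out of the averaged columns.

```r
# Rebuild the example data from the question
set.seed(123)
X1  <- rep(c("A", "B", "C"), each = 4)
Y1  <- as.numeric(seq(1, 12, by = 1))
Y2  <- sample(1:5, 12, TRUE)
Y3  <- sample(10:20, 12, TRUE)
wgt <- abs(rnorm(12) * 10)
df  <- data.frame(X1, Y1, Y2, Y3, wgt)

# Numeric columns to average, excluding the weight column itself
num_cols <- setdiff(names(df)[sapply(df, is.numeric)], "wgt")

# For each group g, take the weighted mean of every column using g's own weights
res_split <- t(sapply(split(df, df$X1), function(g) {
  sapply(g[num_cols], weighted.mean, w = g$wgt, na.rm = TRUE)
}))
res_split
```

The result is a matrix with one row per group and one column per variable. Because the numeric columns are picked up automatically, it should not need editing if more Y columns are added.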