Suppose I have a data frame with a column for values and another column for the number of times that value was observed:
x <- data.frame(value=c(1,2,3), count=c(4,2,1))
x
#   value count
# 1     1     4
# 2     2     2
# 3     3     1
I know that I can get the weighted mean of the data using weighted.mean and the weighted median using the weighted.median function provided by several packages (e.g. limma), but how can I get other weighted statistics on my data, such as the 1st and 3rd quartiles, and maybe the standard deviation? "Expanding" the data using rep is not an option because sum(x$count) is about 3 billion (the size of the human genome).
Have you tried these packages?
Hmisc -- it has several weighted statistics, including weighted quantiles.
laeken -- it has weighted quantiles.
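As a rough sketch of the Hmisc route, assuming the package is installed and that the counts can be treated as frequency weights (the default normwt = FALSE), the functions wtd.mean, wtd.var and wtd.quantile take the values and the counts directly, so nothing has to be expanded. The variable names below are only illustrative.

```r
# Assumption: Hmisc is installed (install.packages("Hmisc")); skipped otherwise.
x <- data.frame(value = c(1, 2, 3), count = c(4, 2, 1))

if (requireNamespace("Hmisc", quietly = TRUE)) {
  # Counts are passed as frequency weights; the data are never expanded.
  w_mean <- Hmisc::wtd.mean(x$value, weights = x$count)
  w_sd   <- sqrt(Hmisc::wtd.var(x$value, weights = x$count))
  w_q    <- Hmisc::wtd.quantile(x$value, weights = x$count,
                                probs = c(0.25, 0.5, 0.75))
  print(w_mean)
  print(w_sd)
  print(w_q)
}
```

On this small example the results should agree with mean, sd and quantile applied to rep(x$value, x$count).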
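Or, if the total count is small enough for the expanded vector to fit in memory (which is not the case for the roughly 3 billion observations in the question), back-transform the data and run the analysis the usual way: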
dtf <- data.frame(value = 1:3, count = c(4, 2, 1))
x <- with(dtf, rep(value, count))
summary(x)
#    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
#   1.000   1.000   1.000   1.571   2.000   3.000
fivenum(x)
# [1] 1 1 1 2 3
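When the expanded vector would be too large, as with the 3 billion observations in the question, the same quartiles and standard deviation can be computed from the counts alone. The following is a minimal base R sketch, not a packaged API: weighted_quantile and weighted_sd are hypothetical helper names, the quantile rule is the one quantile() uses by default (type 7), and the counts are assumed to be non-negative integer frequencies.

```r
# Hypothetical helper: type-7 quantiles from value/count pairs, no expansion.
weighted_quantile <- function(value, count, probs) {
  ord   <- order(value)
  value <- value[ord]
  cum   <- cumsum(as.numeric(count[ord]))  # as.numeric avoids integer overflow
  n     <- cum[length(cum)]
  h     <- 1 + (n - 1) * probs             # fractional rank in the expanded data
  lo    <- floor(h)
  hi    <- pmin(lo + 1, n)
  frac  <- h - lo
  # The k-th order statistic is the first value whose cumulative count reaches k.
  order_stat <- function(k) value[findInterval(k, cum, left.open = TRUE) + 1]
  (1 - frac) * order_stat(lo) + frac * order_stat(hi)
}

# Hypothetical helper: sample standard deviation (n - 1 denominator) from counts.
weighted_sd <- function(value, count) {
  count <- as.numeric(count)
  n <- sum(count)
  m <- sum(value * count) / n
  sqrt(sum(count * (value - m)^2) / (n - 1))
}

dtf <- data.frame(value = 1:3, count = c(4, 2, 1))
q <- weighted_quantile(dtf$value, dtf$count, c(0.25, 0.5, 0.75))
s <- weighted_sd(dtf$value, dtf$count)
print(q)
print(s)
```

Memory use depends on the number of distinct values, not on sum(count), so the approach should scale to genome-sized totals as long as the value/count table itself fits in memory.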