
alternative for wilcox.test in R

Tags: r

I'm trying to run a significance test using wilcox.test in R. Basically, I want to test whether a single value x lies significantly inside or outside a distribution d.

I'm doing the following:

d = c(90,99,60,80,80,90,90,54,65,100,90,90,90,90,90)
wilcox.test(60,d)



    Wilcoxon rank sum test with continuity correction

data:  60 and d
W = 4.5, p-value = 0.5347
alternative hypothesis: true location shift is not equal to 0

Warning message:
In wilcox.test.default(60, d) : cannot compute exact p-value with ties

The p-value is basically the same for a wide range of numbers I test.

I've tried wilcox_test() from the coin package, but I can't get it to work for testing a single value against a distribution.

Is there an alternative to this test that does the same and knows how to deal with ties?

ifreak asked Mar 19 '23 10:03


1 Answer

How worried are you about the non-exact results? I would guess that the approximation is reasonable for a data set this size. (I did manage to get coin::wilcox_test working, and the results are not hugely different ...)

d <- c(90,99,60,80,80,90,90,54,65,100,90,90,90,90,90)
testvec <- 30:120   # candidate values to test against d

## stats::wilcox.test (normal approximation, because d contains ties)
pfun <- function(x) {
    suppressWarnings(wilcox.test(x,d)$p.value)
}
p1 <- sapply(testvec,pfun)

## coin::wilcox_test takes a formula: stack the value and d, with a group factor
library("coin")
pfun2 <- function(x) {
    dd <- data.frame(y=c(x,d),f=factor(c(1,rep(2,length(d)))))
    return(pvalue(wilcox_test(y~f,data=dd)))
}
p2 <- sapply(testvec,pfun2)

## exactRankTests::wilcox.exact
library("exactRankTests")
pfun3 <- function(x) {wilcox.exact(x,d)$p.value}
p3 <- sapply(testvec,pfun3)

Picture:

par(las=1,bty="l")
matplot(testvec,cbind(p1,p2,p3),type="s",
        xlab="value",ylab="p value of wilcoxon test",lty=1,
        ylim=c(0,1),col=c(1,2,4))
legend("topright",c("stats::wilcox.test","coin::wilcox_test",
                    "exactRankTests::wilcox.exact"),
       lty=1,col=c(1,2,4))

[Plot: p-value of each of the three Wilcoxon tests against the tested value]

(exactRankTests was added by request, but since it is no longer maintained and itself recommends the coin package, I'm not sure how reliable it is. You're on your own for working out how these procedures differ and which would be best to use ...)

The results make sense here; the problem is just that your power is low. If your value lies completely outside the range of the data, then for n=15 the two-sided p-value should be something like 2*(1/16)=0.125 [i.e. the probability that your value ends up as the first or the last element in a random permutation of all 16 values]. That is not quite the same as the minimum values seen here (wilcox.test: p=0.105, wilcox_test: p=0.08), but that might be an approximation issue, or I might have some detail wrong. Nevertheless, it's in the right ballpark.
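To make the 2*(1/16) argument concrete, here is a minimal base-R sketch of the permutation reasoning. With a single value in one group, each of the 16 pooled values is equally likely to be "the value" under the null, so the null distribution of its rank is just the set of pooled midranks. The function name exact_p_single is hypothetical, and doubling the smaller tail is an assumed two-sided convention; coin and exactRankTests may define the two-sided p-value differently, so the numbers need not match theirs.

```r
d <- c(90,99,60,80,80,90,90,54,65,100,90,90,90,90,90)

## Hypothetical helper: exact conditional p-value for one value x against d.
## Assumption: two-sided p-value = 2 * smaller tail probability, capped at 1.
exact_p_single <- function(x, d) {
    r <- rank(c(x, d))           # midranks, so ties are handled
    r_obs <- r[1]                # rank of the tested value
    p_lower <- mean(r <= r_obs)  # P(rank <= observed) under the null
    p_upper <- mean(r >= r_obs)  # P(rank >= observed) under the null
    min(1, 2 * min(p_lower, p_upper))
}

print(exact_p_single(30, d))   # value below all of d: 2/16 = 0.125
print(exact_p_single(120, d))  # value above all of d: 2/16 = 0.125
```

Under this convention 0.125 is the smallest p-value a single observation can reach against 15 others, which is why no tested value comes out significant at the 5% level.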

Ben Bolker answered Mar 28 '23 01:03