coin::wilcox_test versus wilcox.test in R

Question

In trying to figure out which one is better to use, I have come across two issues.

1) The W statistic given by wilcox.test is different from that of coin::wilcox_test. Here's my output:

wilcox_test:

Exact Wilcoxon Mann-Whitney Rank Sum Test

data:  data$variableX by data$group (yes, no) 
Z = -0.7636, p-value = 0.4489
alternative hypothesis: true mu is not equal to 0

wilcox.test:

Wilcoxon rank sum test with continuity correction

data:  data$variable by data$group
W = 677.5, p-value = 0.448
alternative hypothesis: true location shift is not equal to 0

I'm aware that there are actually two values for W and that the smaller one is usually reported. When wilcox.test is called with a comma instead of "~", I get the other value, but it comes up as W = 834.5. From what I understand, coin::statistic() can return three different statistics ("linear", "standardized", and "test"), where "linear" is the normal W and "standardized" is just W converted to a z-score. None of these matches the W I get from wilcox.test, though (linear = 1055.5, standardized = 0.7636288, test = -0.7636288). Any ideas what's going on?

2) I like the options in wilcox_test for "distribution" and "ties.method", but it seems that you cannot apply a continuity correction as in wilcox.test. Am I right?

G Chalancon · Accepted Answer

I encountered the same issue when trying to apply Wendt's formula to compute effect sizes with the coin package: I obtained aberrant r values because the linear statistic returned by wilcox_test() is unadjusted.

A great explanation is already given here, so I will simply address how to obtain adjusted U statistics with the wilcox_test() function. Let's use the following data frame:

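# g is wrapped in factor() because coin::wilcox_test() expects a factor grouping variable
d <- data.frame( x = c(rnorm(n = 60, mean = 10, sd = 5), rnorm(n = 30, mean = 16, sd = 5)), 
                 g = factor(c(rep("a", times = 60), rep("b", times = 30))) )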

We can perform identical tests with wilcox.test() and wilcox_test():

library(coin)  # provides wilcox_test()

w1 <- wilcox.test( formula = x ~ g, data = d )
w2 <- wilcox_test( formula = x ~ g, data = d )

Which will output two distinct statistics:

> w1$statistic
   W 
 321 

> w2@statistic@linearstatistic
[1] 2151

The values are indeed totally different (albeit the tests are equivalent).

To obtain a U statistic identical to that of wilcox.test(), subtract from wilcox_test()'s linear statistic the minimal value that the sum of the ranks of the reference sample can take, which is n_1(n_1+1)/2, where n_1 is the size of the reference sample.

Both commands take the first level in the factor of your grouping variable g as reference (which will by default be alphabetically ordered).
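The relation can be checked by hand without coin. The following is a minimal sketch in base R, assuming simulated data like the example above; the seed and the names d_demo, in_ref, n1_demo, w_by_hand and w_builtin are illustrative and not part of the original answer. It computes the unadjusted rank sum of the reference sample (which, per the explanation above, is what wilcox_test() reports as its linear statistic) and subtracts n_1(n_1+1)/2.

```r
# Base R only; the seed is an arbitrary choice for reproducibility.
set.seed(42)
d_demo <- data.frame(
  x = c(rnorm(60, mean = 10, sd = 5), rnorm(30, mean = 16, sd = 5)),
  g = factor(rep(c("a", "b"), times = c(60, 30)))
)

ranks   <- rank(d_demo$x)                    # ranks over the pooled sample
in_ref  <- d_demo$g == levels(d_demo$g)[1]   # reference sample = first factor level
n1_demo <- sum(in_ref)                       # size of the reference sample

rank_sum_ref <- sum(ranks[in_ref])           # unadjusted rank sum of the reference sample
w_by_hand    <- rank_sum_ref - n1_demo * (n1_demo + 1) / 2

w_builtin <- unname(wilcox.test(x ~ g, data = d_demo)$statistic)
isTRUE(all.equal(w_by_hand, w_builtin))      # compares the two statistics
```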

Then you can compute the smallest sum of the ranks possible for the reference sample. First get its size n1:

n1  <- table(w2@statistic@x)[1]

And

w2@statistic@linearstatistic - n1*(n1+1)/2 == w1$statistic

should return TRUE

Voilà.
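As a sanity check, the same relation can be applied to the numbers quoted in the question. This is only a sketch: it assumes both functions used the same reference group, and the group sizes it produces are inferred from the reported statistics, not taken from the asker's data. The variable names are illustrative.

```r
linear  <- 1055.5   # coin linear statistic reported in the question
w_small <- 677.5    # W from wilcox.test with the formula interface
w_other <- 834.5    # the other W value reported in the question

offset <- linear - w_small                  # should equal n1 * (n1 + 1) / 2
n1_q   <- (-1 + sqrt(1 + 8 * offset)) / 2   # solve n1 * (n1 + 1) / 2 = offset for n1
n2_q   <- (w_small + w_other) / n1_q        # the two U values sum to n1 * n2

c(offset = offset, n1 = n1_q, n2 = n2_q)
```

The offset is 378 = 27 * 28 / 2, and the two W values sum to 1512 = 27 * 56, so the figures are consistent with whole-number group sizes of 27 and 56. That is what one would expect if the linear statistic is simply the unadjusted rank sum of the reference group.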

José Jiménez · Answer

It seems that one is performing the Mann-Whitney U test and the other the Wilcoxon rank-sum test, which is defined in many different ways in the literature. They are pretty much equivalent; just look at the p-values. If you want a continuity correction in wilcox.test, just use the argument correct = TRUE.
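A small sketch of the correct argument in base R, on made-up data (the seed and the names x_demo, y_demo, with_cc and without_cc are illustrative). Note that correct = TRUE is already the default in wilcox.test, and that it only matters when the normal approximation is used, which is why exact = FALSE is set here.

```r
set.seed(42)
x_demo <- rnorm(20, mean = 10, sd = 5)
y_demo <- rnorm(20, mean = 13, sd = 5)

# exact = FALSE forces the normal approximation, where `correct` has an effect
with_cc    <- wilcox.test(x_demo, y_demo, exact = FALSE, correct = TRUE)
without_cc <- wilcox.test(x_demo, y_demo, exact = FALSE, correct = FALSE)

with_cc$statistic == without_cc$statistic   # W itself is unchanged by the correction
c(corrected = with_cc$p.value, uncorrected = without_cc$p.value)
```

Only the p-value changes: the correction shrinks the z-score toward zero by half a unit of W, so the corrected p-value is never smaller than the uncorrected one.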

Check https://stats.stackexchange.com/questions/79843/is-the-w-statistic-outputted-by-wilcox-test-in-r-the-same-as-the-u-statistic

Tags:

r

A.S.
