The standard stats::kruskal.test function runs the Kruskal-Wallis test on a dataset:
> library(ggplot2)  # provides the diamonds dataset
> data(diamonds)
> kruskal.test(price ~ carat, data = diamonds)
Kruskal-Wallis rank sum test
data: price by carat by color
Kruskal-Wallis chi-squared = 50570.15, df = 272, p-value < 2.2e-16
This works as expected: it gives me a single p-value for the null hypothesis that all the groups in the data have the same price distribution.
However, I would like to have the details for each pairwise comparison, for example whether diamonds of colors D and E differ in price, as some other software (e.g. SPSS) reports when you ask for a Kruskal-Wallis test.
I have found kruskalmc in the pgirmess package, which does what I want:
> kruskalmc(diamonds$price, diamonds$color)
Multiple comparison test after Kruskal-Wallis
p.value: 0.05
Comparisons
      obs.dif critical.dif difference
D-E  571.7459     747.4962      FALSE
D-F 2237.4309     751.5684       TRUE
D-G 2643.1778     726.9854       TRUE
D-H 4539.4392     774.4809       TRUE
D-I 6002.6286     862.0150       TRUE
D-J 8077.2871    1061.7451       TRUE
E-F 2809.1767     680.4144       TRUE
E-G 3214.9237     653.1587       TRUE
E-H 5111.1851     705.6410       TRUE
E-I 6574.3744     800.7362       TRUE
E-J 8649.0330    1012.6260       TRUE
F-G  405.7470     657.8152      FALSE
F-H 2302.0083     709.9533       TRUE
F-I 3765.1977     804.5390       TRUE
F-J 5839.8562    1015.6357       TRUE
G-H 1896.2614     683.8760       TRUE
G-I 3359.4507     781.6237       TRUE
G-J 5434.1093     997.5813       TRUE
H-I 1463.1894     825.9834       TRUE
H-J 3537.8479    1032.7058       TRUE
I-J 2074.6585    1099.8776       TRUE
However, this function only accepts one categorical variable (e.g. I can't study prices grouped by both color and carat, as I can with kruskal.test), and I don't know anything about the pgirmess package: whether it is maintained, or whether it is tested.
Can you recommend a package that runs the Kruskal-Wallis test and returns details for every pairwise comparison? How would you handle the problem?
A Kruskal-Wallis test is used to determine whether there is a statistically significant difference between three or more independent groups; it is often described as a comparison of medians, which holds when the group distributions have the same shape. It is considered to be the non-parametric equivalent of the one-way ANOVA.
The most common post-hoc tests after a significant Kruskal-Wallis test are the Dunn test and the Conover test.
The test assumes that the observations are independent. That is, it is not appropriate for paired observations or repeated measures data. It is performed with the kruskal.test function in the native stats package.
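As a small illustration of the call itself, here is a minimal sketch on the built-in InsectSprays data. The dataset choice is only an assumption made to keep the example self-contained; it is not the diamonds data from the question. The result is an ordinary "htest" object, so the global statistic, degrees of freedom and p-value can be pulled out by name.

```r
# Kruskal-Wallis test on a built-in dataset (InsectSprays ships with base R;
# it is used here only for illustration, not taken from the question).
kw <- kruskal.test(count ~ spray, data = InsectSprays)

# The result is an "htest" object; its pieces can be extracted by name.
kw$statistic   # Kruskal-Wallis chi-squared
kw$parameter   # degrees of freedom = number of groups - 1
kw$p.value     # p-value of the global test
```

Note that this only gives the global test; it says nothing about which pairs of groups differ.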
Pairwise comparison steps: (1) compute the difference in mean ranks for each pair of groups; (2) find the critical difference; (3) compare each observed difference to the critical difference; (4) decide whether to retain or reject the null hypothesis for that pair.
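The steps above can be sketched in base R. This is only an illustration under stated assumptions: the differences are taken between mean ranks, and the critical difference uses a normal approximation with a Bonferroni-style split of alpha over all pairs, without a tie correction. The helper name pairwise_rank_diffs is hypothetical, and the output columns merely mimic the kruskalmc layout; this is not claimed to be the pgirmess implementation.

```r
# Pairwise mean-rank comparisons after a Kruskal-Wallis test (illustrative sketch).
# pairwise_rank_diffs is a hypothetical helper, not part of any package.
pairwise_rank_diffs <- function(x, g, alpha = 0.05) {
  ok <- !is.na(x) & !is.na(g)
  x <- x[ok]
  g <- droplevels(factor(g[ok]))

  r <- rank(x)                        # joint ranks; ties get the average rank
  mean_rank <- tapply(r, g, mean)     # mean rank per group
  n <- tapply(r, g, length)           # group sizes
  N <- length(x)
  k <- nlevels(g)

  # Step 2: critical z value, alpha split over all k(k-1)/2 two-sided comparisons
  z <- qnorm(alpha / (k * (k - 1)), lower.tail = FALSE)

  pairs <- combn(levels(g), 2)
  a <- pairs[1, ]
  b <- pairs[2, ]

  # Step 1: observed absolute differences in mean ranks
  obs <- abs(as.numeric(mean_rank[a]) - as.numeric(mean_rank[b]))
  # Step 2: critical difference for each pair (no tie correction)
  crit <- z * sqrt(N * (N + 1) / 12 *
                   (1 / as.numeric(n[a]) + 1 / as.numeric(n[b])))

  # Steps 3 and 4: compare and decide
  data.frame(pair = paste(a, b, sep = "-"),
             obs.dif = obs,
             critical.dif = crit,
             difference = obs > crit)
}

# Built-in data, used only to keep the example self-contained
res <- pairwise_rank_diffs(InsectSprays$count, InsectSprays$spray)
print(res)
```

A TRUE in the last column means the observed difference exceeds the critical difference, so the null hypothesis is rejected for that pair at the chosen alpha.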
Besides agricolae::kruskal, mentioned by Marek, another approach is the Nemenyi-Damico-Wolfe-Dunn test, implemented in the help page for oneway_test in the coin package, which uses multcomp. Using hadley's setup and reducing the B value for the approximate() function so that it finishes in a reasonable time:
# Updated translation of the help-page implementation of NDWD.
# sum_codings1 (with columns dv and iv) comes from hadley's setup.
library(coin)  # independence_test(), approximate(), trafo(), mcp_trafo()
NDWD <- independence_test(dv ~ iv, data = sum_codings1,
    distribution = approximate(B = 10000),
    ytrafo = function(data) trafo(data, numeric_trafo = rank_trafo),
    xtrafo = mcp_trafo(iv = "Tukey"))
### global p-value
print(pvalue(NDWD))
### pairwise (single-step adjusted) p-values
### sites (I = II) != (III = IV) at alpha = 0.01 (page 244)
print(pvalue(NDWD, method = "single-step"))
More stable results on that larger dataset may require increasing the B value and increasing the user's patience.
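To see why a larger B gives more stable results, here is a base-R sketch of a Monte Carlo permutation p-value for the Kruskal-Wallis statistic. This is a swapped-in illustration, not the coin implementation: the helper perm_kw_pvalue is hypothetical, the built-in PlantGrowth data stands in for the real dataset, and the B values are kept small so the example runs quickly.

```r
set.seed(1)  # arbitrary seed so the run is repeatable

# Monte Carlo p-value for the Kruskal-Wallis statistic, obtained by
# permuting the group labels B times (hypothetical helper, not from coin).
perm_kw_pvalue <- function(x, g, B) {
  observed <- kruskal.test(x, g)$statistic
  permuted <- replicate(B, kruskal.test(x, sample(g))$statistic)
  # add-one correction keeps the estimate away from exactly zero
  (sum(permuted >= observed) + 1) / (B + 1)
}

x <- PlantGrowth$weight
g <- PlantGrowth$group

# Repeat the estimate several times for a small and a larger B,
# then compare how much the estimates vary from run to run.
small_B <- replicate(10, perm_kw_pvalue(x, g, B = 100))
large_B <- replicate(10, perm_kw_pvalue(x, g, B = 1000))
c(sd_small = sd(small_B), sd_large = sd(large_B))
```

The Monte Carlo standard error of such an estimate shrinks roughly like 1/sqrt(B), which is the trade-off between stability and run time.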
Jan 2012: There was recently a posting on R-help claiming unexpected results from this method, so I forwarded that email to the maintainer. Mark Difford said he had confirmed the problems and offered an alternative test with the nparcomp package: https://stat.ethz.ch/pipermail/r-help/2012-January/300100.html
In the same week there were also a couple of other suggestions on R-help for post-hoc contrasts to KW tests: kruskalmc, suggested by Mario Garrido Escudero, and rms::polr followed by rms::contrasts, suggested by Frank Harrell: https://stat.ethz.ch/pipermail/r-help/2012-January/300329.html
Nov 2015: I agree with toto_tico that the help page code of the coin package has been changed in the intervening years. The ?independence_test help page now offers a multivariate KW test, and the ?oneway_test help page has replaced its earlier implementation with the code above, using the independence_test function.