
Label outliers using mvOutlier from MVN in R

Tags: r, label, outliers

I'm trying to label outliers on a Chi-square Q-Q plot using the mvOutlier() function of the MVN package in R.

I have managed to identify the outliers by their labels and get their x-coordinates. I tried placing the former on the plot using text(), but the x- and y-coordinates seem to be flipped.

Building on an example from the documentation:

library(MVN)
data(iris)
versicolor <- iris[51:100, 1:3]
# Mahalanobis distance
result <- mvOutlier(versicolor, qqplot = TRUE, method = "quan")
labelsO<-rownames(result$outlier)[result$outlier[,2]==TRUE]
xcoord<-result$outlier[result$outlier[,2]==TRUE,1]
text(xcoord,label=labelsO)

This produces the following: [image: resulting plot]

I also tried text(x = xcoord, y = xcoord,label = labelsO), which is fine when the points are near the y = x line, but might fail when normality is not satisfied (and the points deviate from this line).

Can someone suggest how to access the Chi-square quantiles, or explain why the x-coordinate of the text() function doesn't seem to obey the input parameters?

Fato39 asked Apr 27 '15 20:04


2 Answers

Looking inside the mvOutlier function, it doesn't appear to save the chi-squared values. Because your text() call passes only one coordinate vector, xcoord is treated as the y-values and the x-values default to the indices 1:2. Thankfully, the chi-squared values are fairly simple to recalculate, as they are rank-based in this case.

result <- mvOutlier(versicolor, qqplot = TRUE, method = "quan")
labelsO<-rownames(result$outlier)[result$outlier[,2]==TRUE]
xcoord<-result$outlier[result$outlier[,2]==TRUE,1]
# recalculate chi-squared values for ranks 50 and 49, i.e.
# p = ((size:(size - n.outliers + 1)) - 0.5) / size, with df = n.variables = 3
chis <- qchisq(((50:49) - 0.5) / 50, df = 3)
text(xcoord, chis, labels = labelsO)
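If you'd rather not hard-code the ranks 50:49, here is a minimal sketch of the same rank-based calculation for any number of points. Note that chisq_positions is a hypothetical helper of my own, not part of MVN, and it assumes the plot uses the same (rank - 0.5)/n plotting positions as the snippet above.

```r
# Hypothetical helper (not part of MVN): chi-squared quantile for every
# observation, based on the rank of its distance among all n distances.
# Assumes plotting positions (rank - 0.5) / n, as in the snippet above.
chisq_positions <- function(d, p) {
  n <- length(d)
  r <- rank(d, ties.method = "first")  # rank 1 = smallest distance
  qchisq((r - 0.5) / n, df = p)
}

# Toy distances for illustration only (not output from mvOutlier)
d <- c(2.0, 9.5, 0.7, 12.1)
chis_all <- chisq_positions(d, p = 3)

# Keep only the quantiles of the points you want to label,
# e.g. the two largest distances
flagged <- d >= sort(d, decreasing = TRUE)[2]
chis_flagged <- chis_all[flagged]
```

With the example above, d would be the full distance column result$outlier[, 1], p would be ncol(versicolor), and the logical index would be result$outlier[, 2] == TRUE, assuming result$outlier holds one row per observation as the question's code implies.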
Max Candocia answered Sep 24 '22 22:09


As mentioned in the previous reply, the MVN package does not support labeling outliers. Although it is not strictly necessary, since it can be done manually, we might still consider adding a "labeling outliers" option to the mvOutlier(...) function. Thanks for your interest. We might include it in a future update of the package.

dnc.R answered Sep 22 '22 22:09