
How to detect outliers in an ArrayList

I'm trying to think of some code that will allow me to search through my ArrayList and detect any values outside the common range of "good values."

Example: 100 105 102 13 104 22 101

How would I be able to write the code to detect that (in this case) 13 and 22 don't fall within the "good values" of around 100?

Ashton asked Sep 14 '13 18:09


People also ask

How can outliers be detected?

Graphing your data can help identify outliers. Boxplots, histograms, and scatterplots can all highlight them. Boxplots display asterisks or other symbols on the graph to indicate explicitly when a dataset contains outliers; they use the interquartile method with fences to find them.

What is the 1.5 rule for outliers?

Any observations that are more than 1.5 IQR below Q1 or more than 1.5 IQR above Q3 are considered outliers. This is the method that Minitab uses to identify outliers by default.

How do you identify outliers in a column?

Using the IQR, the outlier data points are the ones falling below Q1 - 1.5 × IQR or above Q3 + 1.5 × IQR. Q1 is the 25th percentile and Q3 is the 75th percentile of the dataset, and IQR is the interquartile range, calculated as Q3 minus Q1 (Q3 - Q1).
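The fence rule described here could be sketched in Java roughly as follows. This is only an illustration: the class and method names (IqrOutliers, quantile, findOutliers) are made up for this example, and the quartiles are computed by linear interpolation between sorted values, which is one of several common conventions and may differ slightly from what a given statistics package does.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper class, named for this example only.
class IqrOutliers {

    // Quantile by linear interpolation between the two nearest sorted values.
    // Assumption: other quartile conventions exist and can give other fences.
    static double quantile(List<Double> sorted, double p) {
        double pos = p * (sorted.size() - 1);
        int lower = (int) Math.floor(pos);
        int upper = (int) Math.ceil(pos);
        double fraction = pos - lower;
        return sorted.get(lower) + fraction * (sorted.get(upper) - sorted.get(lower));
    }

    // Returns the values lying outside [Q1 - 1.5 * IQR, Q3 + 1.5 * IQR],
    // in their original order.
    static List<Double> findOutliers(List<Double> values) {
        List<Double> outliers = new ArrayList<>();
        if (values.size() < 2) {
            return outliers; // too few values to form quartiles
        }
        List<Double> sorted = new ArrayList<>(values);
        Collections.sort(sorted);

        double q1 = quantile(sorted, 0.25);
        double q3 = quantile(sorted, 0.75);
        double iqr = q3 - q1;
        double lowerFence = q1 - 1.5 * iqr;
        double upperFence = q3 + 1.5 * iqr;

        for (double v : values) {
            if (v < lowerFence || v > upperFence) {
                outliers.add(v);
            }
        }
        return outliers;
    }

    public static void main(String[] args) {
        // Illustrative sample (not from the question): one low value among ten.
        List<Double> sample = List.of(100.0, 105.0, 102.0, 13.0, 104.0,
                                      101.0, 103.0, 99.0, 98.0, 102.0);
        System.out.println(findOutliers(sample));
    }
}
```

One caveat: on the seven values from the question (100 105 102 13 104 22 101), this sketch flags nothing. Two of the seven values are low, which pulls Q1 down far enough that the lower fence falls below 13. The rule tends to work better when outliers are a small fraction of the sample.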

How do you filter out outliers?

When you decide to remove outliers, document the excluded data points and explain your reasoning. You must be able to attribute a specific cause for removing outliers. Another approach is to perform the analysis with and without these observations and discuss the differences.


1 Answer

There are several criteria for detecting outliers. The simplest ones, like Chauvenet's criterion, use the mean and standard deviation calculated from the sample to determine a "normal" range for values. Any value outside of this range is deemed an outlier.
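As a rough sketch of the mean-and-standard-deviation idea, the Java below flags any value more than k sample standard deviations from the mean. Note that this is a simplified fixed-threshold check, not Chauvenet's criterion itself, which derives its cutoff from the sample size. The class name, method name, and the parameter k are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, named for this example only.
class MeanStdOutliers {

    // Returns the values whose distance from the mean exceeds
    // k sample standard deviations, in their original order.
    static List<Double> findOutliers(List<Double> values, double k) {
        List<Double> outliers = new ArrayList<>();
        int n = values.size();
        if (n < 2) {
            return outliers; // standard deviation is undefined for fewer than 2 values
        }

        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        double mean = sum / n;

        double squaredDeviations = 0.0;
        for (double v : values) {
            squaredDeviations += (v - mean) * (v - mean);
        }
        // Sample standard deviation (divides by n - 1).
        double stdDev = Math.sqrt(squaredDeviations / (n - 1));

        for (double v : values) {
            if (Math.abs(v - mean) > k * stdDev) {
                outliers.add(v);
            }
        }
        return outliers;
    }

    public static void main(String[] args) {
        // The sample from the question.
        List<Double> values = List.of(100.0, 105.0, 102.0, 13.0, 104.0, 22.0, 101.0);
        System.out.println(findOutliers(values, 1.0));
        System.out.println(findOutliers(values, 2.0));
    }
}
```

On the question's sample, 13 and 22 lie only about 1.6 and 1.4 standard deviations below the mean, because they themselves inflate the standard deviation. A threshold of k = 1.0 flags both, while the more conventional k = 2.0 flags neither, so the choice of threshold matters a great deal for small samples like this one.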

Other criteria include Grubbs' test and Dixon's Q test, which may give better results than Chauvenet's criterion, for example if the sample comes from a skewed distribution.

Joni answered Sep 20 '22 06:09