 

Finding the First and Third Quartiles

Tags: c#, math

I want to find the middle of each half of my list of numbers. I have already written a method that finds the middle of the whole list (the median, in math terms):

    public static String Find_Median()
    {
        double Size = list.Count;
        double Final_Number = 0;
        if (Size % 2 == 0)
        {
            int HalfWay = list.Count / 2;
            double Value1 = Convert.ToDouble(list[HalfWay - 1].ToString());
            double Value2 = Convert.ToDouble(list[HalfWay].ToString());
            double Number = Value1 + Value2;
            Final_Number = Number / 2;
        }
        else
        {
            int HalfWay = list.Count / 2;
            double Value1 = Convert.ToDouble(list[HalfWay].ToString());
            Final_Number = Value1;
        }
        return Convert.ToString(Final_Number);
    }

That returns the exact middle number of the list; when the count is even and there are two middle numbers, it averages them. I want to do the same on each half. Here's an example:

3 2 1 4 5 6

Once sorted (1 2 3 4 5 6), the median of that list is 3.5. I want to calculate 2, the value midway between the start and the median, also known as Q1 (the first quartile, used for the IQR). I also want to know how to find the value midway between the median and the end (Q3), which is 5.


I.e. so that I can find 70, 80 and 90 with code.
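For reference, the "median of each half" idea described above could be sketched roughly as follows. This is only an illustration, not the asker's code: the names `QuartileSketch`, `MedianOfRange` and `MedianOfHalves` are made up here, it assumes C# 7 or later for value tuples, and for odd counts it assumes the middle element is left out of both halves (one of several conventions).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class QuartileSketch
{
    // Median of the sorted slice [start, start + count).
    private static double MedianOfRange(IReadOnlyList<double> sorted, int start, int count)
    {
        int mid = start + count / 2;
        return count % 2 == 0
            ? (sorted[mid - 1] + sorted[mid]) / 2.0
            : sorted[mid];
    }

    // Returns (Q1, Median, Q3). The input is sorted first; for odd counts
    // the middle element is excluded from both halves (an assumption).
    public static (double Q1, double Median, double Q3) MedianOfHalves(IEnumerable<double> values)
    {
        double[] sorted = values.OrderBy(v => v).ToArray();
        if (sorted.Length < 2)
            throw new ArgumentException("Need at least two values.", nameof(values));

        int half = sorted.Length / 2;           // size of each half
        int upperStart = sorted.Length - half;  // skips the middle element when the count is odd

        return (MedianOfRange(sorted, 0, half),
                MedianOfRange(sorted, 0, sorted.Length),
                MedianOfRange(sorted, upperStart, half));
    }
}
```

Calling `QuartileSketch.MedianOfHalves(new double[] { 3, 2, 1, 4, 5, 6 })` on the example list should give Q1 = 2, median = 3.5 and Q3 = 5. Note that the list has to be sorted before any of this works, which the median method above does not do by itself.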

asked Feb 04 '13 09:02 by Metab




1 Answer

I just ran into the same issue, and after checking the Wikipedia entry for Quartile, it turns out to be a bit more complex than it first appears.

My approach was as follows (it seems to work pretty well for all cases, from N=1 on up):

/// <summary>
/// Return the quartile values of an ordered set of doubles;
///   assumes the sorting has already been done.
///   
/// This actually turns out to be a bit of a PITA, because there is no universal agreement 
///   on choosing the quartile values. In the case of an odd number of values, some count the
///   median value in finding the 1st and 3rd quartile and some discard the median value. 
///   The two different methods result in two different answers.
///   The method below produces the arithmetic mean of the two methods, and ensures the median
///   is given its correct weight so that the quartiles change as smoothly as possible as 
///   more data points are added.
///    
/// This method uses the following logic:
/// 
/// ===If there are an even number of data points:
///    Use the median to divide the ordered data set into two halves. 
///    The lower quartile value is the median of the lower half of the data. 
///    The upper quartile value is the median of the upper half of the data.
///    
/// ===If there are (4n+1) data points:
///    The lower quartile is 25% of the nth data value plus 75% of the (n+1)th data value.
///    The upper quartile is 75% of the (3n+1)th data point plus 25% of the (3n+2)th data point.
///    
/// ===If there are (4n+3) data points:
///    The lower quartile is 75% of the (n+1)th data value plus 25% of the (n+2)th data value.
///    The upper quartile is 25% of the (3n+2)th data point plus 75% of the (3n+3)th data point.
/// 
/// </summary>
internal Tuple<double, double, double> Quartiles(double[] afVal)
{
    int iSize = afVal.Length;
    int iMid = iSize / 2; //this is the mid from a zero based index, eg mid of 7 = 3;

    double fQ1 = 0;
    double fQ2 = 0;
    double fQ3 = 0;

    if (iSize % 2 == 0)
    {
        //================ EVEN NUMBER OF POINTS: =====================
        //even between low and high point
        fQ2 = (afVal[iMid - 1] + afVal[iMid]) / 2;

        int iMidMid = iMid / 2;

        //easy split 
        if (iMid % 2 == 0)
        {
            fQ1 = (afVal[iMidMid - 1] + afVal[iMidMid]) / 2;
            fQ3 = (afVal[iMid + iMidMid - 1] + afVal[iMid + iMidMid]) / 2;
        }
        else
        {
            fQ1 = afVal[iMidMid];
            fQ3 = afVal[iMidMid + iMid];
        }
    }
    else if (iSize == 1)
    {
        //================= special case, sorry ================
        fQ1 = afVal[0];
        fQ2 = afVal[0];
        fQ3 = afVal[0];
    }
    else
    {
        //odd number so the median is just the midpoint in the array.
        fQ2 = afVal[iMid];

        if ((iSize - 1) % 4 == 0)
        {
            //======================(4n+1) POINTS =========================
            int n = (iSize - 1) / 4;
            fQ1 = (afVal[n - 1] * .25) + (afVal[n] * .75);
            fQ3 = (afVal[3 * n] * .75) + (afVal[3 * n + 1] * .25);
        }
        else if ((iSize - 3) % 4 == 0)
        {
            //======================(4n+3) POINTS =========================
            int n = (iSize - 3) / 4;

            fQ1 = (afVal[n] * .75) + (afVal[n + 1] * .25);
            fQ3 = (afVal[3 * n + 1] * .25) + (afVal[3 * n + 2] * .75);
        }
    }

    return new Tuple<double, double, double>(fQ1, fQ2, fQ3);
}
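The doc comment's claim that the odd-count weights are the arithmetic mean of the "include the median" and "exclude the median" conventions can be checked with a small sketch like the one below. It is not part of the method above; the names `QuartileConventions`, `HalvesMethod` and `AveragedMethod` are hypothetical, the input is assumed to be sorted already, and value tuples assume C# 7 or later.

```csharp
using System;

public static class QuartileConventions
{
    // Median of the sorted slice [start, start + count).
    private static double MedianOfRange(double[] sorted, int start, int count)
    {
        int mid = start + count / 2;
        return count % 2 == 0 ? (sorted[mid - 1] + sorted[mid]) / 2.0 : sorted[mid];
    }

    // Q1 and Q3 as medians of the lower and upper halves of sorted input.
    // includeMedian only matters for odd counts: it decides whether the
    // middle element belongs to both halves or to neither.
    public static (double Q1, double Q3) HalvesMethod(double[] sorted, bool includeMedian)
    {
        int n = sorted.Length;
        if (n < 2)
            throw new ArgumentException("Need at least two values.", nameof(sorted));

        int half = n / 2;
        if (n % 2 == 1 && includeMedian) half++;   // both halves take the middle element

        return (MedianOfRange(sorted, 0, half), MedianOfRange(sorted, n - half, half));
    }

    // Arithmetic mean of the two conventions.
    public static (double Q1, double Q3) AveragedMethod(double[] sorted)
    {
        var excluded = HalvesMethod(sorted, false);
        var included = HalvesMethod(sorted, true);
        return ((excluded.Q1 + included.Q1) / 2.0, (excluded.Q3 + included.Q3) / 2.0);
    }
}
```

For the five points 1 2 3 4 5 (the 4n+1 case with n = 1), excluding the median gives Q1 = 1.5 and Q3 = 4.5, including it gives 2 and 4, and the average is 1.75 and 4.25. That matches 25% of the 1st value plus 75% of the 2nd, and 75% of the 4th plus 25% of the 5th, as in the doc comment.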

THERE ARE MANY WAYS TO CALCULATE QUARTILES:

I did my best here to implement the version of quartiles described as type = 8 (quantile(x, type = 8)) in the R documentation: https://www.rdocumentation.org/packages/stats/versions/3.5.1/topics/quantile. This method is preferred by the authors of the R function, as described in that documentation, because it produces a smoother transition between values. However, R defaults to method 7, which is the same function used by S and Excel.
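For comparison, method 7 (linear interpolation at position (n - 1) * p in a zero-based sorted array) might look something like this. It is a sketch of that definition rather than a port of R's source; the class name `Type7Quantile` and the method name `Quantile` are made up for illustration, and the input is assumed to be sorted.

```csharp
using System;

public static class Type7Quantile
{
    // Quantile of sorted data for probability p in [0, 1], using linear
    // interpolation between the two neighbouring order statistics.
    public static double Quantile(double[] sorted, double p)
    {
        if (sorted == null || sorted.Length == 0)
            throw new ArgumentException("Need at least one value.", nameof(sorted));
        if (p < 0.0 || p > 1.0)
            throw new ArgumentOutOfRangeException(nameof(p), "p must be between 0 and 1.");

        double h = (sorted.Length - 1) * p;             // fractional zero-based position
        int lo = (int)Math.Floor(h);
        int hi = Math.Min(lo + 1, sorted.Length - 1);   // clamp at the last element

        return sorted[lo] + (h - lo) * (sorted[hi] - sorted[lo]);
    }
}
```

On the sorted data 1 2 3 4 5 6 this gives 2.25 for p = 0.25 and 4.75 for p = 0.75, whereas the median-of-halves approach gives 2 and 5. Both are legitimate quartiles under different definitions, so it is worth deciding which one you need before comparing results against another tool.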

If you are just Googling for answers and not thinking about what the output means, or what result you are trying to achieve, this might give you a surprise.

answered Oct 05 '22 12:10 by mike