I've just started with Julia and I am trying to do some simple statistics.
I'm using the StatsBase package and am trying to calculate quantiles.
using StatsBase
lst = 1:10
print(nquantile(lst, 4))
and I get:
[1.0, 3.25, 5.5, 7.75, 10.0]
where I assume Q_1 = 3.25 and Q_3 = 7.75.
Running similar code in Python:
from statistics import quantiles
lst = list(range(1, 11))
print(quantiles(lst))
yields:
[2.75, 5.5, 8.25]
where Q_1 = 2.75 and Q_3 = 8.25.
According to my understanding of statistics, Python's results correspond to what the actual math gives.
So my guess is that the Julia variant assumes some kind of Gaussian distribution to find the quantiles. If so, is there a way to make it assume a uniform distribution instead?
There are many quantile definitions, and Julia implements the continuous ones (Definitions 4-9) from Hyndman, R.J. and Fan, Y. (1996), "Sample Quantiles in Statistical Packages", The American Statistician, Vol. 50, No. 4, pp. 361-365. Neither result is wrong; the two languages simply default to different definitions.
To get the Python equivalent, use:
julia> quantile(1:10, (0:4)/4; alpha=0,beta=0)
5-element Vector{Float64}:
1.0
2.75
5.5
8.25
10.0
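The match can also be made from the Python side. As far as I can tell, statistics.quantiles takes a method argument, and method="inclusive" corresponds to Julia's default (Def. 7), while the default method="exclusive" corresponds to alpha=0, beta=0 (Def. 6). A minimal check on the same data, where the variable names are just for illustration:

```python
from statistics import quantiles

data = list(range(1, 11))

# Default method="exclusive": same cut points as Julia's alpha=0, beta=0 (Def. 6).
exclusive = quantiles(data, n=4)

# method="inclusive": same cut points as Julia's default alpha=beta=1 (Def. 7).
inclusive = quantiles(data, n=4, method="inclusive")

print(exclusive)  # [2.75, 5.5, 8.25]
print(inclusive)  # [3.25, 5.5, 7.75]
```

Note that Python returns only the interior cut points, whereas nquantile also returns the minimum and maximum.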
Explanation (found in docs):
help?> nquantile
(...)
Equivalent to quantile(x, [0:n]/n).
(...)
help?> quantile
quantile(itr, p; sorted=false, alpha::Real=1.0, beta::Real=alpha)
(...)
By default (alpha = beta = 1), quantiles are computed via linear interpolation between the points ((k-1)/(n-1), v[k]), for k = 1:n where n = length(itr). This corresponds to Definition 7 of Hyndman and Fan (1996), and is the same as the R and NumPy default.
(...)
• Def. 6: alpha=0, beta=0 (Excel PERCENTILE.EXC, Python default, Stata altdef)
(...)
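To see what alpha and beta actually do, here is a rough Python sketch of the interpolation rule as I understand it from the Julia docs: Q(p) = (1-γ)*x[j] + γ*x[j+1], with j = floor(n*p + m), m = alpha + p*(1 - alpha - beta) and γ = n*p + m - j. The function name hf_quantile is made up for this example and is not part of any library; treat it as an illustration rather than the actual implementation.

```python
import math


def hf_quantile(values, p, alpha=1.0, beta=None):
    """Hypothetical helper: sample quantile with Hyndman-Fan alpha/beta parameters."""
    if beta is None:
        beta = alpha  # mirror the Julia signature, where beta defaults to alpha
    if not 0 <= p <= 1:
        raise ValueError("p must lie in [0, 1]")
    v = sorted(values)
    n = len(v)
    if n == 0:
        raise ValueError("values must not be empty")
    if n == 1:
        return float(v[0])

    m = alpha + p * (1 - alpha - beta)
    h = n * p + m                          # fractional 1-based position
    j = min(max(math.floor(h), 1), n - 1)  # 1-based index of the lower neighbour
    gamma = min(max(h - j, 0.0), 1.0)      # interpolation weight, clamped to [0, 1]

    lower, upper = v[j - 1], v[j]          # convert to 0-based indexing
    return lower + gamma * (upper - lower)


data = range(1, 11)
probs = [i / 4 for i in range(5)]

def7 = [hf_quantile(data, p) for p in probs]                   # alpha = beta = 1
def6 = [hf_quantile(data, p, alpha=0, beta=0) for p in probs]  # alpha = beta = 0

print(def7)  # [1.0, 3.25, 5.5, 7.75, 10.0]
print(def6)  # [1.0, 2.75, 5.5, 8.25, 10.0]
```

The only difference between the two results is the offset m: with alpha = beta = 1 the sample minimum and maximum sit at p = 0 and p = 1, while with alpha = beta = 0 the k-th sorted value sits at p = k/(n+1), which pushes the quartiles outward.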