Equivalent python command for quantile in matlab

Tags: python, matlab, quantile

I'm trying to replicate some Matlab code in Python, but I could not find an exact equivalent of Matlab's quantile function. The closest thing I found is scipy.stats.mstats.mquantiles.

Matlab example:

 quantile( [ 8.60789925e-05, 1.98989354e-05 , 1.68308882e-04, 1.69379370e-04],  0.8)

...gives: 0.00016906

Same example in python:

scipy.stats.mstats.mquantiles( [8.60789925e-05, 1.98989354e-05, 1.68308882e-04, 1.69379370e-04], 0.8)

...gives 0.00016912

Does anyone know how to exactly replicate Matlab's quantile function?

asked Dec 05 '12 21:12

Jack2019

2 Answers

The documentation for quantile (under the More About => Algorithms section) gives the exact algorithm used. Here's some python code that does it for a single quantile for a flat array, using bottleneck to do partial sorting:

import numpy as np
import bottleneck as bn

def quantile(a, prob):
    """
    Estimates the prob'th quantile of the values in a data array.

    Uses the algorithm of matlab's quantile(), namely:
        - Remove any nan values
        - Take the sorted data as the (.5/n), (1.5/n), ..., (1-.5/n) quantiles.
        - Use linear interpolation for values between (.5/n) and (1 - .5/n).
        - Use the minimum or maximum for quantiles outside that range.

    See also: scipy.stats.mstats.mquantiles
    """
    a = np.asanyarray(a)
    a = a[np.logical_not(np.isnan(a))].ravel()
    n = a.size

    if prob >= 1 - .5/n:
        return a.max()
    elif prob <= .5 / n:
        return a.min()

    # find the two bounds we're interpolating between:
    # that is, find i such that (i+.5) / n <= prob <= (i+1.5)/n
    t = n * prob - .5
    i = int(np.floor(t))  # plain int, so it can be used as an index

    # partial sort so that the ith element is at position i, with bigger ones
    # to the right and smaller to the left
    # (bottleneck >= 1.2; older releases spelled this bn.partsort(a, i + 1))
    a = bn.partition(a, i)

    if i == t: # did we luck out and get an integer index?
        return a[i]
    else:
        # we'll linearly interpolate between this and the next index
        smaller = a[i]
        larger = a[i+1:].min()
        if np.isinf(smaller):
            return smaller # avoid inf - inf
        return smaller + (larger - smaller) * (t - i)

I only did the single-quantile, 1d case because that's all I needed. If you want several quantiles, it's probably worth just doing the full sort. To do it per-axis, provided you know you don't have any nans, all you should need to do is add an axis argument to the sort and vectorize the linear interpolation bit. Doing it per-axis with nans would be a little trickier.

This code gives:

>>> quantile([ 8.60789925e-05, 1.98989354e-05 , 1.68308882e-04, 1.69379370e-04], 0.8)
0.00016905822360000001

and the Matlab code gave 0.00016905822359999999; the difference is 3e-20, which is below machine precision.
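
For the several-quantiles case mentioned above, a full-sort version could look roughly like the sketch below. It assumes a flat array, the name quantile_sorted is made up for this illustration, and it relies on np.interp returning the first or last value for probabilities outside the interpolation range, which is the min/max rule from the docstring. Unlike the function above, it does not guard against empty input or infinite values.

```python
import numpy as np

def quantile_sorted(a, probs):
    """Matlab-style quantiles of a flat array via a full sort (illustrative sketch)."""
    a = np.asarray(a, dtype=float).ravel()
    a = np.sort(a[~np.isnan(a)])  # drop nans, then sort everything
    n = a.size
    # the k-th sorted value (0-based) is treated as the (k + .5)/n quantile
    positions = (np.arange(n) + 0.5) / n
    # linear interpolation in between; np.interp clamps to the end values
    # for probabilities below .5/n or above 1 - .5/n
    return np.interp(probs, positions, a)

data = [8.60789925e-05, 1.98989354e-05, 1.68308882e-04, 1.69379370e-04]
print(quantile_sorted(data, [0.1, 0.5, 0.8, 0.95]))
```

Since the data is fully sorted once, asking for more quantiles only costs one extra interpolation each, which is why the partial sort stops paying off here.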

answered Oct 07 '22 17:10

Danica

Your input vector only has 4 values, which is far too few to get a good approximation of the quantiles of the underlying distribution. The discrepancy is probably the result of Matlab and SciPy using different heuristics to compute quantiles on undersampled distributions.
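
To make the difference in heuristics concrete: the scipy documentation for mquantiles describes a family of estimators controlled by alphap and betap, both defaulting to 0.4, while the Matlab algorithm appears to correspond to 0.5 for both. The sketch below re-implements that documented formula in plain numpy for a single probability on a flat, nan-free array with at least two values. The function name plotting_position_quantile is hypothetical and this is not scipy's actual code.

```python
import numpy as np

def plotting_position_quantile(a, prob, alphap=0.4, betap=0.4):
    """Quantile estimate using (alphap, betap) plotting positions (sketch)."""
    x = np.sort(np.asarray(a, dtype=float).ravel())
    n = x.size
    m = alphap + prob * (1.0 - alphap - betap)
    aleph = n * prob + m  # fractional, 1-based position in the sorted data
    k = int(np.floor(np.clip(aleph, 1, n - 1)))
    gamma = float(np.clip(aleph - k, 0.0, 1.0))
    # weighted average of the two neighbouring order statistics
    return (1.0 - gamma) * x[k - 1] + gamma * x[k]

data = [8.60789925e-05, 1.98989354e-05, 1.68308882e-04, 1.69379370e-04]
print(plotting_position_quantile(data, 0.8))            # alphap = betap = 0.4
print(plotting_position_quantile(data, 0.8, 0.5, 0.5))  # alphap = betap = 0.5
```

If this reading of the two algorithms is right, the first call lands on the question's SciPy value and the second on the Matlab one, so passing alphap=0.5 and betap=0.5 to mquantiles itself should also reproduce Matlab's result.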

answered Oct 07 '22 17:10

slayton
