Getting the parameter names of scipy.stats distributions

Question

I am writing a script to find the best-fitting distribution for a dataset using scipy.stats. I start with a list of distribution names, over which I iterate:

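import scipy.stats
from scipy.stats import kstest

# selected_data: 1-D array-like of observations
# errors: pandas DataFrame indexed by distribution name
dists = ['alpha', 'anglit', 'arcsine', 'beta', 'betaprime', 'bradford', 'norm']
for d in dists:
    dist = getattr(scipy.stats, d)
    ps = dist.fit(selected_data)
    errors.loc[d,['D-Value','P-Value']] = kstest(selected_data, d, args=ps)
    errors.loc[d,'Params'] = ps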

After this loop, I select the minimum D-Value to get the best-fitting distribution. Each distribution returns its own set of parameters in ps, and each parameter has its own name (for instance, for 'alpha' it would be alpha, whereas for 'norm' they would be mean and std).

Is there a way to get the names of the estimated parameters in scipy.stats?

Thank you in advance

Adam Erickson · Accepted Answer

Warren Weckesser and I have developed a more robust solution:

import scipy.stats

def list_parameters(distribution):
    """List parameters for scipy.stats.distribution.
    # Arguments
        distribution: a string or scipy.stats distribution object.
    # Returns
        A list of distribution parameter strings.
    """
    if isinstance(distribution, str):
        distribution = getattr(scipy.stats, distribution)
    # shapes is a comma-separated string such as 'a, b', or None
    if distribution.shapes:
        parameters = [name.strip() for name in distribution.shapes.split(',')]
    else:
        parameters = []
    # Note: the _distn_names lists live in private scipy modules
    if distribution.name in scipy.stats._discrete_distns._distn_names:
        parameters += ['loc']
    elif distribution.name in scipy.stats._continuous_distns._distn_names:
        parameters += ['loc', 'scale']
    else:
        # Raise instead of calling sys.exit so callers can handle the error
        raise ValueError("Distribution name not found in discrete or continuous lists.")
    return parameters

The discussion can be found here.
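
To connect this back to the question, here is a minimal sketch of pairing the parameter names with the values returned by fit. It is not part of the solution above: the helper name named_fit is hypothetical, it handles continuous distributions only, and it relies on fit returning the shape parameters first, followed by loc and scale. It uses the public scipy.stats.rv_continuous class for the type check rather than the private name lists.

```python
import numpy as np
import scipy.stats

def named_fit(name, data):
    """Hypothetical helper: fit a continuous distribution and label the estimates."""
    dist = getattr(scipy.stats, name)
    if not isinstance(dist, scipy.stats.rv_continuous):
        raise TypeError(f"{name!r} is not a continuous scipy.stats distribution")
    # shapes is a string such as 'a, b', or None when there are no shape parameters
    shape_names = [s.strip() for s in dist.shapes.split(',')] if dist.shapes else []
    # fit() returns the shape estimates first, then loc and scale
    names = shape_names + ['loc', 'scale']
    return dict(zip(names, dist.fit(data)))

# Synthetic data, used only for illustration
rng = np.random.default_rng(0)
sample = rng.normal(loc=5.0, scale=2.0, size=500)
gamma_sample = rng.gamma(2.0, size=500)

print(named_fit('norm', sample))         # keys: loc, scale
print(named_fit('gamma', gamma_sample))  # keys: a, loc, scale
```

For 'norm', the loc and scale entries are the estimated mean and standard deviation, which are the names the question was asking for.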

Bill Bell · Answer

This code demonstrates the information that ev-br gave in his answer in case anyone else lands here.

>>> from scipy import stats
>>> dists = ['alpha', 'anglit', 'arcsine', 'beta', 'betaprime', 'bradford', 'norm']
>>> for d in dists:
...     dist = getattr(stats, d)
...     dist.name, dist.shapes
... 
('alpha', 'a')
('anglit', None)
('arcsine', None)
('beta', 'a, b')
('betaprime', 'a, b')
('bradford', 'c')
('norm', None)

I would point out that the shapes attribute is None for distributions such as the normal, which are parameterised only by location and scale.
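
As a small illustration of that point, the sketch below guards against the None case and appends loc and scale by hand. It assumes continuous distributions only; the variable names are mine, and numargs is the attribute holding the number of shape parameters.

```python
from scipy import stats

param_names = {}
for name in ['alpha', 'beta', 'norm']:
    dist = getattr(stats, name)
    # shapes is None for distributions with no shape parameters, so guard the split
    shape_names = [s.strip() for s in dist.shapes.split(',')] if dist.shapes else []
    # Continuous distributions always take loc and scale after the shapes
    param_names[name] = shape_names + ['loc', 'scale']
    print(name, dist.numargs, param_names[name])
```

The count in numargs matches the number of names recovered from shapes, so it can serve as a quick consistency check.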

Tags: python, numpy, scipy

user1695639
