
How do you use scipy.stats.rv_continuous?

I have been looking for a good tutorial or examples of how to use rv_continuous and I have not been able to find one.

I read:

http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.rv_continuous.html#scipy.stats.rv_continuous

but it was not really all that helpful (and it lacked any examples of how to use it).

For example, I would like to specify an arbitrary probability distribution, call fit to obtain the fitted pdf, and then call expect to get the desired expected value.

What I understand so far is that to create a custom probability distribution, we define our own class that subclasses rv_continuous. Then, by specifying a custom _pdf or _cdf, we should be able to use every method that rv_continuous provides, such as expect and fit.

However, what is really mysterious to me is this: if we don't tell rv_continuous explicitly which parameters specify the probability distribution, can it really run all those methods correctly? How does it manage with just _pdf or _cdf?

Or did I just misunderstand how it works?

Also, if you can provide a simple example of how it works and how to use expect and/or fit, that would be awesome! A better tutorial or link would be great too.

Thanks in Advance.

Charlie Parker asked Mar 17 '14 05:03


1 Answer

Here's a tutorial: http://docs.scipy.org/doc/scipy/reference/tutorial/stats.html

Basically, rv_continuous is made for subclassing. Use it if you need a distribution which is not defined in scipy.stats (there are more than 70 of them).

Re how it works. In a nutshell, it uses generic code paths: if your subclass defines _pdf and does not define _logpdf, then it inherits

def _logpdf(self, x, *args):
    return log(self._pdf(x, *args))

and a bunch of similar methods (see https://github.com/scipy/scipy/blob/master/scipy/stats/_distn_infrastructure.py for precise details).
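
As a minimal sketch of those generic code paths (the class name Linear and the density f(x) = 2x on [0, 1] are made up for illustration), the subclass below defines only _pdf. The cdf, logpdf and expect methods then come from the inherited defaults, which integrate _pdf numerically where needed:

```python
import numpy as np
from scipy import stats

class Linear(stats.rv_continuous):
    """Hypothetical example: density f(x) = 2x on [0, 1]; only _pdf is defined."""
    def _pdf(self, x):
        return 2.0 * x

# a and b set the support of the distribution.
linear = Linear(a=0.0, b=1.0, name="linear")

cdf_half = linear.cdf(0.5)        # generic path: integrates _pdf from a to x
logpdf_half = linear.logpdf(0.5)  # generic path: log of _pdf
mean = linear.expect()            # E[X], integrates x * pdf(x) over the support

print(cdf_half, logpdf_half, mean)
```

The generic paths are convenient but slow, so it usually pays to define _cdf (and _ppf, if you sample a lot) whenever closed forms are available.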

Re parameters. You probably mean shape parameters, don't you? They are inferred automagically by inspecting the signature of _pdf or _cdf, see https://github.com/scipy/scipy/blob/master/scipy/stats/_distn_infrastructure.py#L617. If you want to bypass the inspection, pass the shapes parameter to the constructor when you create your instance:

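from scipy import stats

class Mydist(stats.rv_continuous):
    def _pdf(self, x, a, b, c, d):
        return 42  # placeholder only, not a valid density
# shapes is given explicitly, so the _pdf signature is not inspected
mydist = Mydist(shapes='a, b, c, d')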

[Strictly speaking, this only applies to scipy 0.13 and above. Earlier versions were using a different mechanism and required the shapes attribute.]
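
To illustrate the shape-parameter inference together with expect and fit, here is a hedged sketch. The PowerDist class, its parameter k, and the synthetic data are assumptions chosen for the example, with density f(x; k) = k x^(k-1) on [0, 1]. Fixing loc and scale via floc and fscale leaves only k to be estimated:

```python
import numpy as np
from scipy import stats

class PowerDist(stats.rv_continuous):
    """Hypothetical example: f(x; k) = k * x**(k - 1) on [0, 1], k > 0."""
    def _pdf(self, x, k):
        return k * x ** (k - 1)

    def _cdf(self, x, k):
        # Supplying _cdf as well avoids numerical integration of _pdf.
        return x ** k

power = PowerDist(a=0.0, b=1.0, name="power")
inferred = power.shapes  # shape names taken from the method signatures

# E[X] for k = 2; analytically this is k / (k + 1).
mean_k2 = power.expect(args=(2.0,))

# Synthetic data with true k = 3, drawn by inverse-transform sampling.
rng = np.random.default_rng(0)
data = rng.uniform(size=500) ** (1.0 / 3.0)

# Maximum-likelihood fit of k with loc and scale held fixed.
k_hat, loc_hat, scale_hat = power.fit(data, floc=0, fscale=1)

print(inferred, mean_k2, k_hat)
```

Without floc and fscale, fit would also estimate loc and scale, which is rarely what you want for a distribution whose support is fixed.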

ev-br answered Nov 15 '22 12:11