
How do I get a lognormal distribution in Python with Mu and Sigma?


I have been trying to get the result of a lognormal distribution using Scipy. I already have the Mu and Sigma, so I don't need to do any other prep work. To be more specific (as far as my limited knowledge of stats allows), I am looking for the cumulative distribution function (cdf under Scipy). The problem is that I can't figure out how to do this with just the mean and standard deviation so that the answer returned is a probability, i.e. something from 0 to 1. I'm also not sure which method of the distribution I should be using to get the answer. I've tried reading the documentation and looking through SO, but the relevant questions didn't seem to provide the answers I was looking for.

Here is a code sample of what I am working with. Thanks.

    from scipy.stats import lognorm

    stddev = 0.859455801705594
    mean = 0.418749176686875
    total = 37

    dist = lognorm.cdf(total, mean, stddev)

UPDATE:

So after a bit of work and a little research, I got a little further. But I still am getting the wrong answer. The new code is below. According to R and Excel, the result should be .7434, but that's clearly not what is happening. Is there a logic flaw I am missing?

    dist = lognorm([1.744], loc=2.0785)
    dist.cdf(25)  # yields=0.96374596, expected=0.7434

UPDATE 2: Working lognorm implementation which yields the correct 0.7434 result.

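    import math

    def lognorm(x, mu=0, sigma=1):
        # CDF of a lognormal whose log has mean mu and standard deviation sigma
        a = (math.log(x) - mu) / math.sqrt(2 * sigma**2)
        p = 0.5 + 0.5 * math.erf(a)
        return p

    lognorm(25, mu=2.0785, sigma=1.744)  # approximately 0.7434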
Eric Lubow asked Jan 15 '12 15:01





2 Answers

I know this is a bit late (almost one year!) but I've been doing some research on the lognorm function in scipy.stats. A lot of folks seem confused about the input parameters, so I hope to help these people out. The example that passes the mean as the location ("loc") parameter is almost correct, but that choice is strange: loc shifts the whole distribution, so the cdf or pdf doesn't 'take off' until x is greater than the mean. Instead, the mean and standard deviation of the log of the variate (μ and σ) should be passed as scale = exp(μ) and shape = σ, respectively.
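To make the point about loc concrete, here is a minimal sketch using the values from the question. The variable name `shifted` is only illustrative. Passing the mean as loc leaves scale at its default of 1 and moves the start of the support to x = mean, so the cdf is zero at that point and reaches one half at loc + 1.

```python
from scipy.stats import lognorm

sigma = 0.859455801705594   # standard deviation from the question
mu = 0.418749176686875      # mean from the question

# Mean passed as loc: the support becomes x > mu, and scale stays at 1
shifted = lognorm(s=sigma, loc=mu)

at_loc = shifted.cdf(mu)              # left edge of the support
at_median = shifted.cdf(mu + 1.0)     # median of the unshifted distribution is scale = 1

print(at_loc, at_median)
```

This is why the frozen distribution built with loc=mean does not describe a lognormal whose log has mean mu.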

Simply put, the arguments are (x, shape, loc, scale), with the parameter definitions below:

loc - No equivalent, this gets subtracted from your data so that 0 becomes the infimum of the range of the data.

scale - exp μ, where μ is the mean of the log of the variate. (When fitting, typically you'd use the sample mean of the log of the data.)

shape - the standard deviation of the log of the variate.
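As a rough check of this mapping, the sketch below compares scipy's lognorm cdf with the normal cdf of the standardized log, using the numbers from the question's second update. It assumes those numbers mean mu = 2.0785 and sigma = 1.744; the names `p_scipy` and `p_normal` are only illustrative.

```python
import math
from scipy.stats import lognorm, norm

mu = 2.0785     # assumed: mean of log(X)
sigma = 1.744   # assumed: standard deviation of log(X)
x = 25

# shape = sigma, loc = 0, scale = exp(mu)
p_scipy = lognorm.cdf(x, sigma, loc=0, scale=math.exp(mu))

# Same probability via the underlying normal distribution
p_normal = norm.cdf((math.log(x) - mu) / sigma)

print(p_scipy, p_normal)
```

Both values come out at about 0.7434, the result the question reports from R and Excel.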

I went through the same frustration as most people with this function, so I'm sharing my solution. Just be careful because the explanations aren't very clear without a compendium of resources.

For more information, I found these sources helpful:

  • http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.lognorm.html#scipy.stats.lognorm
  • https://stats.stackexchange.com/questions/33036/fitting-log-normal-distribution-in-r-vs-scipy

And here is an example, taken from @serv-inc's answer to this question:

    import math
    from scipy import stats

    # standard deviation of normal distribution
    sigma = 0.859455801705594
    # mean of normal distribution
    mu = 0.418749176686875
    # hopefully, total is the value where you need the cdf
    total = 37

    frozen_lognorm = stats.lognorm(s=sigma, scale=math.exp(mu))
    frozen_lognorm.cdf(total)  # use whatever function and value you need here
modulitos answered Oct 05 '22 11:10



It sounds like you want to instantiate a "frozen" distribution from known parameters. In your example, you could do something like:

    from scipy.stats import lognorm

    stddev = 0.859455801705594
    mean = 0.418749176686875
    dist = lognorm([stddev], loc=mean)

which will give you a frozen lognorm distribution object with stddev as its shape parameter and mean as its location. You can then get the pdf or cdf like this:

    import numpy as np
    import pylab as pl

    x = np.linspace(0, 6, 200)
    pl.plot(x, dist.pdf(x))
    pl.plot(x, dist.cdf(x))

[Figure: lognorm cdf and pdf]

Is this what you had in mind?

talonmies answered Oct 05 '22 13:10
