How do I get a lognormal distribution in Python with Mu and Sigma?

I have been trying to get the result of a lognormal distribution using SciPy. I already have the Mu and Sigma, so I don't need to do any other prep work. To be more specific (as far as my limited knowledge of stats allows), I am looking for the cumulative distribution function (cdf in SciPy). The problem is that I can't figure out how to do this with just the mean and standard deviation, on a scale of 0-1 (i.e. the answer returned should be something from 0 to 1). I'm also not sure which method of dist I should be using to get the answer. I've tried reading the documentation and looking through SO, but the relevant questions (like this and this) didn't seem to provide the answers I was looking for.

Here is a code sample of what I am working with. Thanks.

    from scipy.stats import lognorm

    stddev = 0.859455801705594
    mean = 0.418749176686875
    total = 37
    dist = lognorm.cdf(total, mean, stddev)

UPDATE:

So after a bit of work and a little research, I got a little further. But I still am getting the wrong answer. The new code is below. According to R and Excel, the result should be .7434, but that's clearly not what is happening. Is there a logic flaw I am missing?

    dist = lognorm([1.744], loc=2.0785)
    dist.cdf(25)  # yields=0.96374596, expected=0.7434

UPDATE 2: Working lognorm implementation which yields the correct 0.7434 result.

    import math

    def lognorm(x, mu=0, sigma=1):
        # closed-form lognormal CDF; mu and sigma describe ln(x)
        a = (math.log(x) - mu) / math.sqrt(2 * sigma**2)
        p = 0.5 + 0.5 * math.erf(a)
        return p

    lognorm(25, 2.0785, 1.744)  # 0.7434
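
For reference, the function above evaluates the standard closed form of the lognormal CDF, where μ and σ are the mean and standard deviation of ln X (not of X itself). Written out, with the question's numbers plugged in under the assumption that μ = 2.0785 and σ = 1.744:

```latex
F(x;\mu,\sigma)
  = \frac{1}{2} + \frac{1}{2}\,
    \operatorname{erf}\!\left(\frac{\ln x - \mu}{\sqrt{2\sigma^{2}}}\right)
  = \Phi\!\left(\frac{\ln x - \mu}{\sigma}\right),
  \qquad x > 0

% Worked example: x = 25, \mu = 2.0785, \sigma = 1.744
\frac{\ln 25 - 2.0785}{1.744} \approx \frac{3.2189 - 2.0785}{1.744} \approx 0.6539,
\qquad \Phi(0.6539) \approx 0.7434
```

Here Φ is the standard normal CDF, so the result always lies between 0 and 1.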

asked Jan 15 '12 15:01

Eric Lubow

2 Answers

I know this is a bit late (almost one year!) but I've been doing some research on the lognorm function in scipy.stats. A lot of folks seem confused about the input parameters, so I hope to help these people out. The example in the other answer is almost correct, but I found it strange to pass the mean as the location ("loc") parameter: loc shifts the whole distribution, so the cdf or pdf doesn't 'take off' until the value is greater than the mean. Instead, the mean belongs in the scale argument as exp(μ) and the standard deviation in the shape argument as σ, where μ and σ are the mean and standard deviation of the log of the variable.
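
To make the effect of loc concrete, here is a minimal sketch using only the standard library. The helper names `normal_cdf` and `shifted_lognormal_cdf` are made up for illustration and are not part of SciPy; the numbers are the ones from the question's update, assuming 2.0785 is μ and 1.744 is σ.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def shifted_lognormal_cdf(x, sigma, loc=0.0, scale=1.0):
    """Lognormal CDF with a shift (loc) and a scale.

    Hypothetical helper that mirrors the (x, shape, loc, scale) ordering.
    """
    if x <= loc:
        return 0.0  # no probability mass at or below loc
    return normal_cdf(math.log((x - loc) / scale) / sigma)


mu, sigma, x = 2.0785, 1.744, 25.0

# mu passed as loc: the distribution is shifted right by mu
as_loc = shifted_lognormal_cdf(x, sigma, loc=mu)

# mu passed through scale = exp(mu): the usual lognormal(mu, sigma)
as_scale = shifted_lognormal_cdf(x, sigma, scale=math.exp(mu))

print(round(as_loc, 4))    # about 0.9637, the unexpected value in the question
print(round(as_scale, 4))  # about 0.7434, the value R and Excel report
```

The first call reproduces the wrong answer from the question, and any x at or below loc returns exactly 0, which is the "doesn't take off" behaviour described above.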

Simply put, the arguments are (x, shape, loc, scale), with the parameter definitions below:

loc - No equivalent, this gets subtracted from your data so that 0 becomes the infimum of the range of the data.

scale - exp μ, where μ is the mean of the log of the variate. (When fitting, typically you'd use the sample mean of the log of the data.)

shape - the standard deviation of the log of the variate.
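
As a rough sketch of how these definitions map onto an actual call, the snippet below evaluates the cdf both with keywords and positionally, then compares against the closed-form expression. The values of `mu`, `sigma` and `x` are taken from the question's update (assuming 2.0785 is μ and 1.744 is σ); the variable names are mine.

```python
import math

from scipy.stats import lognorm

mu, sigma, x = 2.0785, 1.744, 25.0

# Keyword form: shape s = sigma, loc left at 0, scale = exp(mu)
p_keyword = lognorm.cdf(x, s=sigma, loc=0, scale=math.exp(mu))

# Positional form follows the (x, shape, loc, scale) order
p_positional = lognorm.cdf(x, sigma, 0, math.exp(mu))

# Closed form of the lognormal CDF, for comparison
p_closed = 0.5 + 0.5 * math.erf((math.log(x) - mu) / (sigma * math.sqrt(2)))

print(p_keyword, p_positional, p_closed)  # all three should agree, about 0.7434
```

Using keywords is probably the safer habit, since it is easy to put μ in the loc slot by accident when passing everything positionally.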

I went through the same frustration as most people with this function, so I'm sharing my solution. Just be careful because the explanations aren't very clear without a compendium of resources.

For more information, I found these sources helpful:

http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.lognorm.html#scipy.stats.lognorm
https://stats.stackexchange.com/questions/33036/fitting-log-normal-distribution-in-r-vs-scipy

And here is an example, taken from @serv-inc's answer on this page:

    import math
    from scipy import stats

    # standard deviation of normal distribution
    sigma = 0.859455801705594
    # mean of normal distribution
    mu = 0.418749176686875
    # hopefully, total is the value where you need the cdf
    total = 37

    frozen_lognorm = stats.lognorm(s=sigma, scale=math.exp(mu))
    frozen_lognorm.cdf(total)  # use whatever function and value you need here
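
If you want to convince yourself that the parameters went into the right slots, one possible sanity check is sketched below: the median of a lognormal is exp(μ), and the log of samples drawn from it should have mean close to μ and standard deviation close to σ. The sample size and seed are arbitrary choices of mine, not something from the answer.

```python
import math

import numpy as np
from scipy import stats

sigma = 0.859455801705594  # standard deviation of the log of the variable
mu = 0.418749176686875     # mean of the log of the variable

frozen = stats.lognorm(s=sigma, scale=math.exp(mu))

# Check 1: the median of a lognormal equals exp(mu)
median = frozen.median()
print(median, math.exp(mu))

# Check 2: taking logs of random draws should recover mu and sigma
samples = frozen.rvs(size=100_000, random_state=0)
log_samples = np.log(samples)
print(log_samples.mean(), log_samples.std())
```

If either check is far off, the most likely cause is that μ was passed as loc or as the raw scale rather than as exp(μ).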

answered Oct 05 '22 11:10

modulitos

It sounds like you want to instantiate a "frozen" distribution from known parameters. In your example, you could do something like:

    from scipy.stats import lognorm

    stddev = 0.859455801705594
    mean = 0.418749176686875
    dist = lognorm([stddev], loc=mean)

which will give you a frozen lognorm distribution object built from those parameters (here stddev is the shape and mean is passed as loc). You can then get the pdf or cdf like this:

    import numpy as np
    import matplotlib.pyplot as plt

    x = np.linspace(0, 6, 200)
    plt.plot(x, dist.pdf(x))
    plt.plot(x, dist.cdf(x))
    plt.show()

[Figure: lognorm cdf and pdf]

Is this what you had in mind?

answered Oct 05 '22 13:10

talonmies
