What is the point of norm.fit in scipy?

I'm generating a random sample of data and plotting its pdf, using scipy.stats.norm.fit to estimate the loc and scale parameters.

I wanted to see how different my pdf would look if I just calculated the mean and std using numpy, without any actual fitting. To my surprise, when I plot both pdfs and print both sets of mu and std, the results are exactly the same. So my question is: what is the point of norm.fit if I can just calculate the mean and std of my sample and get the same results?

This is my code:

import numpy as np
from scipy.stats import norm
import matplotlib.pyplot as plt

# Draw 200 samples from a standard normal distribution
data = norm.rvs(loc=0, scale=1, size=200)

# Estimate the parameters directly with numpy
mu1 = np.mean(data)
std1 = np.std(data)
print(mu1)
print(std1)

# Estimate the parameters with scipy's fit
mu, std = norm.fit(data)

# Histogram of the sample, normalized to a density
plt.hist(data, bins=25, density=True, alpha=0.6, color='g')

# Overlay both pdfs across the histogram's x-range
xmin, xmax = plt.xlim()
x = np.linspace(xmin, xmax, 100)
p = norm.pdf(x, mu, std)
q = norm.pdf(x, mu1, std1)
plt.plot(x, p, 'k', linewidth=2)
plt.plot(x, q, 'r', linewidth=1)
title = "Fit results: mu = %.5f,  std = %.5f" % (mu, std)
plt.title(title)

plt.show()

And these are the results I got:

[Figure: pdf of a random set of values]

mu1 = 0.034824979915482716

std1 = 0.9945453455908072

José Manuel Valladares asked Mar 26 '20 06:03



1 Answer

The point is that there are several other distributions out there besides the normal distribution. Scipy provides a consistent API for learning the parameters of these distributions from data. (Want an exponential distribution instead of a normal distribution? It’s scipy.stats.expon.fit.)
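As a rough illustration of that consistent interface, the sketch below calls the same .fit method on three different distribution objects. The sample, the seed, and the choice of laplace as a third distribution are assumptions made for the example and are not from the question.

```python
import numpy as np
from scipy.stats import norm, expon, laplace

# Illustrative data (assumption): 500 draws from an exponential distribution.
data = expon.rvs(scale=2.0, size=500, random_state=0)

# The same call works on every distribution object; each of these three
# returns a (loc, scale) pair, but what loc and scale mean differs.
fits = {}
for dist in (norm, expon, laplace):
    loc, scale = dist.fit(data)
    fits[dist.name] = (loc, scale)
    print(f"{dist.name:8s} loc={loc:.3f} scale={scale:.3f}")
```

Only the norm row corresponds to the sample mean and standard deviation; for expon, the fitted loc is the sample minimum instead.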

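So sure, your way also works, because the parameters of the normal distribution happen to be the mean and standard deviation, and the maximum-likelihood estimates that norm.fit returns are exactly the sample mean and the ddof=0 standard deviation that np.mean and np.std compute by default. But this is about providing a consistent interface across distributions, including ones where that’s not true.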
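A minimal sketch of that contrast, assuming a gamma distribution as the example where it is not true: for normal data, norm.fit matches np.mean and np.std, while for gamma data the fitted shape and scale are not the sample mean and standard deviation. The seed, sample sizes, and floc=0 (which pins the location at zero) are choices made for this illustration.

```python
import numpy as np
from scipy.stats import norm, gamma

rng = np.random.default_rng(42)  # seed chosen arbitrarily for the example

# Normal case: the fitted parameters coincide with np.mean and np.std (ddof=0).
normal_data = rng.normal(loc=0.0, scale=1.0, size=200)
mu, sigma = norm.fit(normal_data)
print(np.isclose(mu, normal_data.mean()), np.isclose(sigma, normal_data.std()))

# Gamma case: fit returns (shape, loc, scale); floc=0 keeps loc fixed at 0.
gamma_data = rng.gamma(shape=2.0, scale=3.0, size=2000)
a, loc, scale = gamma.fit(gamma_data, floc=0)
print("fitted shape and scale:", a, scale)
print("sample mean and std:   ", gamma_data.mean(), gamma_data.std())
```

Here the shape and scale have to be estimated by fitting; reading them off the sample mean and standard deviation directly would give the wrong numbers.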

Arya McCarthy answered Sep 18 '22 15:09