
What is the meaning of mu, loc and size in the scipy.stats.poisson?

The Poisson doc page explains the function, but if you are not already familiar with the terminology you can't tell what the parameters mean. For example, I want to know where to put the mean, where to put the standard deviation, and where to put the sample size. It says that mu is a shape parameter, which doesn't help me.

In this example:

import numpy as np
from scipy import stats

np.random.seed(6)

population_ages1 = stats.poisson.rvs(loc=18, mu=35, size=150000)
population_ages2 = stats.poisson.rvs(loc=18, mu=10, size=100000)
population_ages = np.concatenate((population_ages1, population_ages2))

minnesota_ages1 = stats.poisson.rvs(loc=18, mu=30, size=30)
minnesota_ages2 = stats.poisson.rvs(loc=18, mu=10, size=20)
minnesota_ages = np.concatenate((minnesota_ages1, minnesota_ages2))

print(population_ages.mean())
print(minnesota_ages.mean())

Output:

43 39

What do loc, mu and size stand for?

Uhxw asked Jan 23 '18 19:01


2 Answers

These are documented well enough in the standard references (location parameter, mu, and the page you cited), although "well enough" assumes you are familiar enough with the field's vocabulary to work your way through the technical docs.

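  • loc is the location parameter: the reference point that shifts the whole distribution along the x-axis. For this application it is simply the left end of the desired distribution (a scalar), so with loc=18 no sample falls below 18. It defaults to 0 and only needs changing if your values start at something other than 0.
  • mu is the mean of the unshifted Poisson distribution. Once loc is applied, the mean of the samples is loc + mu (18 + 35 = 53 for population_ages1).
  • size is the sample size: the number of random values that rvs draws.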

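The Poisson distribution has only the one shape parameter: mu. The mean, variance, and event rate are locked together: before the shift both the mean and the variance equal mu, so the standard deviation is always sqrt(mu) and there is no separate argument for it.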
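A minimal sketch that checks these relationships numerically, assuming SciPy and NumPy are installed. The sample size of 100,000 and the random_state value are arbitrary choices for illustration, not anything taken from the question.

```python
import numpy as np
from scipy import stats

loc, mu, size = 18, 35, 100_000  # illustrative values; size is an arbitrary choice

# Theoretical mean and variance of a Poisson(mu) shifted right by loc.
mean, var = stats.poisson.stats(mu, loc=loc, moments="mv")
mean, var = float(mean), float(var)
print(mean, var)  # mean is loc + mu, variance is mu (loc does not change it)

# Draw `size` samples; random_state is only here to make the run repeatable.
sample = stats.poisson.rvs(mu, loc=loc, size=size, random_state=6)
print(sample.shape, sample.min(), sample.mean(), sample.var())

# Expected mean of the question's mixed population: a weighted average
# of the two shifted means, (loc + mu) for each group.
expected_population_mean = (150_000 * (18 + 35) + 100_000 * (18 + 10)) / 250_000
print(expected_population_mean)  # 43.0
```

The last line suggests why the question's population mean prints as about 43: neither group has mean 43, but the 150000/100000 mix of 53 and 28 averages out to it.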

Prune answered Sep 29 '22 01:09


Uhxw is asking what these arguments mean in simple terms. Prune's answer could be simplified.

The loc is like the lowest x value of your distribution, and the mu is like how far above loc the middle of your distribution sits (so the middle is at loc + mu). Look at https://www.datacamp.com/community/tutorials/probability-distributions-python
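As a rough illustration of loc being the lowest x value, the sketch below compares the shifted probability mass function with the unshifted one. The values mu=35 and loc=18 come from the question; the range of k is an arbitrary assumption chosen to cover nearly all of the probability.

```python
import numpy as np
from scipy import stats

mu, loc = 35, 18
k = np.arange(0, 120)  # arbitrary range, wide enough to hold almost all the mass

# Passing loc slides the whole distribution to the right by loc.
shifted = stats.poisson.pmf(k, mu, loc=loc)
unshifted = stats.poisson.pmf(k - loc, mu)
print(np.allclose(shifted, unshifted))

# Nothing can fall below loc: the probability just left of it is zero.
below_loc = stats.poisson.pmf(loc - 1, mu, loc=loc)
print(below_loc)  # 0.0

# The most likely value sits near loc + mu, the "middle" of the distribution.
most_likely = int(k[np.argmax(shifted)])
print(most_likely)
```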

The same loc and size conventions apply to other scipy.stats distributions. The uniform function, for example, generates a uniform continuous variable over the interval specified by its loc and scale arguments: the distribution is constant between loc and loc + scale. The size argument describes the number of random variates. If you want to maintain reproducibility, include a random_state argument assigned to a number.
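A short sketch of the uniform case, assuming SciPy and NumPy are available. The numbers (loc=18, scale=10, size=1000, random_state=6) are made up for illustration.

```python
import numpy as np
from scipy import stats

loc, scale = 18, 10  # hypothetical interval: values land between 18 and 28

# Same random_state twice, so both calls should return identical samples.
a = stats.uniform.rvs(loc=loc, scale=scale, size=1000, random_state=6)
b = stats.uniform.rvs(loc=loc, scale=scale, size=1000, random_state=6)

print(a.shape)                                  # one value per requested variate
print(a.min() >= loc, a.max() <= loc + scale)   # all samples inside [loc, loc + scale]
print(np.array_equal(a, b))                     # reproducible with a fixed random_state
```

Note that poisson takes mu instead of scale; loc, size, and random_state work the same way for both.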

normandantzig answered Sep 29 '22 01:09