Populate Pandas Dataframe with normal distribution

I would like to populate a DataFrame with numbers that follow a normal distribution. Currently I'm populating it with random integers, but the resulting distribution is flat (uniform). Column a should have a mean of 5 and a standard deviation of 1, and column b should have a mean of 15 and a standard deviation of 1.

import pandas as pd
import numpy as np

n = 10
df = pd.DataFrame(dict(
  a=np.random.randint(1,10,size=n),
  b=np.random.randint(100,110,size=n)
))
nonegiven72 asked Feb 17 '26 07:02



2 Answers

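Try this. np.random.randint draws integers from a uniform distribution, not a normal one, which is why your data looks flat. np.random.normal draws from a normal distribution and takes the mean and standard deviation as its first two arguments. Also, I'm not sure where the 100 and 110 bounds for b came from, since you want a mean of 15.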

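import numpy as np
import pandas as pd

n = 10
a_bar, a_sd = 5, 1    # mean and standard deviation for column a
b_bar, b_sd = 15, 1   # mean and standard deviation for column b
# np.random.normal(loc, scale, size): loc is the mean, scale is the sd
df = pd.DataFrame(dict(a=np.random.normal(a_bar, a_sd, size=n),
                       b=np.random.normal(b_bar, b_sd, size=n)),
                  columns=['a', 'b'])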
Pete answered Feb 19 '26 19:02
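
As a side note to the answer above, here is a minimal sketch of the same idea using NumPy's newer Generator interface (np.random.default_rng), which makes the result reproducible through a seed. The helper name make_normal_frame and its params argument are made up for illustration and are not part of pandas or NumPy.

```python
import numpy as np
import pandas as pd


def make_normal_frame(n, params, seed=None):
    """Return a DataFrame with one normally distributed column per entry.

    params maps a column name to a (mean, sd) tuple.
    This helper is hypothetical, written only for this example.
    """
    rng = np.random.default_rng(seed)  # seeded generator for reproducibility
    return pd.DataFrame({
        name: rng.normal(loc=mean, scale=sd, size=n)
        for name, (mean, sd) in params.items()
    })


# Column a: mean 5, sd 1; column b: mean 15, sd 1 (values from the question).
df = make_normal_frame(10_000, {"a": (5, 1), "b": (15, 1)}, seed=0)

# The sample mean and standard deviation should land close to the targets.
print(df.describe().loc[["mean", "std"]])
```

With only n = 10 rows the sample mean and sd can drift noticeably from 5/1 and 15/1, so a larger n is used here to make the check meaningful.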



This should work:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

n = 200
df = pd.DataFrame(dict(
  # np.random.normal takes (mean, sd), not (low, high) like randint
  a=np.random.normal(5, 1, size=n),
  b=np.random.normal(15, 1, size=n)
))

plt.style.use("ggplot")
fig, ax = plt.subplots()
ax.plot(df["a"])
ax.plot(df["b"], color="b")
plt.show()
plt.clf()

Generated Plot

Josua answered Feb 19 '26 21:02
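
A line plot like the one above shows the samples in draw order, which makes the shape of the distribution hard to judge. One rough way to check that a column is bell-shaped rather than flat is to count how many values fall near the mean. The sketch below assumes the mean and sd from the question (5 and 1) and an arbitrary seed and bin layout chosen for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)  # arbitrary seed, only for reproducibility
n = 10_000
df = pd.DataFrame({
    "a": rng.normal(5, 1, size=n),
    "b": rng.normal(15, 1, size=n),
})

# For a normal distribution, roughly 68% of values lie within one sd of the mean.
within_one_sd = ((df["a"] - 5).abs() <= 1).mean()
print(f"share of a within one sd: {within_one_sd:.2f}")

# Bin counts over [1, 9) in unit-wide bins: the middle bins should be much
# fuller than the outer ones, unlike the flat counts randint would give.
counts, edges = np.histogram(df["a"], bins=8, range=(1, 9))
print(dict(zip(edges[:-1].astype(int), counts)))
```

To see the same thing graphically, df["a"].plot.hist() draws a histogram instead of a line plot.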



