
Numpy & Pandas: Return histogram values from pandas histogram plot?

I know that I can plot a histogram with pandas:

import numpy as np
import pandas as pd
df4 = pd.DataFrame({'a': np.random.randn(1000) + 1})
df4['a'].hist()


But how can I retrieve the histogram count from such a plot?

I know I can compute them with np.histogram (from "Histogram values of a Pandas Series"):

count, division = np.histogram(df4['a'])

But computing the counts this way after already calling df.hist() feels very redundant. Is it possible to get the frequency values directly from pandas?

cqcn1991 asked Jul 19 '16 06:07


1 Answer

The quick answer is:

pd.cut(df4['a'], 10).value_counts().sort_index()

From the pandas hist documentation:

bins: integer, default 10
Number of histogram bins to be used
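As a rough illustration of what that default means, the sketch below reads the bar heights back from the Axes object returned by hist() and compares them with np.histogram using the same bin count. This is not part of the original answer: reading ax.patches is a matplotlib-level workaround, and the seeded generator, the Agg backend, and the variable names are assumptions made so the example is reproducible.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no window is needed

import numpy as np
import pandas as pd

# Seeded data so the run is reproducible (the question uses np.random.randn)
rng = np.random.default_rng(0)
df4 = pd.DataFrame({'a': rng.standard_normal(1000) + 1})

# Series.hist returns the matplotlib Axes it drew on; each bar is a Rectangle patch
ax = df4['a'].hist(bins=10)
bar_heights = np.array([p.get_height() for p in ax.patches])

# np.histogram with the same number of bins, for comparison
np_counts, _ = np.histogram(df4['a'], bins=10)

print(bar_heights)
print(np_counts)
```

This only works if nothing else has been drawn on the same axes, since ax.patches collects every patch on the plot.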

So look at pd.cut(df4['a'], 10).value_counts()

You will see that the counts are the same as those returned by np.histogram, which also defaults to 10 bins.
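A minimal sketch of that comparison, assuming a seeded generator in place of np.random.randn so the run is reproducible (the names cut_counts, np_counts, and same are illustrative only):

```python
import numpy as np
import pandas as pd

# Seeded data so the run is reproducible (the question uses np.random.randn)
rng = np.random.default_rng(0)
df4 = pd.DataFrame({'a': rng.standard_normal(1000) + 1})

# Bin into 10 equal-width intervals and count the members of each interval;
# sort_index() restores bin order, since value_counts() sorts by count
cut_counts = pd.cut(df4['a'], 10).value_counts().sort_index()

# Reference counts from NumPy with the same number of bins
np_counts, edges = np.histogram(df4['a'], bins=10)

print(cut_counts)
print(np_counts)

# True when both approaches produce identical per-bin counts
same = np.array_equal(cut_counts.to_numpy(), np_counts)
print(same)
```

One caveat: pd.cut uses right-closed intervals by default, while np.histogram uses left-closed bins (except the last), so a value lying exactly on an interior bin edge could be counted in a different bin. For continuous random data like this, that is very unlikely to happen.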

piRSquared answered Oct 09 '22 01:10