There is a method to plot Series histograms, but is there a function to retrieve the histogram counts to do further calculations on top of it? I keep using numpy's functions to do this and converting the result to a DataFrame or Series when I need this. It would be nice to stay with pandas objects the whole time.


Are there functions to retrieve the histogram counts of a Series in pandas?

2 Answers

If your Series is discrete, you can use value_counts:

    In [11]: s = pd.Series([1, 1, 2, 1, 2, 2, 3])

    In [12]: s.value_counts()
    Out[12]:
    2    3
    1    3
    3    1
    dtype: int64

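You can see that, for discrete data, s.hist() shows essentially the same counts as s.value_counts().plot(), although value_counts() orders the result by frequency rather than by value.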
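Since the question is about doing further calculations on the counts, here is a minimal sketch of that for the discrete case. The variable names (counts, fractions) are only illustrative; the sort_index() call is there because value_counts() orders by frequency, while a histogram's x-axis is ordered by value.

```python
import pandas as pd

s = pd.Series([1, 1, 2, 1, 2, 2, 3])

# value_counts() orders by frequency; sort_index() restores the value order
# that a histogram's x-axis would use.
counts = s.value_counts().sort_index()

# The result is an ordinary Series, so further calculations stay in pandas,
# for example turning the counts into relative frequencies.
fractions = counts / counts.sum()
```

Both counts and fractions are indexed by the distinct values of s, so they can be combined with other Series by plain index alignment.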

If the Series holds floats, an awfully hacky solution is to use groupby, here with bins of width 0.5 (this assumes numpy is imported as np):

    s.groupby(lambda i: np.floor(2*s[i]) / 2).count()
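The same groupby idea can probably be written without the per-element lambda, by grouping on a Series of bin labels instead. The sketch below assumes 0.5-wide bins as above; the sample data and the names left_edges, edges and full are made up for illustration. Note that groupby only reports bins that actually contain values, so the last step reindexes to fill in the empty ones.

```python
import numpy as np
import pandas as pd

s = pd.Series([0.1, 0.4, 0.6, 0.7, 1.2, 1.3, 1.4, 2.9])

# Label each value with the left edge of its 0.5-wide bin, then count the
# members of each group. This is the vectorized form of the lambda version.
left_edges = np.floor(2 * s) / 2
counts = s.groupby(left_edges).count()

# groupby drops empty bins; reindex against every left edge to restore them
# with a count of zero.
edges = np.arange(0.0, 3.0, 0.5)  # 0.0, 0.5, ..., 2.5
full = counts.reindex(edges, fill_value=0)
```

Here counts has entries only for the bins starting at 0.0, 0.5, 1.0 and 2.5, while full also lists the empty bins at 1.5 and 2.0.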

141

answered Oct 09 '22 22:10

Andy Hayden

Since hist and value_counts don't use the Series' index, you may as well treat the Series like an ordinary array and use np.histogram directly. Then build a Series from the result.

    In [4]: s = Series(randn(100))

    In [5]: counts, bins = np.histogram(s)

    In [6]: Series(counts, index=bins[:-1])
    Out[6]:
    -2.968575     1
    -2.355032     4
    -1.741488     5
    -1.127944    26
    -0.514401    23
     0.099143    23
     0.712686    12
     1.326230     5
     1.939773     0
     2.553317     1
    dtype: int32

This is a really convenient way to organize the result of a histogram for subsequent computation.

To index by the center of each bin instead of the left edge, you could use bins[:-1] + np.diff(bins)/2.
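For reuse, the np.histogram approach and the bin-center variant can be wrapped in a small function. This is only a sketch: hist_series and its centers flag are hypothetical names, not part of pandas, and dropping NaN values first is an added assumption, since np.histogram cannot infer a bin range from data containing NaN.

```python
import numpy as np
import pandas as pd


def hist_series(s, bins=10, centers=False):
    """Return histogram counts of `s` as a Series indexed by bin position.

    Hypothetical helper, not a pandas function. The index holds the left
    edge of each bin, or the bin center when `centers` is True.
    """
    # Drop missing values first (an assumption of this sketch).
    counts, edges = np.histogram(s.dropna(), bins=bins)
    if centers:
        index = edges[:-1] + np.diff(edges) / 2
    else:
        index = edges[:-1]
    return pd.Series(counts, index=index)


# Usage: 100 normally distributed values, counts indexed by bin center.
rng = np.random.default_rng(0)
s = pd.Series(rng.standard_normal(100))
h = hist_series(s, bins=10, centers=True)
```

The returned Series supports the usual follow-up operations, such as h / h.sum() for relative frequencies or h.cumsum() for a cumulative histogram.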

answered Oct 09 '22 23:10

Dan Allan


Rafael S. Calsaverini
