
Plotting histograms against classes in pandas / matplotlib

Is there an idiomatic way to plot the histogram of a feature for two classes? In pandas, I basically want

    df.feature[df["class"] == 0].hist()
    df.feature[df["class"] == 1].hist()

to be in the same plot. I could do

    df.feature.hist(by=df["class"])

but that gives me two separate plots.

This seems to be a common task, so I would expect there to be an idiomatic way to do it. Of course, I could manipulate the histograms manually to fit next to each other, but pandas usually handles that quite nicely.

Basically I want this matplotlib example in one line of pandas: http://matplotlib.org/examples/pylab_examples/barchart_demo.html

I thought I was missing something, but maybe it is not possible (yet).

Asked by Andreas Mueller, Feb 04 '14 09:02



1 Answer

How about df.groupby("class").feature.hist()? To see overlapping distributions you'll probably need to pass alpha=0.4 to hist(). Alternatively, I'd be tempted to use a kernel density estimate instead of a histogram with df.groupby("class").feature.plot(kind='kde').
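A minimal sketch of the overlapping-histogram idea, assuming a DataFrame with a numeric `feature` column and a `class` column. The synthetic data, the explicit loop over groups, the shared bin edges, and the output filename are my own additions for illustration, not part of the suggestion above. Looping lets each class get its own legend label, and shared bins keep the bars comparable:

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in data (an assumption): two classes with shifted means.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "feature": np.concatenate([rng.normal(0, 1, 200), rng.normal(2, 1, 200)]),
    "class": np.repeat([0, 1], 200),
})

fig, ax = plt.subplots()

# Use the same 20 bins for every class so the bars line up.
bins = np.linspace(df["feature"].min(), df["feature"].max(), 21)

# Draw one semi-transparent histogram per class on the same Axes.
for label, group in df.groupby("class"):
    group["feature"].hist(ax=ax, bins=bins, alpha=0.4, label=f"class {label}")

ax.set_xlabel("feature")
ax.legend()
fig.savefig("hist_by_class.png")  # hypothetical output filename
```

The one-liner `df.groupby("class").feature.hist(alpha=0.4)` draws onto the current Axes in the same way; the loop is only there to control bins and labels.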

As an example, I plotted the iris dataset's classes using:

    # axs is assumed to be a pair of matplotlib Axes, e.g. from plt.subplots(1, 2)
    iris.groupby("Name").PetalWidth.plot(kind='kde', ax=axs[1])
    iris.groupby("Name").PetalWidth.hist(alpha=0.4, ax=axs[0])
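If you want the bars next to each other rather than overlapping, closer to the bar chart demo linked in the question, one option is to drop down to matplotlib and pass a list of arrays to `Axes.hist`, which places the bars for each dataset side by side within every bin. This is a sketch under the same assumptions as before (a `feature` and a `class` column, with synthetic data and a made-up filename), not a pandas one-liner:

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in data (an assumption): two classes with shifted means.
rng = np.random.default_rng(1)
df = pd.DataFrame({
    "feature": np.concatenate([rng.normal(0, 1, 200), rng.normal(2, 1, 200)]),
    "class": np.repeat([0, 1], 200),
})

# Collect one array of feature values per class, in sorted class order.
groups = {label: g["feature"].to_numpy() for label, g in df.groupby("class")}

fig, ax = plt.subplots()

# A list of arrays makes matplotlib draw grouped bars in each bin.
ax.hist(
    list(groups.values()),
    bins=10,
    label=[f"class {label}" for label in groups],
)

ax.set_xlabel("feature")
ax.legend()
fig.savefig("hist_side_by_side.png")  # hypothetical output filename
```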


Answered by jmz, Sep 29 '22 19:09