
Plotting a dataframe as both a 'hist' and 'kde' on the same plot

I have a pandas dataframe with user information. I would like to plot the age of users as both kind='kde' and kind='hist' on the same plot. At the moment I can only produce the two as separate plots. The dataframe resembles:

member_df=    
user_id    Age
1          23
2          34
3          63 
4          18
5          53  
...

using

import matplotlib.pyplot as plt

ax1 = plt.subplot2grid((2,3), (0,0))
member_df.Age.plot(kind='kde', xlim=[16, 100])
ax1.set_xlabel('Age')

ax2 = plt.subplot2grid((2,3), (0,1))
member_df.Age.plot(kind='hist', bins=40)
ax2.set_xlabel('Age')

ax3 = ...

I understand that kind='hist' will give me frequencies for the y-axis whereas kind='kde' will give a density estimate, but is there a way to combine both and have the y-axis be represented by the frequencies?

Lukasz asked Oct 11 '16 21:10




1 Answer

pd.DataFrame.plot() (and likewise Series.plot()) returns the matplotlib Axes it draws on. You can pass that Axes to later plot calls through the ax argument so they draw on the same plot.

Try:

ax = member_df.Age.plot(kind='kde')
member_df.Age.plot(kind='hist', bins=40, ax=ax)
ax.set_xlabel('Age')
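
One caveat with sharing a single y-axis: the histogram shows counts while the KDE shows a density, so the KDE curve can end up nearly flat next to the bars. A minimal sketch of one way around this, assuming it is acceptable to show density rather than counts, is to normalize the histogram with density=True (forwarded to matplotlib's hist). The member_df below is synthetic stand-in data, not the asker's, and kind='kde' needs scipy installed.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; drop for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for member_df: 500 synthetic ages from 18 to 89.
rng = np.random.default_rng(0)
member_df = pd.DataFrame({"Age": rng.integers(18, 90, size=500)})

# density=True rescales the bars so that their total area is 1.
ax = member_df.Age.plot(kind="hist", bins=40, density=True, alpha=0.5)

# The KDE is already a density, so it can share the same y-axis.
member_df.Age.plot(kind="kde", ax=ax, xlim=(16, 100))

ax.set_xlabel("Age")
ax.set_ylabel("Density")
plt.savefig("age_hist_kde_density.png")
```

Both layers are then on the same scale, at the cost of the y-axis no longer reading as raw frequencies.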

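Example:
I plot the histogram first so that it sits in the background.
I also put the KDE on a secondary y-axis (secondary_y=True), because the histogram counts and the KDE density are on very different scales.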

import pandas as pd
import numpy as np


np.random.seed([3,1415])
df = pd.DataFrame(np.random.randn(100, 2), columns=list('ab'))

ax = df.a.plot(kind='hist')
df.a.plot(kind='kde', ax=ax, secondary_y=True)
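
If the y-axis should read as frequencies for both layers, as the question asks, another option is to skip the secondary axis and rescale the KDE into count units instead. The sketch below is one possible approach, not part of the original answer: it evaluates scipy.stats.gaussian_kde directly and multiplies the density by the number of samples times the bin width, which approximates the expected count per bin. The bin count of 10 and the 200-point grid are arbitrary choices for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; drop for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from scipy.stats import gaussian_kde

# Synthetic data shaped like the example above.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((100, 2)), columns=list("ab"))

fig, ax = plt.subplots()

# Plain count histogram; keep the bin edges to recover the bin width.
counts, edges, _ = ax.hist(df.a, bins=10, alpha=0.5)
bin_width = edges[1] - edges[0]

# Evaluate the KDE on a grid, then convert density to approximate counts:
# density * number of samples * bin width.
kde = gaussian_kde(df.a.to_numpy())
xs = np.linspace(edges[0], edges[-1], 200)
scaled = kde(xs) * len(df) * bin_width

ax.plot(xs, scaled)
ax.set_xlabel("a")
ax.set_ylabel("Frequency")
fig.savefig("hist_kde_counts.png")
```

The curve then tracks the bar heights on a single frequency axis. Note that the scaling depends on the bin width, so it has to be recomputed if the number of bins changes.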



Response to comment:
With subplot2grid, just reuse ax1 by passing it to both plot calls.

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

ax1 = plt.subplot2grid((2,3), (0,0))

np.random.seed([3,1415])
df = pd.DataFrame(np.random.randn(100, 2), columns=list('ab'))

df.a.plot(kind='hist', ax=ax1)
df.a.plot(kind='kde', ax=ax1, secondary_y=True)


piRSquared answered Nov 03 '22 17:11