I have a dataset with the following three variables:
- df['score']: a float dummy (1 or 0)
- df['Province']: an object column where each row is a region
- df['Product type']: an object column indicating the industry.
I would like to create a jointplot with the industries on the x axis, the provinces on the y axis, and the relative frequency of the score as the colour. Something like this: https://seaborn.pydata.org/examples/hexbin_marginals.html
So far, I have only managed the following:
mean = df.groupby(['Province', 'Product type'])['score'].mean()
But I am not sure how to plot it.
Thanks!
If you are looking for a heatmap, you could use seaborn's heatmap function. However, you need to pivot your table first.
Here is a small example:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
score = [1, 1, 1, 0, 1, 0, 0, 0]
provinces = ['Place1', 'Place2', 'Place2', 'Place3', 'Place1', 'Place2', 'Place3', 'Place1']
products = ['Product1', 'Product3', 'Product2', 'Product2', 'Product1', 'Product2', 'Product1', 'Product1']
df = pd.DataFrame({'Province': provinces,
                   'Product type': products,
                   'score': score})
My df looks like:
  Province Product type  score
0   Place1     Product1      1
1   Place2     Product3      1
2   Place2     Product2      1
3   Place3     Product2      0
4   Place1     Product1      1
5   Place2     Product2      0
6   Place3     Product1      0
7   Place1     Product1      0
Then:
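# Mean of the 0/1 score per (Province, Product type) cell; the string 'mean' avoids the FutureWarning that recent pandas versions raise for aggfunc=np.mean
df_heatmap = df.pivot_table(values='score', index='Province', columns='Product type', aggfunc='mean')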
sns.heatmap(df_heatmap,annot=True)
plt.show()
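The result is a heatmap with the provinces on the y axis and the product types on the x axis, where each cell is annotated with the mean score (for this example: 0.67 for Place1/Product1, 0.5 for Place2/Product2, 1 for Place2/Product3, and 0 for both Place3 cells). Combinations that never occur in the data are left blank.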
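If you would rather start from the groupby in the question, a minimal sketch like the one below should get you to the same table. It assumes the example df from above; the name `table` is only illustrative. Because the score is 0 or 1, the mean of each group is the relative frequency of score == 1.

```python
import pandas as pd

# Same example data as above
score = [1, 1, 1, 0, 1, 0, 0, 0]
provinces = ['Place1', 'Place2', 'Place2', 'Place3', 'Place1', 'Place2', 'Place3', 'Place1']
products = ['Product1', 'Product3', 'Product2', 'Product2', 'Product1', 'Product2', 'Product1', 'Product1']
df = pd.DataFrame({'Province': provinces,
                   'Product type': products,
                   'score': score})

# The groupby from the question: a Series indexed by (Province, Product type)
mean = df.groupby(['Province', 'Product type'])['score'].mean()

# Move 'Product type' from the row index to the columns, giving a
# Province x Product type table; missing combinations become NaN
table = mean.unstack('Product type')
print(table)
```

You can then pass `table` to sns.heatmap(table, annot=True) in the same way as df_heatmap.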