Apply seaborn heatmap columnwise on pandas dataframe

I was trying to use a heatmap from seaborn on a pivoted pandas DataFrame, like the one in the hyperlink, which works:

import numpy as np
import pandas as pd
import seaborn as sns

df = pd.DataFrame(np.random.randint(1, 100, size=(3, 2)))
df.columns = ['A', 'B']
df
sns.heatmap(df, annot=True, fmt="d", linewidths=.5, cmap="RdYlGn")

Output of the code block: the entire DataFrame is formatted as a single heatmap. The output picks 45 as the min and 86 as the max and color-codes the entire DataFrame.

But what I was unable to do was apply the heatmap column-wise, i.e. like conditional formatting applied column by column instead of to the whole DataFrame, as in the example in this hyperlink:

Output required/expected

For col1, the min of 45 and the max of 88 are picked and formatted; for col2, 70 and 86 are picked respectively. The result is conditionally formatted column-wise but still displayed as a table. In the examples I saw, either the rest of the df was set to zeroes and only one column was formatted, or the whole DataFrame got the formatting.

Can anyone help with this, please?

Pystache asked May 17 '17 06:05
2 Answers

You can also scale each column to a min of zero and max of 1, pass that to the heatmap, and annotate with the original values.

scaled_df = (df - df.min(axis=0))/(df.max(axis=0) - df.min(axis=0))
sns.heatmap(scaled_df, annot=df, fmt="d", linewidths=.5, cmap="RdYlGn")

Note that you will likely want to remove the colorbar with cbar=False since the solution necessarily requires different scales for each column.
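For anyone who wants the pieces in one place, here is a minimal sketch of the scaling approach above with the colorbar turned off. The helper names `scale_columns` and `plot_columnwise_heatmap`, the sample data, and the handling of constant columns (mapping them to 0 instead of NaN) are my own assumptions for illustration, not part of seaborn.

```python
import pandas as pd


def scale_columns(df):
    """Min-max scale each column to [0, 1] independently of the others."""
    col_min = df.min(axis=0)
    col_range = df.max(axis=0) - col_min
    # Assumption: a constant column (range 0) is mapped to 0.0 rather than NaN.
    col_range = col_range.replace(0, 1)
    return (df - col_min) / col_range


def plot_columnwise_heatmap(df, cmap="RdYlGn", fmt="d"):
    """Color each column on its own scale, annotated with the original values."""
    import seaborn as sns  # imported here so scale_columns works without seaborn

    # cbar=False because a single colorbar cannot describe per-column scales.
    return sns.heatmap(scale_columns(df), annot=df, fmt=fmt,
                       linewidths=.5, cmap=cmap, cbar=False)


# Illustrative data only; column C is constant on purpose.
example = pd.DataFrame({"A": [45, 60, 88], "B": [70, 86, 75], "C": [5, 5, 5]})
scaled = scale_columns(example)
print(scaled)
```

Calling `plot_columnwise_heatmap(example)` should then draw the heatmap; use `fmt=".1f"` or similar if the DataFrame holds floats.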

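Alternatively, sklearn.preprocessing.minmax_scale can be used instead of scaling manually.

from sklearn.preprocessing import minmax_scale

# minmax_scale returns a NumPy array, so wrap it to keep the row and column labels
scaled_df = pd.DataFrame(minmax_scale(df), index=df.index, columns=df.columns)
sns.heatmap(scaled_df, annot=df, fmt="d", linewidths=.5, cmap="RdYlGn")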
elz answered Nov 09 '22 15:11
Thanks @Implus3H, that example helped. Here is a modified version of the code as a function that does column-wise conditional formatting, in case it is useful to anyone else.

df is the input DataFrame; by default, the function below color-codes each of its columns in shades of red.

import matplotlib.pyplot as plt
import numpy as np

def columnwise_conditionalformat(df, color='Reds'):
    nrows, ncols = df.shape
    values = df.to_numpy()
    fig, ax = plt.subplots()
    for i in range(ncols):
        # Mask every column except column i, so the colors are scaled to that column alone
        mask = np.ones((nrows, ncols), dtype=bool)
        mask[:, i] = False
        ax.pcolormesh(np.ma.masked_where(mask, values), cmap=color)

    # Annotate each cell with its original value (.ix was removed from pandas, so index the array)
    for y in range(nrows):
        for x in range(ncols):
            ax.text(x + .5, y + .5, '%.1f' % values[y, x],
                    horizontalalignment='center',
                    verticalalignment='center')
    plt.show()
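To show why masking gives per-column colors, here is a small NumPy-only sketch of the masking step. The helper name `mask_all_but_column` and the sample values are hypothetical. As far as I understand it, pcolormesh picks its default color limits from the unmasked data, so each call only sees the range of the one visible column.

```python
import numpy as np

# Illustrative data: column 0 spans 45..88, column 1 spans 70..86.
values = np.array([[45, 70],
                   [60, 86],
                   [88, 75]])


def mask_all_but_column(values, i):
    """Return a masked array in which only column i is visible."""
    mask = np.ones(values.shape, dtype=bool)
    mask[:, i] = False
    return np.ma.masked_where(mask, values)


for i in range(values.shape[1]):
    visible = mask_all_but_column(values, i)
    # min() and max() skip masked cells, so they report the range of column i only
    print(i, visible.min(), visible.max())
```

This prints 45 and 88 for column 0 and 70 and 86 for column 1, which are the per-column ranges from the question.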
Pystache answered Nov 09 '22 13:11