I have 2 data tables with the dimensions 4x25. Each table is from a different point in time, but has exactly the same metadata, that is, the same column and row headers. Given the large number of columns, I thought it best to represent this using a heatmap, using the seaborn library for Python. However, I need to include both tables in the same plot. I am able to create a single heatmap representing a single data table like so:
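import pandas as pd
import seaborn as sns
df = pd.DataFrame(raw_data)  # raw_data: one of the 4x25 tables
ax = sns.heatmap(df)
ax.set(yticklabels=labels)  # labels: the row headers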
However, I'm not sure how to combine two data tables into the same heatmap. The only way I can think of is to create a new DataFrame of dimension 4x50, fit both tables into it, and plot that using the heatmap. But then, I need help with the following issues:
Any help with the above issues would be very helpful.
Note: I'm not bent on representing the data as I've suggested above or even using a heatmap. If there are other suggestions for plotting, please let me know.
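For reference, the single wide DataFrame idea described above could look roughly like the following. This is only a sketch under assumptions: the random data, the row and column labels, and the names `t1`, `t2`, `combined`, "time" and "column" are all placeholders, not part of the original data.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Placeholder data: two 4x25 tables with identical (hypothetical) headers
rng = np.random.default_rng(0)
rows = ["r1", "r2", "r3", "r4"]
cols = [f"c{i}" for i in range(25)]
t1 = pd.DataFrame(rng.random((4, 25)), index=rows, columns=cols)
t2 = pd.DataFrame(rng.random((4, 25)), index=rows, columns=cols)

# Place the tables next to each other; the keys record which table a column came from
combined = pd.concat([t1, t2], axis=1, keys=["t1", "t2"], names=["time", "column"])

fig, ax = plt.subplots(figsize=(14, 3))
sns.heatmap(combined, ax=ax)

# Draw a vertical line at the boundary between the two tables
ax.axvline(t1.shape[1], color="white", linewidth=3)
```

Because both halves share one colorbar here, the colors are directly comparable across the two points in time. Call `plt.show()` to display the figure.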
One possible way of showing two seaborn heatmaps side by side in a figure is to plot them to individual subplots. One may set the space between the subplots to a very small value (wspace=0.01) and position the respective colorbars and tick labels outside of that gap.
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
# Two example tables of the same shape
df = pd.DataFrame(np.random.rand(25,4), columns=list("ABCD"))
df2 = pd.DataFrame(np.random.rand(25,4), columns=list("WXYZ"))
# Two subplots with almost no gap between them
fig, (ax,ax2) = plt.subplots(ncols=2)
fig.subplots_adjust(wspace=0.01)
# Left heatmap, with its colorbar placed on the left
sns.heatmap(df, cmap="rocket", ax=ax, cbar=False)
fig.colorbar(ax.collections[0], ax=ax,location="left", use_gridspec=False, pad=0.2)
# Right heatmap, with its colorbar placed on the right
sns.heatmap(df2, cmap="icefire", ax=ax2, cbar=False)
fig.colorbar(ax2.collections[0], ax=ax2,location="right", use_gridspec=False, pad=0.2)
# Move the right heatmap's tick labels to the outer edge
ax2.yaxis.tick_right()
ax2.tick_params(rotation=0)
plt.show()
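If both tables hold the same quantity at two points in time, a common color scale may make the panels easier to compare than two separate colormaps. The following is a minimal sketch of that variation, not part of the answer above; the random data and the names `before` and `after` are placeholders.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Placeholder data: two 25x4 tables with identical column labels
rng = np.random.default_rng(0)
cols = list("ABCD")
before = pd.DataFrame(rng.random((25, 4)), columns=cols)
after = pd.DataFrame(rng.random((25, 4)), columns=cols)

# Common limits so that the same color means the same value in both panels
vmin = min(before.min().min(), after.min().min())
vmax = max(before.max().max(), after.max().max())

fig, (ax_l, ax_r) = plt.subplots(ncols=2)
fig.subplots_adjust(wspace=0.05)
sns.heatmap(before, ax=ax_l, vmin=vmin, vmax=vmax, cbar=False)
# The row labels are identical, so they are only shown on the left panel
sns.heatmap(after, ax=ax_r, vmin=vmin, vmax=vmax, cbar=False, yticklabels=False)
ax_l.set_title("before")
ax_r.set_title("after")

# One colorbar for both panels, taken from the first heatmap's mesh
fig.colorbar(ax_l.collections[0], ax=[ax_l, ax_r], location="right")
```

Call `plt.show()` afterwards to display the figure.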
The best part about the matplotlib/seaborn libraries is that everything is plotted in the same figure until you clear it. You can use the mask argument of sns.heatmap to show only one triangle of a heatmap. To get a "mixed" heatmap, in which two different sets of data are plotted with different colormaps, you can do something like this:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
from sklearn.datasets import load_iris
data = load_iris()
df = pd.DataFrame(data.data, columns=data.feature_names)
df['target'] = data.target
# One frame per class, without the target column
df_0 = df[df['target'] == 0].drop('target', axis=1)
df_1 = df[df['target'] == 1].drop('target', axis=1)
# Nonzero entries act as the mask: hide the upper triangle of one matrix, the lower triangle of the other
matrix_0 = np.triu(df_0.corr())
matrix_1 = np.tril(df_1.corr())
# Both calls draw on the current axes, so the two halves end up in one plot
sns.heatmap(df_0.corr(), annot=True, mask=matrix_0, cmap="BuPu")
sns.heatmap(df_1.corr(), annot=True, mask=matrix_1, cmap="YlGnBu")
plt.show()
Hope this is what your second idea was. Note that this only works when both tables have the same column names.
A slight twist on Quark's answer. There, the correlation values themselves serve as the mask, so a correlation of exactly 0 is not masked and that cell can end up drawn with the later colormap. To avoid this, we can compute boolean matrices to mask the upper/lower triangle. I also added limits for the colorbars to fix the scale.
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
from sklearn.datasets import load_iris
data = load_iris()
df = pd.DataFrame(data.data, columns=data.feature_names)
df['target'] = data.target
# One frame per class, without the target column
df_0 = df[df['target'] == 0].drop('target', axis=1)
df_1 = df[df['target'] == 1].drop('target', axis=1)
# Boolean masks built from positions, not from the correlation values
mask_0 = np.zeros_like(df_0.corr(), dtype=np.bool_)
mask_0[np.tril_indices_from(mask_0)] = True  # hide the lower triangle and the diagonal
mask_1 = mask_0.T  # hide the upper triangle and the diagonal
sns.heatmap(df_0.corr(), annot=True, mask=mask_0, cmap="Blues", vmin=0, vmax=1)
sns.heatmap(df_1.corr(), annot=True, mask=mask_1, cmap="Greens", vmin=0, vmax=1)
plt.show()
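To see why a mask built from positions is safer than one built from the values, here is a small NumPy-only illustration. The 3x3 matrix is made up for the example and is not taken from the iris data.

```python
import numpy as np

# Made-up symmetric matrix with one correlation that is exactly 0
corr = np.array([
    [1.0, 0.0, 0.5],
    [0.0, 1.0, 0.3],
    [0.5, 0.3, 1.0],
])

# Mask taken from the values: zero entries become False (not masked)
value_mask = np.triu(corr).astype(bool)

# Mask taken from positions: every upper-triangle cell is True
position_mask = np.triu(np.ones_like(corr, dtype=bool))

print(value_mask[0, 1])     # False: the zero cell slips through the mask
print(position_mask[0, 1])  # True: the same cell is masked
```

With the value-based mask, the cell at row 0, column 1 would be drawn even though it lies in the triangle that was meant to be hidden.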