I want to build a heatmap of this data:
curation1 curation2 overlap
1 2 0
1 3 1098
1 4 11
1 5 137
1 6 105
1 7 338
2 3 351
2 4 0
2 5 1
2 6 0
2 7 0
3 4 132
3 5 215
3 6 91
3 7 191
4 5 6
4 6 10
4 7 19
5 6 37
5 7 95
6 7 146
I made a heatmap with this code:
import sys
import pandas as pd
import matplotlib
matplotlib.use('Agg')
import matplotlib.ticker as ticker
import matplotlib.cm as cm
import matplotlib as mpl
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages
from matplotlib import colors
data_raw = pd.read_csv(sys.argv[1],sep = '\t')
data_raw["curation1"] = pd.Categorical(data_raw["curation1"], data_raw.curation1.unique())
data_raw["curation2"] = pd.Categorical(data_raw["curation2"], data_raw.curation2.unique())
# Keyword arguments are required by pandas 2.0 and later
data_matrix = data_raw.pivot(index="curation1", columns="curation2", values="overlap")
fig, ax = plt.subplots(1,1, figsize=(12,12))
heatplot = ax.imshow(data_matrix,cmap = 'BuPu')
#ax.set_xticklabels(data_matrix.columns)
#ax.set_yticklabels(data_matrix.index)
tick_spacing = 1
#ax.xaxis.set_major_locator(ticker.MultipleLocator(tick_spacing))
#ax.yaxis.set_major_locator(ticker.MultipleLocator(tick_spacing))
ax.set_title("Overlap")
fig.savefig('output.pdf')
The output looks like this:
I have three questions:
1. Color scheme: the colors are a bit 'off', in the sense that most of the data is very lightly colored, and there is a random purple box to indicate '0'. Ideally, I would like the heatmap to use different shades of green, from the darkest green for the highest number to the lightest (but still clearly visible) green for the lowest number. I tried changing the 'cmap' argument, e.g. to 'winter' as described in the Python tutorial here, but I'm doing something wrong. Could someone please tell me where specifically I should change this?
2. Color bar: I would like to add a color bar, but I guess I need to sort out question 1 first.
3. Asymmetry: as you can see, the plot is asymmetrical. Is it possible to plot only half of the heatmap, e.g. get rid of the unnecessary lines and possibly move the axis labels to the right-hand side of the plot? If not, this isn't a big deal because I can re-jig it in PowerPoint.
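This will solve your first two problems. The 'Greens' colormap runs from light green for the lowest value to dark green for the highest, and fig.colorbar adds the color bar:
fig, ax = plt.subplots(1, 1, figsize=(12, 12))
heatplot = ax.imshow(data_matrix, cmap='Greens')
# Label the color bar only at the smallest and largest overlap values
cbar = fig.colorbar(heatplot, ticks=[data_raw.overlap.min(), data_raw.overlap.max()])
ax.set_title("Overlap")
fig.savefig('output.pdf')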
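For the third question (showing only half of the heatmap), here is a minimal sketch rather than a definitive solution. It assumes the table from the question is pasted inline instead of being read from sys.argv[1], and the names GreensVisible and overlap_half.pdf, as well as the 0.25 cut-off, are illustrative choices that do not come from the original post. Because each pair appears only once in the data, the pivoted matrix already holds NaN in the lower triangle, so those cells can simply be drawn in white. Truncating the 'Greens' colormap is one possible way to keep the lowest value visibly green.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, as in the question
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib import colors

# Inline copy of the table from the question (assumption: whitespace-separated).
table = """curation1 curation2 overlap
1 2 0
1 3 1098
1 4 11
1 5 137
1 6 105
1 7 338
2 3 351
2 4 0
2 5 1
2 6 0
2 7 0
3 4 132
3 5 215
3 6 91
3 7 191
4 5 6
4 6 10
4 7 19
5 6 37
5 7 95
6 7 146
"""
data_raw = pd.read_csv(io.StringIO(table), sep=r"\s+")
data_matrix = data_raw.pivot(index="curation1", columns="curation2", values="overlap")

# Skip the palest quarter of 'Greens' so the lowest value is still clearly green.
base = plt.get_cmap("Greens")
cmap = colors.LinearSegmentedColormap.from_list(
    "GreensVisible", base(np.linspace(0.25, 1.0, 256))
)
cmap.set_bad("white")  # NaN cells (the missing triangle) are drawn in white

vmin = float(data_raw["overlap"].min())
vmax = float(data_raw["overlap"].max())

fig, ax = plt.subplots(1, 1, figsize=(12, 12))
heatplot = ax.imshow(data_matrix.to_numpy(dtype=float), cmap=cmap, vmin=vmin, vmax=vmax)

# Label every row and column with its curation number.
ax.set_xticks(np.arange(len(data_matrix.columns)))
ax.set_xticklabels(data_matrix.columns)
ax.set_yticks(np.arange(len(data_matrix.index)))
ax.set_yticklabels(data_matrix.index)

# Move the row labels to the right-hand side and hide the frame lines.
ax.yaxis.tick_right()
for spine in ax.spines.values():
    spine.set_visible(False)

# Extra padding keeps the color bar clear of the right-hand labels.
cbar = fig.colorbar(heatplot, ax=ax, pad=0.08, ticks=[vmin, vmax])
ax.set_title("Overlap")
fig.savefig("overlap_half.pdf")
```

Since the single value 1098 dominates the linear color scale, most cells will still look fairly similar to each other; the truncated colormap only guarantees that none of them fades to white.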