
Plotting heatmap for 3 columns in python with seaborn

v1      v2      yy
15.25   44.34   100.00
83.05   59.78   100.00
96.61   65.09   100.00
100.00  75.47   100.00
100.00  50.00   100.00
100.00  68.87   100.00
100.00  79.35   100.00
100.00  100.00  100.00
100.00  63.21   100.00
100.00  100.00  100.00
100.00  68.87   100.00
0.00    56.52   92.86
10.17   52.83   92.86
23.73   46.23   92.86

In the dataframe above, I want to plot a heatmap using v1 and v2 as the x and y axes and yy as the value. How can I do that in Python? I tried seaborn:

df = df.pivot('v1', 'v2', 'yy')
ax = sns.heatmap(df)

However, this does not work. Any other solution?


A seaborn heatmap plots categorical data. This means that each occurring value takes up the same space in the heatmap as any other value, regardless of how far apart the values are numerically. This is usually undesirable for numerical data. Instead, one of the following techniques may be chosen.
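To illustrate what "categorical" means here, the following is a minimal sketch using a handful of rows from the question (the subset and the variable names are only illustrative). It pivots the data and compares the numeric gaps between neighbouring row labels. A heatmap would draw one equally sized cell per label, so these uneven gaps would not be visible in the plot.

```python
import io

import numpy as np
import pandas as pd

# A few rows from the question's data, without duplicates (illustrative subset).
data = """v1 v2 yy
0.00 56.52 92.86
10.17 52.83 92.86
15.25 44.34 100.00
83.05 59.78 100.00
100.00 75.47 100.00"""

df = pd.read_csv(io.StringIO(data), sep=r"\s+")
pivot = df.pivot(index="v1", columns="v2", values="yy")

# The row labels are sorted numbers, but a heatmap treats them as categories:
# every label gets a cell of the same height.
row_labels = pivot.index.to_numpy()
gaps = np.diff(row_labels)

print(row_labels)  # the v1 values that become heatmap rows
print(gaps)        # numeric distances between neighbouring rows
```

The printed gaps differ by more than an order of magnitude, which is the information a scatter, hexbin or triangulation-based plot would preserve.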

`Scatter`

A colored scatter plot may be just as good as a heatmap. The colors of the points would represent the yy value.

ax.scatter(df.v1, df.v2, c=df.yy,  cmap="copper")


u = u"""v1      v2      yy
15.25   44.34   100.00
83.05   59.78   100.00
96.61   65.09   100.00
100.00  75.47   100.00
100.00  50.00   100.00
100.00  68.87   100.00
100.00  79.35   100.00
100.00  100.00  100.00
100.00  63.21   100.00
100.00  100.00  100.00
100.00  68.87   100.00
0.00    56.52   92.86
10.17   52.83   92.86
23.73   46.23   92.86"""

import pandas as pd
import matplotlib.pyplot as plt
import io

# sep=r"\s+" splits on any whitespace (delim_whitespace=True is deprecated in recent pandas)
df = pd.read_csv(io.StringIO(u), sep=r"\s+")

fig, ax = plt.subplots()

sc = ax.scatter(df.v1, df.v2, c=df.yy,  cmap="copper")

fig.colorbar(sc, ax=ax)

ax.set_aspect("equal")


plt.show()

`Hexbin`

You may want to look into hexbin. The data is shown in hexagonal bins, and the values inside each bin are aggregated as their mean. The advantage here is that a large gridsize makes the plot look like a scatter plot, while a small gridsize makes it look like a heatmap, so the plot can easily be adjusted to the desired resolution.

h1 = ax.hexbin(df.v1, df.v2, C=df.yy, gridsize=100, cmap="copper")
h2 = ax2.hexbin(df.v1, df.v2, C=df.yy, gridsize=10, cmap="copper")


u = u"""v1      v2      yy
15.25   44.34   100.00
83.05   59.78   100.00
96.61   65.09   100.00
100.00  75.47   100.00
100.00  50.00   100.00
100.00  68.87   100.00
100.00  79.35   100.00
100.00  100.00  100.00
100.00  63.21   100.00
100.00  100.00  100.00
100.00  68.87   100.00
0.00    56.52   92.86
10.17   52.83   92.86
23.73   46.23   92.86"""

import pandas as pd
import matplotlib.pyplot as plt
import io

# sep=r"\s+" splits on any whitespace (delim_whitespace=True is deprecated in recent pandas)
df = pd.read_csv(io.StringIO(u), sep=r"\s+")

fig, (ax, ax2) = plt.subplots(nrows=2)

h1 = ax.hexbin(df.v1, df.v2, C=df.yy, gridsize=100, cmap="copper")
h2 = ax2.hexbin(df.v1, df.v2, C=df.yy, gridsize=10, cmap="copper")

fig.colorbar(h1, ax=ax)
fig.colorbar(h2, ax=ax2)
ax.set_aspect("equal")
ax2.set_aspect("equal")
ax.set_title("gridsize=100")
ax2.set_title("gridsize=10")
fig.subplots_adjust(hspace=0.3)
plt.show()
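As a rough check of the "aggregated as the mean" statement, here is a small sketch on made-up toy data (the three points and the names below are hypothetical, not from the question). Two points fall into the same hexagon and one lies far away. The second call swaps in `np.max` through the `reduce_C_function` parameter of `hexbin`, in case a different aggregation is preferred.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical toy data: two points very close together, one far away.
x = np.array([0.0, 0.1, 10.0])
y = np.array([0.0, 0.1, 10.0])
c = np.array([90.0, 100.0, 50.0])

fig, (ax, ax2) = plt.subplots(ncols=2)

# Default aggregation: the mean of C inside each hexagon.
h_mean = ax.hexbin(x, y, C=c, gridsize=5, cmap="copper")
# Same bins, but taking the maximum of C inside each hexagon.
h_max = ax2.hexbin(x, y, C=c, gridsize=5, cmap="copper", reduce_C_function=np.max)

# get_array() holds one aggregated value per non-empty hexagon.
mean_values = np.sort(np.asarray(h_mean.get_array(), dtype=float))
max_values = np.sort(np.asarray(h_max.get_array(), dtype=float))

print(mean_values)
print(max_values)
```

With the question's data the choice hardly matters, because the repeated points carry identical yy values, but it can matter once several different values share a bin.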

`Tripcolor`

A tripcolor plot can be used to obtain colored regions in the plot according to the datapoints, which are interpreted as the vertices of triangles; each triangle is colored according to the data at its vertices. Such a plot needs more data than is available here to give a meaningful representation.

ax.tripcolor(df.v1, df.v2, df.yy,  cmap="copper")


u = u"""v1      v2      yy
15.25   44.34   100.00
83.05   59.78   100.00
96.61   65.09   100.00
100.00  75.47   100.00
100.00  50.00   100.00
100.00  68.87   100.00
100.00  79.35   100.00
100.00  100.00  100.00
100.00  63.21   100.00
100.00  100.00  100.00
100.00  68.87   100.00
0.00    56.52   92.86
10.17   52.83   92.86
23.73   46.23   92.86"""

import pandas as pd
import matplotlib.pyplot as plt
import io

# sep=r"\s+" splits on any whitespace (delim_whitespace=True is deprecated in recent pandas)
df = pd.read_csv(io.StringIO(u), sep=r"\s+")

fig, ax = plt.subplots()

tc = ax.tripcolor(df.v1, df.v2, df.yy,  cmap="copper")

fig.colorbar(tc, ax=ax)

ax.set_aspect("equal")
ax.set_title("tripcolor")

plt.show()

Note that a tricontourf plot may be equally suitable if more datapoints throughout the grid are available.

ax.tricontourf(df.v1, df.v2, df.yy,  cmap="copper")
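To make the triangulation behind the tripcolor and tricontourf calls above visible, here is a sketch that builds it explicitly with `matplotlib.tri.Triangulation` and draws the triangle outlines on top. Averaging repeated (v1, v2) pairs first is my own assumption for keeping each vertex unique; it is not part of the original answer.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import matplotlib.tri as mtri
import pandas as pd

data = """v1 v2 yy
15.25 44.34 100.00
83.05 59.78 100.00
96.61 65.09 100.00
100.00 75.47 100.00
100.00 50.00 100.00
100.00 68.87 100.00
100.00 79.35 100.00
100.00 100.00 100.00
100.00 63.21 100.00
100.00 100.00 100.00
100.00 68.87 100.00
0.00 56.52 92.86
10.17 52.83 92.86
23.73 46.23 92.86"""

df = pd.read_csv(io.StringIO(data), sep=r"\s+")

# Assumption: average yy over repeated (v1, v2) pairs so every vertex is unique.
pts = df.groupby(["v1", "v2"], as_index=False)["yy"].mean()

# Delaunay triangulation of the scattered points.
triang = mtri.Triangulation(pts.v1.to_numpy(), pts.v2.to_numpy())

fig, (ax, ax2) = plt.subplots(ncols=2)
tc = ax.tripcolor(triang, pts.yy.to_numpy(), cmap="copper")
ax.triplot(triang, color="k", lw=0.5)  # outline the triangles
tcf = ax2.tricontourf(triang, pts.yy.to_numpy(), cmap="copper")
fig.colorbar(tc, ax=ax)
fig.colorbar(tcf, ax=ax2)
ax.set_title("tripcolor")
ax2.set_title("tricontourf")

print(len(pts), "unique points,", len(triang.triangles), "triangles")
```

With so few vertices, a handful of large triangles covers most of the plot, which is why more datapoints are needed for a meaningful picture.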

The problem is that your data has duplicate values, such as:

100.00  100.00  100.00
100.00  100.00  100.00

You have to drop the duplicate values, then pivot and plot, like this:

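import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Fill the dataframe from the clipboard (copy the table from the question first)
df = pd.read_clipboard()

# Keep only the first row for each (v1, v2) pair so that pivot() gets unique cells
df.drop_duplicates(['v1', 'v2'], inplace=True)
pivot = df.pivot(index='v1', columns='v2', values='yy')
ax = sns.heatmap(pivot, annot=True)
plt.show()

print(pivot)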


Pivot:

v2      44.34   46.23   50.00   52.83   56.52   59.78   63.21   65.09   \
v1                                                                       
0.00       NaN     NaN     NaN     NaN   92.86     NaN     NaN     NaN   
10.17      NaN     NaN     NaN   92.86     NaN     NaN     NaN     NaN   
15.25    100.0     NaN     NaN     NaN     NaN     NaN     NaN     NaN   
23.73      NaN   92.86     NaN     NaN     NaN     NaN     NaN     NaN   
83.05      NaN     NaN     NaN     NaN     NaN   100.0     NaN     NaN   
96.61      NaN     NaN     NaN     NaN     NaN     NaN     NaN   100.0   
100.00     NaN     NaN   100.0     NaN     NaN     NaN   100.0     NaN   

v2      68.87   75.47   79.35   100.00  
v1                                      
0.00       NaN     NaN     NaN     NaN  
10.17      NaN     NaN     NaN     NaN  
15.25      NaN     NaN     NaN     NaN  
23.73      NaN     NaN     NaN     NaN  
83.05      NaN     NaN     NaN     NaN  
96.61      NaN     NaN     NaN     NaN  
100.00   100.0   100.0   100.0   100.0
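As a possible alternative to dropping duplicates, `DataFrame.pivot_table` aggregates repeated (v1, v2) pairs instead of raising an error. The sketch below first shows that a plain `pivot` fails on the unmodified data and then builds the table with the mean as aggregation; choosing the mean is an assumption on my part, and any other `aggfunc` could be used.

```python
import io

import pandas as pd

data = """v1 v2 yy
15.25 44.34 100.00
83.05 59.78 100.00
96.61 65.09 100.00
100.00 75.47 100.00
100.00 50.00 100.00
100.00 68.87 100.00
100.00 79.35 100.00
100.00 100.00 100.00
100.00 63.21 100.00
100.00 100.00 100.00
100.00 68.87 100.00
0.00 56.52 92.86
10.17 52.83 92.86
23.73 46.23 92.86"""

df = pd.read_csv(io.StringIO(data), sep=r"\s+")

# pivot() cannot reshape when an (index, column) pair occurs more than once.
try:
    df.pivot(index="v1", columns="v2", values="yy")
    pivot_failed = False
except ValueError as err:
    pivot_failed = True
    print("pivot failed:", err)

# pivot_table() aggregates the repeated pairs instead (mean is an illustrative choice).
table = df.pivot_table(index="v1", columns="v2", values="yy", aggfunc="mean")
print(table.shape)
```

The resulting `table` can be passed to `sns.heatmap(table, annot=True)` in the same way as the pivot above.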

Tags: python, pandas, matplotlib, seaborn, heatmap

user308827

2 Answers: ImportanceOfBeingErnest, Serenity
