
X Y Z array data to heatmap

I couldn't quite find a consensus answer for this question or one that fits my needs -- I have data in three columns of a text file: X, Y, and Z. Columns are tab-delimited. I would like to make a heatmap representation of these data with Python where X and Y positions are shaded by the value in Z, which ranges from 0 to 1 (a discrete probability of X and Y). I tried seaborn's heatmap function and matplotlib's pcolormesh, but unfortunately these need 2D data arrays.

My data run through X from 1 to 37 at constant Y, then Y increases by 0.1. The maximum Y varies between data sets, but the minimum Y is always 0.

[X Y Z] row1[1...37 0.0000 Zvalue], row2[1...37 0.1000 Zvalue] etc.

import numpy as np
import pandas as pd
import seaborn as sns
sns.set()

# loadtxt accepts a path directly and returns a float array by default
df = np.loadtxt("file.txt", delimiter="\t")

Any tips for next steps?

Bobby Hollingsworth asked Aug 02 '17 20:08



1 Answer

If I understand you correctly, you have three columns, with X and Y denoting the position of a value Z.

Consider the following example. There are three columns: X and Y contain positional information (categories in this case) and Z contains the values for shading the heatmap.

import numpy as np
import pandas as pd
import seaborn as sns

x = np.array(['a','b','c','a','b','c','a','b','c'])
y = np.array(['a','a','a','b','b','b','c','c','c'])
z = np.array([0.3,-0.3,1,0.5,-0.25,-1,0.25,-0.23,0.25])

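Next, stack these arrays, transpose the result so that x, y, and z become columns, and build a dataframe from it. Name the columns and convert Z_value back to a numeric type, because stacking strings and numbers in one NumPy array turns every entry into a string.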

# The stacked array holds strings only, so Z_value is converted back to numbers below
df = pd.DataFrame(np.array([x,y,z]).T)
df.columns = ['X_value','Y_value','Z_value']
df['Z_value'] = pd.to_numeric(df['Z_value'])

This results in the following dataframe.

  X_value Y_value  Z_value
0       a       a     0.30
1       b       a    -0.30
2       c       a     1.00
3       a       b     0.50
4       b       b    -0.25
5       c       b    -1.00
6       a       c     0.25
7       b       c    -0.23
8       c       c     0.25
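
For data stored in a tab-delimited file, as in the question, a frame with the same three columns can be read directly with pandas. This is a minimal sketch, assuming the file has no header row; the sample rows and the name file_df are made up for illustration.

```python
import io
import pandas as pd

# Hypothetical stand-in for file.txt: tab-delimited X, Y, Z with no header row
sample = io.StringIO(
    "1\t0.0000\t0.10\n"
    "2\t0.0000\t0.20\n"
    "3\t0.0000\t0.30\n"
    "1\t0.1000\t0.40\n"
    "2\t0.1000\t0.50\n"
    "3\t0.1000\t0.60\n"
)

# For a real file, pass its path (e.g. "file.txt") instead of `sample`
file_df = pd.read_csv(sample, sep="\t", header=None,
                      names=["X_value", "Y_value", "Z_value"])

# Numeric columns are parsed as numbers, so no to_numeric step is needed here
print(file_df.dtypes)
```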

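A heatmap cannot be drawn directly from this long-format dataframe. Calling df.pivot(index='Y_value', columns='X_value', values='Z_value') reshapes it into a 2D grid with one row per Y value and one column per X value, which is the form a heatmap needs. The arguments are passed by keyword because recent pandas versions no longer accept them positionally.

pivotted = df.pivot(index='Y_value', columns='X_value', values='Z_value')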

The resulting dataframe looks like this.

X_value     a     b     c
Y_value
a        0.30 -0.30  1.00
b        0.50 -0.25 -1.00
c        0.25 -0.23  0.25
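
The same pivot applies to numeric coordinates like those in the question. Here is a minimal sketch, assuming a grid where X runs from 1 to 37 and Y advances in steps of 0.1. The Z values are random placeholders, not real data, and the names long_df and grid are illustrative. Rounding Y first is a precaution so that floating-point noise does not split one Y level into several index rows.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's data: X = 1..37, Y = 0.0, 0.1, ..., 0.4
xs = np.arange(1, 38)
ys = np.arange(5) * 0.1
rng = np.random.default_rng(0)

# Long format: X cycles fastest for each constant Y, as described in the question
long_df = pd.DataFrame({
    "X_value": np.tile(xs, len(ys)),
    "Y_value": np.repeat(ys, len(xs)),
    "Z_value": rng.random(len(xs) * len(ys)),  # placeholder values in [0, 1)
})

# Round Y so floating-point noise cannot create near-duplicate index labels
long_df["Y_value"] = long_df["Y_value"].round(1)

grid = long_df.pivot(index="Y_value", columns="X_value", values="Z_value")
print(grid.shape)  # (5, 37): one row per Y level, one column per X position
```

If an (X, Y) pair can occur more than once, pivot raises an error; pivot_table with an aggregation such as aggfunc="mean" is one way to handle that case.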

You can then pass pivotted to sns.heatmap to create your heatmap.

sns.heatmap(pivotted,cmap='RdBu')

The result is a 3x3 heatmap with X_value on the horizontal axis and Y_value on the vertical axis.
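
For values that range from 0 to 1, as in the question, it may help to pin the color scale and to put Y = 0 at the bottom of the plot. The sketch below uses a placeholder grid of random values; the viridis colormap and the figure size are assumptions chosen for illustration, not something the question asks for.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Placeholder pivoted grid: 5 Y levels by 37 X positions with values in [0, 1)
rng = np.random.default_rng(0)
grid = pd.DataFrame(
    rng.random((5, 37)),
    index=pd.Index(np.round(np.arange(5) * 0.1, 1), name="Y_value"),
    columns=pd.Index(np.arange(1, 38), name="X_value"),
)

fig, ax = plt.subplots(figsize=(10, 3))
# vmin/vmax pin the color scale to the full probability range
sns.heatmap(grid, vmin=0, vmax=1, cmap="viridis", ax=ax)
# seaborn draws the first row at the top; flip so Y = 0 sits at the bottom
ax.invert_yaxis()
# plt.show() displays the figure in an interactive session
```

A sequential colormap is used here because the values have no natural midpoint; a diverging map such as RdBu suits data centered on zero, like the example above.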

You may need to adjust the code for your precise needs, since I had no example data to work from and had to make up my own.
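
Because the question describes rows that are already ordered (X runs 1 to 37 for each Y), the Z column can also be reshaped into a 2D array with NumPy alone, which is the form pcolormesh expects. This is a sketch under that ordering assumption, with every X present for every Y; the data are random placeholders and the names z_grid, x_vals, and y_vals are illustrative.

```python
import numpy as np

# Hypothetical stand-in for np.loadtxt("file.txt", delimiter="\t"):
# X cycles 1..37 for each Y, and Y advances by 0.1 per block
n_x, n_y = 37, 5
x = np.tile(np.arange(1, n_x + 1), n_y)
y = np.repeat(np.arange(n_y) * 0.1, n_x)
z = np.random.default_rng(0).random(n_x * n_y)  # placeholder values
data = np.column_stack([x, y, z])

# Rows are ordered with X varying fastest, so each block of 37 rows is one Y level
z_grid = data[:, 2].reshape(-1, n_x)  # shape (n_y, n_x)
x_vals = data[:n_x, 0]                # the 37 X positions
y_vals = data[::n_x, 1]               # one Y value per block

print(z_grid.shape)  # (5, 37)
```

The arrays can then be passed to matplotlib, for example plt.pcolormesh(x_vals, y_vals, z_grid, shading="nearest", vmin=0, vmax=1). If the file is not guaranteed to be complete and ordered, the pivot approach above is the safer choice.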

error answered Nov 15 '22 17:11
