 

Pandas: how to access a multi-index DataFrame to select values for a heatmap

Thanks for taking the time to click on this!

I'm trying to make a heatmap.

My data looks like this (the first 11 rows of the DataFrame):

g.head(11)
Out = :

           profit
a    b    
1.04 1.09  0.886
1.08 1.03  0.03
     1.05  0.17
     1.09  0.39
     1.16  2.85
1.1  1.12 -0.346
     1.14  0.34
     1.29  1.23
1.11 1.06  0.87
     1.08  1.23
     1.09 -2.86
  • These values are all unique: there is only one profit for each combination of a and b.

  • I want to plot a heatmap over x and y, with the colour given by the z value.

My x-coordinate is a list that goes from 1.01 to 2.0 in steps of 0.01.

My y-coordinate is a list that does the same, e.g. [1.01, 1.02, 1.03, ..., 1.99, 2.0]. So I make a meshgrid of x and y.

For my z-coordinate I want 'profit' from the column above: where x = a and y = b, plot the profit.

But I'm having trouble vectorising this, i.e. using x and y from the meshgrid to look up profit where a = x and b = y. Here is what I want in more detail:

  • Make a meshgrid with x and y.
  • For all combinations of x and y, find a and b and look up profit, using a = x and b = y.
  • If no such combination exists, return 0.
  • If the combination exists and profit is positive, plot a green point.
  • If profit is negative, plot a red point.
  • The bigger the profit, the brighter the green; the bigger the loss, the brighter the red.
  • Zero can be represented by, e.g., blue.

Here is an attempt at some code. I've changed it around so much while trying different things that it probably won't make much sense, but you will get the gist of what I've been trying to do:

def z_func(x,y):
    #print 'x = ',x
    q = plotDF[(plotDF.x == x) & (plotDF.y == y)].profit.values
    return q
    #return x + y
    #return x.astype(float) + y.astype(float)


X,Y = meshgrid(x, y) # grid of point
Z = z_func(X, Y) # evaluation of the function on the grid
im = imshow(Z,cmap=cm.RdBu) # drawing the function
colorbar(im)
show()

Guys, I've been playing around with this all day, reading stuff online, trying lookup(), and also dropping the hierarchical index and just having a DataFrame with x, y and profit columns, but I keep getting silly errors. I don't actually know what I'm doing!

Can anyone shed some light on this? Would be very much appreciated!!!!!

P.S. Happy New Year 2014, guys!

user3087320 asked Mar 21 '23 03:03



1 Answer

Instead of setting up a meshgrid, just unstack. That puts b in the columns and leaves a in the rows, with each value in its correct place and NaN wherever the a/b combination does not exist. You can fill those with 0.
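As a rough sketch of what unstacking does, here are the rows copied from the question's g.head(11) output. The names rows and grid are only illustrative, and selecting the profit column before unstacking is my own choice here, made to keep the resulting column labels flat:

```python
import pandas as pd

# Rows copied from the g.head(11) output shown in the question.
rows = [
    (1.04, 1.09, 0.886),
    (1.08, 1.03, 0.03),
    (1.08, 1.05, 0.17),
    (1.08, 1.09, 0.39),
    (1.08, 1.16, 2.85),
    (1.10, 1.12, -0.346),
    (1.10, 1.14, 0.34),
    (1.10, 1.29, 1.23),
    (1.11, 1.06, 0.87),
    (1.11, 1.08, 1.23),
    (1.11, 1.09, -2.86),
]
df = pd.DataFrame(rows, columns=["a", "b", "profit"]).set_index(["a", "b"])

# Move the inner index level (b) into the columns; a stays as the row index.
grid = df["profit"].unstack()  # NaN wherever an (a, b) pair is absent
grid = grid.fillna(0)          # treat missing combinations as zero profit

print(grid)
```

The result is an ordinary 2-D table with one row per distinct a and one column per distinct b, which is the shape a heatmap function needs.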

Then, instead of imshow, use pcolor.

plt.pcolor(df.unstack().fillna(0))

You might need to use sort_index to get the columns and rows in sequential order, depending on how df is set up.
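Putting the pieces together, a minimal sketch might look like the following. The helper name profit_heatmap, the sample values, the RdYlGn colormap and the symmetric colour limits are all assumptions of mine rather than part of the answer; the colormap is just one way to get red for losses and green for profits, and it draws zero as yellow rather than blue.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


def profit_heatmap(df, ax=None):
    """Hypothetical helper: draw profit with a on the y-axis and b on the x-axis.

    Assumes df has an (a, b) MultiIndex and a 'profit' column.
    """
    grid = (
        df["profit"]
        .unstack()           # rows: a, columns: b
        .sort_index(axis=0)  # a in ascending order
        .sort_index(axis=1)  # b in ascending order
        .fillna(0)           # missing (a, b) pairs become 0
    )
    if ax is None:
        _, ax = plt.subplots()
    values = grid.to_numpy()
    # Symmetric limits keep 0 at the centre of the diverging colormap.
    limit = np.abs(values).max()
    mesh = ax.pcolor(values, cmap="RdYlGn", vmin=-limit, vmax=limit)
    # pcolor draws cells at integer positions, so label cell centres by value.
    ax.set_xticks(np.arange(grid.shape[1]) + 0.5)
    ax.set_xticklabels([f"{b:g}" for b in grid.columns])
    ax.set_yticks(np.arange(grid.shape[0]) + 0.5)
    ax.set_yticklabels([f"{a:g}" for a in grid.index])
    ax.set_xlabel("b")
    ax.set_ylabel("a")
    ax.figure.colorbar(mesh, ax=ax, label="profit")
    return grid, mesh


# A few rows from the question, deliberately out of order.
df = pd.DataFrame(
    {"profit": [0.39, 0.03, -0.346, 0.886]},
    index=pd.MultiIndex.from_tuples(
        [(1.08, 1.09), (1.08, 1.03), (1.10, 1.12), (1.04, 1.09)],
        names=["a", "b"],
    ),
)
grid, mesh = profit_heatmap(df)
```

Call plt.show() afterwards to display the figure. Note that the cells are drawn at equal widths regardless of the gaps between the actual a and b values.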

This answer is related but not identical.

Dan Allan answered Apr 12 '23 05:04
