
matplotlib hexbin normalize

I would like to make multiple hexbin density maps of x, y data with matplotlib, similar to this one: http://matplotlib.org/1.4.0/examples/pylab_examples/hexbin_demo.html

But I would like to divide the counts per hexagon by a given number (the highest peak value across my density maps), so that all my density plots have the same coloring and the colorbar covers the range [0, 1] for all plots.

Could someone show me a working example of that?

Thank you in anticipation,

Janos

user2393987 asked Jan 09 '23 15:01


1 Answer

I see two potential ways to do this.

Method 1

The first is to call hexbin once to get your maximum value, then call hexbin again, using the reduce_C_function option to scale your data. Two calls are needed because you don't know how many points fall in each bin until after the hexbin is created. Working with the data in the example you linked to (but only creating the linear-scale plot), this would look something like:

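import numpy as np
import matplotlib.pyplot as plt

# x, y, xmin, xmax, ymin, ymax come from the linked hexbin demo
plt.subplot(111)
# First pass: bin the data only to find the peak count
hb = plt.hexbin(x, y, cmap=plt.cm.YlOrRd_r)
plt.cla()
# Second pass: each point contributes 1/peak, summed per bin
# (the np.float alias was removed from NumPy; plain float is equivalent)
plt.hexbin(x, y,
           C=np.ones_like(y, dtype=float) / hb.get_array().max(),
           cmap=plt.cm.YlOrRd_r,
           reduce_C_function=np.sum)
plt.axis([xmin, xmax, ymin, ymax])
cb = plt.colorbar()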

In the second hexbin call you must supply the C array in order to use the reduce_C_function option. In this case, C=np.ones_like(y) / hb.get_array().max() is all you need: each point contributes 1/max, so summing the values in a bin gives that bin's count divided by the peak count.

Note that it probably makes sense to clear the axes after the first hexbin call.

One issue with this approach is that you will have empty bins (white space) where there are no points. If you want the background to be the same color as a zero value, you could add plt.gca().set_facecolor(plt.cm.YlOrRd_r(0)) (on Matplotlib versions before 2.0, the method is called set_axis_bgcolor).
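Since the question asks about several plots sharing one scale, here is a minimal sketch of Method 1 applied to two datasets. It is not part of the original demo: the random data, the gridsize and extent values, and the names datasets, grid and scaled_peaks are all assumptions made for illustration. Passing vmin=0 and vmax=1 is also an addition, so that every colorbar spans the same range.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-ins for the real data: two point clouds of different size.
datasets = [rng.standard_normal((2, n)) for n in (2000, 500)]

# Use the same grid for every plot so that bin counts are comparable.
grid = dict(gridsize=30, extent=(-4, 4, -4, 4))

# Pass 1: bin each dataset on a scratch axes just to read the peak count.
scratch_fig, scratch_ax = plt.subplots()
peak = max(scratch_ax.hexbin(x, y, **grid).get_array().max()
           for x, y in datasets)
plt.close(scratch_fig)

# Pass 2: every point contributes 1/peak, so the per-bin sum is count/peak.
fig, axes = plt.subplots(1, len(datasets), figsize=(9, 4))
scaled_peaks = []
for ax, (x, y) in zip(axes, datasets):
    ax.set_facecolor(plt.cm.YlOrRd_r(0))  # empty bins match the zero color
    hb = ax.hexbin(x, y,
                   C=np.full(len(x), 1.0 / peak),
                   reduce_C_function=np.sum,
                   cmap=plt.cm.YlOrRd_r,
                   vmin=0, vmax=1,  # identical color scale on every plot
                   **grid)
    fig.colorbar(hb, ax=ax)
    scaled_peaks.append(hb.get_array().max())
```

The dataset that holds the overall peak reaches 1 on its colorbar and the others stay below it. Call plt.show() or fig.savefig(...) afterwards to view the result.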

Method 2

The other approach would be to use the autoscaling inherent in hexbin and simply relabel the colorbar. For example:

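plt.subplot(111)
hb = plt.hexbin(x, y, cmap=plt.cm.YlOrRd_r)
plt.axis([xmin, xmax, ymin, ymax])
cb = plt.colorbar()
# Place six ticks in units of counts, from the lowest to the highest bin value...
counts = hb.get_array()
cb.set_ticks(np.linspace(counts.min(), counts.max(), 6))
# ...and label them 0.0 to 1.0 (formatted to avoid float noise such as 0.6000000000000001)
cb.set_ticklabels(['%.1f' % v for v in np.linspace(0, 1, 6)])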

Note that the tick positions must be given in units of the count, while the tick labels can show whatever range you want. Personally, I prefer this second method because it's a bit cleaner, but I can imagine cases where the first is more useful.
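For several plots, autoscaling alone would give each plot its own color scale, so the relabeling idea needs a shared limit. The sketch below is one possible way to do that, not something from the answer above: it pins every plot to the same count range with set_clim before relabeling. The random data, the grid settings and the names datasets, plots, ticks and labels are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical stand-ins for the real data.
datasets = [rng.standard_normal((2, n)) for n in (2000, 500)]
grid = dict(gridsize=30, extent=(-4, 4, -4, 4))

fig, axes = plt.subplots(1, len(datasets), figsize=(9, 4))
plots = [ax.hexbin(x, y, cmap=plt.cm.YlOrRd_r, **grid)
         for ax, (x, y) in zip(axes, datasets)]

# Highest count over all plots; this is the value that will be labelled 1.0.
peak = max(hb.get_array().max() for hb in plots)

ticks = np.linspace(0, peak, 6)                      # positions, in counts
labels = ["%.1f" % v for v in np.linspace(0, 1, 6)]  # text shown to the reader
for ax, hb in zip(axes, plots):
    hb.set_clim(0, peak)  # same count-to-color mapping on every plot
    cb = fig.colorbar(hb, ax=ax)
    cb.set_ticks(ticks)
    cb.set_ticklabels(labels)
```

The plotted values are still raw counts here; only the colorbar text changes, so anything read back from hb.get_array() must be divided by peak by hand.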

farenorth answered Jan 18 '23 05:01