
Get bin coordinates with hexbin in matplotlib

I use matplotlib's hexbin method to compute 2D histograms of my data, but I would like to get the coordinates of the hexagon centers in order to process the results further.

I got the values by calling get_array() on the result, but I cannot figure out how to get the bin coordinates.

I tried to compute them from the number of bins and the extent of my data, but I don't know the exact number of bins in each direction. gridsize=(10,2) should do the trick, but it does not seem to work.

Any ideas?

asked Oct 18 '12 at 09:10 by user1151446




2 Answers

I think this works.

import numpy as np
import matplotlib.pyplot as plt

def generate_data(n):
    """Make random, correlated x & y arrays"""
    points = np.random.multivariate_normal(mean=(0,0),
        cov=[[0.4,9],[9,10]],size=int(n))
    return points

if __name__ =='__main__':

    color_map = plt.cm.Spectral_r
    n = 1e4
    points = generate_data(n)

    xbnds = np.array([-20.0,20.0])
    ybnds = np.array([-20.0,20.0])
    extent = [xbnds[0],xbnds[1],ybnds[0],ybnds[1]]

    fig=plt.figure(figsize=(10,9))
    ax = fig.add_subplot(111)
    x, y = points.T
    # A coarse gridsize keeps the hexagons visually large
    image = plt.hexbin(x,y,cmap=color_map,gridsize=20,extent=extent,mincnt=1,bins='log')
    # mincnt=1 hides cells with no points; it does not change the counts
    counts = image.get_array()
    # One (x, y) center per drawn hexagon, in the same order as counts
    verts = image.get_offsets()
    for offc in range(verts.shape[0]):
        binx,biny = verts[offc][0],verts[offc][1]
        if counts[offc]:
            plt.plot(binx,biny,'k.',zorder=100)
    ax.set_xlim(xbnds)
    ax.set_ylim(ybnds)
    plt.grid(True)
    cb = plt.colorbar(image,spacing='uniform',extend='max')
    plt.show()
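
If you only need the numbers and not the plot, the same two calls can be paired up directly. This is a minimal sketch, assuming a matplotlib release in which get_offsets() is populated for hexbin (see the other answer for a case where it was not). The random data, the seed and the names hb, centers and counts are mine and purely illustrative; the Agg backend is selected only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, assumed here so no window is needed
import matplotlib.pyplot as plt
import numpy as np

# Illustrative data: 1000 uncorrelated normal points
rng = np.random.default_rng(0)
x = rng.normal(size=1000)
y = rng.normal(size=1000)

fig, ax = plt.subplots()
# mincnt=1 drops empty hexagons, so only occupied bins are returned
hb = ax.hexbin(x, y, gridsize=20, mincnt=1)

centers = np.asarray(hb.get_offsets())  # (n_bins, 2) array of hexagon centers
counts = np.asarray(hb.get_array())     # (n_bins,) array of point counts

# Pair each center with its count for further processing
for (cx, cy), c in list(zip(centers, counts))[:3]:
    print(f"center=({cx:.3f}, {cy:.3f}) count={int(c)}")
```

Without bins='log' the values in counts are plain point counts, so they should add up to the number of input points.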


answered Oct 17 '22 at 14:10 by user1868739


I would love to confirm that the code by Hooked using get_offsets() works, but I tried several variations of the code above to retrieve the center positions and, as Dave mentioned, get_offsets() remains empty. The workaround I found is to use image.get_paths(), which is not empty. My code averages the vertices of each hexagon to find its center, which makes it a smidge longer, but it does work.

get_paths() returns a list of paths whose embedded x,y vertex coordinates can be looped over and averaged to give the center position of each hexagon.

The code that I have is as follows:

counts = image.get_array()   # counts in each hexagon, works great
verts = image.get_offsets()  # empty for me, so not used below
paths = image.get_paths()    # this does work: a list of Path objects that can be plotted

for path in paths:
    # average the six corner vertices of the hexagon
    xav = np.mean(path.vertices[0:6,0])  # center in x (RA)
    yav = np.mean(path.vertices[0:6,1])  # center in y (DEC)
    plt.plot(xav,yav,'k.',zorder=100)
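
The [0:6] slice matters if the path repeats its first vertex to close the polygon, because averaging all seven rows would pull the result toward that corner. Here is a small numpy-only sketch of the effect. The helper polygon_center and the hexagon centered on (2.0, -1.0) are made up for illustration and are not part of matplotlib.

```python
import numpy as np

def polygon_center(vertices, n_corners=6):
    """Mean of the first n_corners rows of an (N, 2) vertex array."""
    corners = np.asarray(vertices, dtype=float)[:n_corners]
    return corners.mean(axis=0)

# Hypothetical regular hexagon of radius 1 centered on (2.0, -1.0)
angles = np.deg2rad(np.arange(0, 360, 60))
center = np.array([2.0, -1.0])
corners = center + np.column_stack([np.cos(angles), np.sin(angles)])

# Close the polygon by repeating the first vertex, giving 7 rows
closed = np.vstack([corners, corners[:1]])

good = polygon_center(closed)   # uses only the six distinct corners
biased = closed.mean(axis=0)    # counts the first corner twice

print(good)    # close to the true center
print(biased)  # shifted toward the repeated corner
```

Also note that this relies on get_paths() holding one path per hexagon, as it did for me. If get_offsets() is populated in your version, the paths may instead be a single hexagon shape that is shifted by those offsets, so check len(image.get_paths()) before averaging.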

answered Oct 17 '22 at 13:10 by Astronomyde