
Two-dimensional np.digitize

I have two-dimensional data and a set of two-dimensional bins generated with scipy.stats.binned_statistic_2d. For each data point, I want the index of the bin it occupies. This is exactly what np.digitize is for, but as far as I can tell, it only handles one-dimensional data. An existing Stack Exchange answer seems to address this, but it is fully generalized to n dimensions. Is there a more straightforward solution for two dimensions?

Alex asked Jul 26 '15 08:07





2 Answers

You can already get the bin index of each observation from the fourth return value (binnumber) of scipy.stats.binned_statistic_2d:

Returns:  
  statistic : (nx, ny) ndarray
      The values of the selected statistic in each two-dimensional bin.
  xedges : (nx + 1) ndarray
      The bin edges along the first dimension.
  yedges : (ny + 1) ndarray
      The bin edges along the second dimension.
  binnumber : (N,) array of ints or (2,N) ndarray of ints
      This assigns to each element of sample an integer that
      represents the bin in which this observation falls. The
      representation depends on the expand_binnumbers argument.
      See Notes for details.
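As a minimal sketch of how this might look in practice: the sample data, the bin counts, and the variable names below are assumptions for illustration, not part of the original answer. Passing expand_binnumbers=True asks for the (2, N) form, so each point gets a separate x-bin and y-bin number.

```python
import numpy as np
from scipy import stats

# Illustrative data (assumed, not from the original post).
rng = np.random.default_rng(0)
x = rng.random(100)
y = rng.random(100)

# statistic="count" does not use the values argument, so None is accepted.
result = stats.binned_statistic_2d(
    x, y, None, statistic="count", bins=[4, 3], expand_binnumbers=True
)

# binnumber has shape (2, N): row 0 holds x-bin numbers, row 1 y-bin numbers.
# The numbers are 1-based; 0 and nbins + 1 are reserved for out-of-range points.
ix, iy = result.binnumber - 1  # shift to 0-based indices into result.statistic

# Cross-check: counting points per (ix, iy) pair reproduces the count statistic.
counts = np.zeros_like(result.statistic)
np.add.at(counts, (ix, iy), 1)
```

Because the bin range here defaults to the data's own minimum and maximum, no point falls outside the bins, and subtracting 1 gives indices that can be used directly on result.statistic.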
ali_m answered Nov 10 '22 05:11



A simple solution using NumPy:

import numpy as np

bins = [[0.3, 0.5, 0.7], [0.3, 0.7]]
values = np.random.random((10, 2))
digitized = []
for i in range(len(bins)):
    # Digitize column i against its own bin edges.
    digitized.append(np.digitize(values[:, i], bins[i], right=False))
# Stack as columns so that row k holds the (x, y) bin indices of point k.
digitized = np.column_stack(digitized)
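If a single index per two-dimensional bin is wanted rather than an (x, y) pair, one possible approach is to combine the per-axis results with np.ravel_multi_index. This is a sketch under assumptions: the edges and points below are hypothetical, and every point is assumed to lie inside the outermost edges.

```python
import numpy as np

# Hypothetical bin edges for each dimension (not from the original post).
x_edges = np.array([0.0, 0.25, 0.5, 0.75, 1.0])  # 4 bins along x
y_edges = np.array([0.0, 0.5, 1.0])              # 2 bins along y

points = np.array([[0.1, 0.2], [0.3, 0.7], [0.9, 0.9], [0.5, 0.4]])

# np.digitize returns 1-based bin numbers for in-range values,
# so subtract 1 to get 0-based indices.
ix = np.digitize(points[:, 0], x_edges) - 1
iy = np.digitize(points[:, 1], y_edges) - 1

# Combine each (ix, iy) pair into one row-major index over a 4 x 2 grid.
grid_shape = (len(x_edges) - 1, len(y_edges) - 1)
flat = np.ravel_multi_index((ix, iy), grid_shape)

print(ix.tolist(), iy.tolist(), flat.tolist())
# [0, 1, 3, 2] [0, 1, 1, 0] [0, 3, 7, 4]
```

Points outside the outermost edges would produce -1 or an index equal to the number of bins, which np.ravel_multi_index rejects by default, so they would need to be filtered or clipped first.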
Alfredo answered Nov 10 '22 05:11
