I am trying to reimplement an IDL function in Python:
http://star.pst.qub.ac.uk/idl/REBIN.html
which downsizes a 2D array by an integer factor by averaging.
For example:
>>> a = np.arange(24).reshape((4,6))
>>> a
array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23]])
I would like to resize it to (2,3) by taking the mean of the relevant samples. The expected output would be:
>>> b = rebin(a, (2, 3))
>>> b
array([[  3.5,   5.5,   7.5],
       [ 15.5,  17.5,  19.5]])
i.e. b[0,0] = np.mean(a[:2,:2]), b[0,1] = np.mean(a[:2,2:4])
and so on.
I believe I should reshape to a 4 dimensional array and then take the mean on the correct slice, but could not figure out the algorithm. Would you have any hint?
With numpy.resize(), we can change the shape of an array. The array can have any shape; to resize it we only need the target shape, e.g. (2, 2) or (2, 3). If the new array is larger than the original, numpy.resize() fills it with repeated copies of the original data, whereas the in-place method ndarray.resize() pads the missing values with zeros.
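A quick way to check how missing values are filled is to run both resize variants on a small array. This is only a minimal sketch; the array contents and the names `repeated` and `padded` are made up for illustration.

```python
import numpy as np

a = np.array([[0, 1], [2, 3]])

# np.resize returns a NEW array; if the target is larger, the flattened
# input is repeated cyclically to fill it.
repeated = np.resize(a, (2, 3))
# repeated is [[0, 1, 2], [3, 0, 1]]

# ndarray.resize works IN PLACE and pads the extra entries with zeros.
# refcheck=False skips the reference-count check (an assumption made here
# so the sketch also runs inside interactive shells and debuggers).
padded = np.array([[0, 1], [2, 3]])
padded.resize((2, 3), refcheck=False)
# padded is [[0, 1, 2], [3, 0, 0]]
```

Neither variant averages anything, so neither is a substitute for the rebin operation asked about above.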
To calculate the average separately for each column of a 2D array, call np.average(matrix, axis=0). The resulting array has one average value per column of the input matrix.
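As a small illustration of the axis argument, here is a sketch with a made-up 2x3 matrix (the values are arbitrary, chosen only so the averages are easy to verify by hand):

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])

# axis=0 collapses the rows, leaving one average per column.
col_means = np.average(matrix, axis=0)   # [2.5, 3.5, 4.5]

# axis=1 collapses the columns, leaving one average per row.
row_means = np.average(matrix, axis=1)   # [2.0, 5.0]
```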
A NumPy array generally cannot be grown in place. It is simply a fixed block of your RAM, so you can't append to it in the sense of literally adding bytes to the end of the array, but you can create another array and copy over all the data (which is what np.append() does).
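The copying behaviour can be checked directly. The sketch below uses arbitrary example values and relies on np.shares_memory to confirm that the result does not reuse the original buffer:

```python
import numpy as np

a = np.arange(3)          # [0, 1, 2]
b = np.append(a, 3)       # builds a new array [0, 1, 2, 3]

# The original array is untouched and the result lives in separate memory.
original_unchanged = a.tolist() == [0, 1, 2]
separate_buffers = not np.shares_memory(a, b)
```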
Because a NumPy array has a homogeneous type, its data is densely packed in memory. As a result, a task executed in NumPy can be around 5 to 100 times faster than the same task on a standard Python list, depending on the operation and the size of the data.
Here's an example based on the answer you've linked (for clarity):
>>> import numpy as np
>>> a = np.arange(24).reshape((4,6))
>>> a
array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23]])
>>> a.reshape((2,a.shape[0]//2,3,-1)).mean(axis=3).mean(1)
array([[  3.5,   5.5,   7.5],
       [ 15.5,  17.5,  19.5]])
As a function:
def rebin(a, shape):
    sh = shape[0],a.shape[0]//shape[0],shape[1],a.shape[1]//shape[1]
    return a.reshape(sh).mean(-1).mean(1)
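To make the 4-dimensional reshape in rebin easier to follow, here is a sketch of a variant that names each intermediate step and rejects shapes that do not divide evenly. The name `rebin_checked` and the error message are hypothetical and not part of the answer above; the reduction itself is the same idea, written as a single mean over both block axes.

```python
import numpy as np

def rebin_checked(a, shape):
    """Hypothetical variant of rebin that validates the target shape."""
    rows, cols = shape
    if a.shape[0] % rows or a.shape[1] % cols:
        raise ValueError(
            "target shape {} must divide input shape {}".format(shape, a.shape)
        )
    row_factor = a.shape[0] // rows
    col_factor = a.shape[1] // cols
    # Axes become (output row, row within block, output col, col within block).
    blocks = a.reshape(rows, row_factor, cols, col_factor)
    # Averaging over axes 1 and 3 collapses each block to a single value.
    return blocks.mean(axis=(1, 3))

a = np.arange(24).reshape((4, 6))
b = rebin_checked(a, (2, 3))
# b[0, 0] equals np.mean(a[:2, :2]), i.e. 3.5
```

Because every block has the same size, one mean over both block axes gives the same result as chaining .mean(-1).mean(1).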
J.F. Sebastian has a great answer for 2D binning. Here is a version of his "rebin" function that works for N dimensions:
def bin_ndarray(ndarray, new_shape, operation='sum'):
    """
    Bins an ndarray in all axes based on the target shape, by summing or
    averaging.

    Number of output dimensions must match number of input dimensions and
    new axes must divide old ones.

    Example
    -------
    >>> m = np.arange(0,100,1).reshape((10,10))
    >>> n = bin_ndarray(m, new_shape=(5,5), operation='sum')
    >>> print(n)

    [[ 22  30  38  46  54]
     [102 110 118 126 134]
     [182 190 198 206 214]
     [262 270 278 286 294]
     [342 350 358 366 374]]

    """
    operation = operation.lower()
    if operation not in ['sum', 'mean']:
        raise ValueError("Operation not supported.")
    if ndarray.ndim != len(new_shape):
        raise ValueError("Shape mismatch: {} -> {}".format(ndarray.shape,
                                                           new_shape))
    # Pair each target length with its compression factor: (d, c//d).
    compression_pairs = [(d, c//d) for d,c in zip(new_shape,
                                                  ndarray.shape)]
    flattened = [l for p in compression_pairs for l in p]
    ndarray = ndarray.reshape(flattened)
    # Reduce the factor axes one at a time, starting from the last.
    for i in range(len(new_shape)):
        op = getattr(ndarray, operation)
        ndarray = op(-1*(i+1))
    return ndarray
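For comparison with bin_ndarray, the same N-dimensional idea can be sketched with a single reduction over all the block axes at once instead of a loop. The function name `bin_mean_nd` is hypothetical, this version only averages, and it adds a divisibility check that the function above does not perform.

```python
import numpy as np

def bin_mean_nd(arr, new_shape):
    """Hypothetical sketch: average-bin an N-d array down to new_shape."""
    if arr.ndim != len(new_shape):
        raise ValueError("Shape mismatch: {} -> {}".format(arr.shape, new_shape))
    interleaved = []
    for new, old in zip(new_shape, arr.shape):
        if old % new:
            raise ValueError("{} does not divide {}".format(new, old))
        # Each axis splits into (output length, block size).
        interleaved.extend((new, old // new))
    # The block-size axes sit at the odd positions 1, 3, 5, ...
    block_axes = tuple(range(1, 2 * arr.ndim, 2))
    return arr.reshape(interleaved).mean(axis=block_axes)

cube = np.arange(64).reshape((4, 4, 4))
binned = bin_mean_nd(cube, (2, 2, 2))
# binned[0, 0, 0] is the mean of cube[:2, :2, :2], i.e. 10.5
```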