(Re-)creating "numpy.sum" with numba (supporting "axis" along which to reduce)

Tags:

I want to recreate a numpy.sum-like function. I don't plan to recreate numpy.sum but a similar function with the same principle: Iterate over the items and do something with each of them and then return a result.

How can I make one numba function that understands the "reduce along axis x" behaviour of the numpy-functions.

Let's say the basic function looks like this (taken from the numba source code):

def numba_sum(arr):
    s = 0.
    for val in np.nditer(arr):
        s += val.item()
    return s

This works great if I numba.jit that, but then it doesn't support any axis-argument.

numba.vectorize isn't much better, it offers .reduce(axis=x) but only if the function is binary (accepts two arguments) , which the one above is not, and even then it supports only one scalar axis.

numba.guvectorize can help, but I need to explicitly define when creating the function along which axis (if any) I want to reduce the function.

In short how can I make the function numba_sum work like numpy.sum, that is, it should support:

axis=None,
axis=x (x integer) and
axis=(x,y)

in nopython=True mode?

888

asked Jan 28 '17 02:01

MSeifert

1 Answers

I know the following is not the most concise or efficient implementation, but it is a working one! The general idea is that you need one loop to handle the output and other loop to aggregate the values over the indices the correspond to the output.

I have tried to minimise the use of various numpy operations inside numba keeping it closer to something implemented in C. (for the sake of clarity, np.ndindex was kept.) Some setup needs to occur before the function is called, and we just do this in plain python.

This implementation is by no means optimal or efficient, but it gets the job done and may help with inspiration for a more efficient version.

the indices are recomputed on each loop from the positions and the multiplies, this is really inefficient and could be improved
the np.index operations could be replaced

import numpy as np
from numpy.core.numeric import normalize_axis_index
import numba

@numba.njit()
def _numba_sum_into(arr, out, mask_out, mask_agg, shape_out, shape_agg):
    # get the multipliers
    # these times the position gives the index in the flattened array
    # this uses the same logic as np.ravel_multi_index
    multipliers = np.ones(arr.ndim, dtype='int32')
    for i in range(arr.ndim - 2, -1, -1):
        multipliers[i] = arr.shape[i + 1] * multipliers[i + 1]
    # multiplier components
    multipliers_agg = multipliers[mask_agg]
    multipliers_out = multipliers[mask_out]
    # flattened array
    a = arr.flatten()
    # loop over the kept values
    for pos_out in np.ndindex(shape_out):
        total = 0.
        # this uses the same logic as np.ravel_multi_index
        i = 0
        for p, m in zip(pos_out, multipliers_out):
            i += p * m
        # loop over the aggregate values
        for pos_agg in np.ndindex(shape_agg):
            # this uses the same logic as np.ravel_multi_index
            j = 0
            for p, m in zip(pos_agg, multipliers_agg):
                j += p * m
            # update the total
            total += a[i + j]
        # save the total
        out[pos_out] = total
    # done!
    return out

@lru_cache()
def _normalize_axis(axis, ndim: int) -> tuple:
    if axis is None:
        axis = np.arange(ndim)
    axis = np.core.numeric.normalize_axis_tuple(axis, ndim, allow_duplicate=False)
    return axis

@lru_cache()
def _get_out_and_agg_parts(norm_axis, ndim: int, shape: tuple):
    axis = np.array(norm_axis)
    # get used dims
    mask_agg = np.zeros(ndim, dtype='bool')
    mask_agg[axis] = True
    mask_out = ~mask_agg
    # make immutable
    mask_agg.flags['WRITEABLE'] = False
    mask_out.flags['WRITEABLE'] = False
    # get the shape
    shape = np.array(shape)
    shape_agg = tuple(shape[mask_agg].tolist())
    shape_out = tuple(shape[mask_out].tolist())
    # done
    return mask_out, mask_agg, shape_out, shape_agg

def numba_sum(arr, axis=None):
    axis = _normalize_axis(axis, arr.ndim)
    # get the various shapes
    mask_out, mask_agg, shape_out, shape_agg = _get_out_and_agg_parts(axis, arr.ndim, arr.shape)
    # make the output array
    out = np.zeros(shape_out, dtype=arr.dtype)
    # write into the array
    _numba_sum_into(arr, out, mask_out, mask_agg, shape_out, shape_agg)
    # done!
    return out

if __name__ == '__main__':
    arr = np.random.random([2, 3, 4])

    print(numba_sum(arr, axis=None))
    print(numba_sum(arr, axis=(0, 1, 2)))
    print(numba_sum(arr, axis=(0, 2)))
    print(numba_sum(arr, axis=(0, -1)))
    print(numba_sum(arr, axis=0))
    print(numba_sum(arr, axis=-1))

175

answered Nov 15 '22 01:11

Nathan

Related questions
                            
                                Overcome memory constraints while using multiprocessing
                            
                                Can a Tkinter button have children?
                            
                                How to handle BigTable Scan InvalidChunk exceptions?
                            
                                Time-based .rolling() fails with group by
                            
                                Optimising iterative computation of values based on growth rate
                            
                                ThreadPoolExecutor + Requests == deadlock?
                            
                                Python AWS Lambda 301 redirect
                            
                                Python, pymysql class encapsulation of SSCursor not working as expected
                            
                                Template solutions similar to thymeleaf "natural templates" for jinja2/python/django?
                            
                                Cosine similarity of word2vec more than 1
                            
                                How to call Python script from SAS JMP environment using jsl script
                            
                                What is Python's *Args and **kwargs equivalent in PHP? [duplicate]
                            
                                Using Single Celery Server For Multi Django Projects
                            
                                django+uwsgi logging with TimedRotatingFileHandler "overwrites rotated log file"
                            
                                How to determine feature importance of non linear kernals in SVM
                            
                                Python 3 - Gaussian divisors of a Gaussian integer
                            
                                Hacky way to augment multichannel images in Keras
                            
                                Why Pandas .loc speed in Pandas depends on DataFrame initialization? How to make MultiIndex .loc as fast as possible?
                            
                                keras: issue using ImageDataGenerator and KFold for fit_generator
                            
                                Pruning dir-filename in coverage.py html report

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

(Re-)creating "numpy.sum" with numba (supporting "axis" along which to reduce)

Tags:

python

vectorization

numpy

numba

MSeifert

People also ask

1 Answers

Nathan

Recent Activity

Donate For Us