 

Most efficient way of adding elements given the index list in numpy

Tags: python, add, numpy
Assume we have a numpy array A with shape (N,), a matrix D with shape (M, 3) holding data, and another matrix I with shape (M, 3) holding the index in A of each corresponding element of D. How can we construct A from D and I such that data at repeated indexes is added together?

Example:

############# A[I] := D ###################################  
A = [0.5, 0.6]                         # Final Reduced Data Vector
D = [[0.1, 0.1, 0.2], [0.2, 0.4, 0.1]]  # Data
I = [[0, 1, 0], [0, 1, 1]]             # Indices

For example:

A[0] = D[0][0] + D[0][2] + D[1][0]     # 0.5 = 0.1 + 0.2 + 0.2

Since in the index matrix we have:

I[0][0] = I[0][2] = I[1][0] = 0

The goal is to avoid looping over all elements, so the solution stays efficient for large N and M (10^6-10^9).
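For reference, a minimal sketch of the naive double loop to avoid; it doubles as a correctness check for the vectorized answers below (make_A_naive is just an illustrative name):

import numpy as np

def make_A_naive(D, I, N):
    # Accumulate every data element into its target slot, one at a time.
    A = np.zeros(N)
    for row in range(len(I)):
        for col in range(len(I[row])):
            A[I[row][col]] += D[row][col]
    return A

print(make_A_naive([[0.1, 0.1, 0.2], [0.2, 0.4, 0.1]],
                   [[0, 1, 0], [0, 1, 1]], 2))  # [0.5 0.6]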

asked Dec 02 '20 by Roy



3 Answers

I doubt you can get much faster than np.bincount; note that the official documentation covers this exact use case (summing weights per index):

# Your example
A = [0.5, 0.6]
D = [[0.1, 0.1, 0.2], [0.2, 0.4, 0.1]]
I = [[0, 1, 0], [0, 1, 1]]

# Solution
import numpy as np    
D, I = np.array(D).flatten(), np.array(I).flatten()
print(np.bincount(I, D)) #[0.5 0.6]
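One caveat: bincount sizes its output as max(I) + 1, so if the largest index in I is smaller than N - 1 the result comes out shorter than A. The minlength argument (a standard bincount parameter) pads it to a fixed length:

print(np.bincount(I, D, minlength=2))  # [0.5 0.6]; length >= 2 even if index 1 never appeared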
answered Oct 17 '22 by Leonardus Chen

The shapes of I and D don't matter: you can ravel both arrays without changing the outcome:

index = np.ravel(I)
data = np.ravel(D)

Now you can sort both arrays according to the raveled indices:

sorter = np.argsort(index)
index = index[sorter]
data = data[sorter]

This is helpful because now index looks like this:

0, 0, 0, 1, 1, 1

And data is this:

0.1, 0.2, 0.2, 0.1, 0.4, 0.1

Adding together runs of consecutive numbers should be easier than processing random locations. Let's start by finding the indices where the runs start:

runs = np.r_[0, np.flatnonzero(np.diff(index)) + 1]

Now you can use the fact that ufuncs like np.add have a partial reduce operation called reduceat. This allows you to sum regions of an array:

a = np.add.reduceat(data, runs)
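For the example data, index is now [0, 0, 0, 1, 1, 1] and runs is [0, 3], so reduceat sums data[0:3] and data[3:6]:

print(runs)  # [0 3]
print(a)     # [0.5 0.6]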

If I is guaranteed to contain all indices in [0, A.size) at least once, you're done: just assign to A instead of a. If not, you can build the mapping using the fact that the start of each run in index is the target index (M is the desired size of A):

A = np.zeros(M)
A[index[runs]] = a

Algorithmic complexity analysis (MN is the total number of elements in D and I; M is the size of A, matching the function below):

  • ravel is O(1) in time and space if the data is in an array. If it's a list, this is O(MN) in time and space
  • argsort is O(MN log MN) in time and O(MN) in space
  • Indexing by sorter is O(MN) in time and space
  • Computing runs is O(MN) in time and O(MN + M) = O(MN) in space
  • reduceat is a single pass: O(MN) in time, O(M) in space
  • Reassigning A is O(M) in time and space

Total: O(MN log MN) time, O(MN) space

TL;DR

def make_A(D, I, M):
    index = np.ravel(I)
    data = np.ravel(D)
    sorter = np.argsort(index)
    index = index[sorter]

    if index[0] < 0 or index[-1] >= M:
        raise ValueError('Bad indices')

    data = data[sorter]
    runs = np.r_[0, np.flatnonzero(np.diff(index)) + 1]
    a = np.add.reduceat(data, runs)
    if a.size == M:
        return a
    A = np.zeros(M)
    A[index[runs]] = a
    return A
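
A quick check against the question's example (M = 2):

D = [[0.1, 0.1, 0.2], [0.2, 0.4, 0.1]]
I = [[0, 1, 0], [0, 1, 1]]
print(make_A(D, I, 2))  # [0.5 0.6]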
answered Oct 18 '22 by Mad Physicist

If you know the size of A beforehand, as it seems you do, you can simply use add.at:

import numpy as np

D = [[0.1, 0.1, 0.2], [0.2, 0.4, 0.1]]
I = [[0, 1, 0], [0, 1, 1]]

arr_D = np.array(D)
arr_I = np.array(I)

A = np.zeros(2)

np.add.at(A, arr_I, arr_D)

print(A)

Output

[0.5 0.6]

If you don't know the size of A, you can use max to compute it:

A = np.zeros(arr_I.max() + 1)
np.add.at(A, arr_I, arr_D)
print(A)

Output

[0.5 0.6]

The time complexity of this algorithm is O(M + N): a single pass over the 3M data elements plus initializing the output of size N. The extra space is O(N) for A.

The expression

arr_I.max() + 1

is what bincount does under the hood; from the documentation:

The result of binning the input array. The length of out is equal to np.amax(x)+1.

That being said, bincount is at least one order of magnitude faster:
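The two helpers timed here wrap the approaches above; a sketch of plausible definitions, since the exact bodies are assumed:

def make_A_with_at(I, D, N):
    # Unbuffered accumulation: duplicates in I are summed, not overwritten.
    A = np.zeros(N)
    np.add.at(A, I, D)
    return A

def make_A_with_bincount(I, D):
    # bincount wants flat inputs; output length is I.max() + 1.
    return np.bincount(np.ravel(I), np.ravel(D))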

I = np.random.choice(1000, size=(1000, 3), replace=True)
D = np.random.random((1000, 3))
%timeit make_A_with_at(I, D, 1000)
213 µs ± 25 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit make_A_with_bincount(I, D)
11 µs ± 15.6 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
answered Oct 17 '22 by Dani Mesejo