Assume we have a NumPy array A with shape (N,), a matrix D with shape (M, 3) holding data, and another matrix I with shape (M, 3) holding the index in A of each element of D. How can we construct A from D and I so that data elements with repeated indices are summed?
Example:
######## A[I] := D (repeated indices are summed) ########
A = [0.5, 0.6]  # Final reduced data vector
D = [[0.1, 0.1, 0.2], [0.2, 0.4, 0.1]]  # Data
I = [[0, 1, 0], [0, 1, 1]]  # Indices
For example:
A[0] = D[0][0] + D[0][2] + D[1][0] # 0.5 = 0.1 + 0.2 + 0.2
Since in the index matrix we have:
I[0][0] = I[0][2] = I[1][0] = 0
The goal is to avoid looping over all elements, so the construction stays efficient for large N and M (10^6 to 10^9).
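As a reference point (not part of the original question), the required semantics are exactly those of the naive scatter-add loop we want to avoid; a minimal sketch, using the hypothetical name make_A_naive:
import numpy as np

def make_A_naive(D, I, N):
    # Explicit Python loops: this defines the target semantics exactly,
    # but is far too slow for M in the 10^6-10^9 range.
    A = np.zeros(N)
    for row_d, row_i in zip(D, I):
        for d, i in zip(row_d, row_i):
            A[i] += d
    return A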
I doubt you can get much faster than np.bincount; notice how the official documentation covers this exact use case.
# Your example
A = [0.5, 0.6]
D = [[0.1, 0.1, 0.2], [0.2, 0.4, 0.1]]
I = [[0, 1, 0], [0, 1, 1]]
# Solution
import numpy as np
D, I = np.array(D).flatten(), np.array(I).flatten()
print(np.bincount(I, weights=D))  # [0.5 0.6]
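One caveat (my addition, not part of the original answer): by default the result only extends up to the largest index actually present. If the size N of A is known and some indices might never occur, pass minlength:
N = 2  # known size of A, assumed here for the example
print(np.bincount(I, weights=D, minlength=N))  # [0.5 0.6]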
The shape of I and D doesn't matter: you can simply ravel the arrays without changing the outcome:
index = np.ravel(I)
data = np.ravel(D)
Now you can sort both arrays according to index:
sorter = np.argsort(index)
index = index[sorter]
data = data[sorter]
This is helpful because now index looks like this:
0, 0, 0, 1, 1, 1
And data is this:
0.1, 0.2, 0.2, 0.1, 0.4, 0.1
Adding together runs of consecutive equal indices is easier than processing scattered locations. Let's start by finding the positions where each run starts:
runs = np.r_[0, np.flatnonzero(np.diff(index)) + 1]
Now you can use the fact that ufuncs like np.add have a partial reduce operation called reduceat. This allows you to sum regions of an array:
a = np.add.reduceat(data, runs)
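Concretely, for the example (a quick check, my addition): np.diff(index) is [0, 0, 1, 0, 0], so runs is [0, 3], and reduceat sums data[0:3] and data[3:6]:
print(runs)  # [0 3]
print(a)     # [0.5 0.6]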
If I is guaranteed to contain all indices in [0, A.size) at least once, you're done: just assign to A instead of a. If not, you can make the mapping using the fact that the start of each run in index is the target index:
A = np.zeros(M)
A[index[runs]] = a
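To see why this last step matters, here is a small hypothetical case (my addition) where one index never occurs:
index = np.array([0, 0, 2, 2])  # already sorted; index 1 is missing, M = 3
data = np.array([0.1, 0.4, 0.2, 0.3])
runs = np.r_[0, np.flatnonzero(np.diff(index)) + 1]  # [0, 2]
a = np.add.reduceat(data, runs)  # [0.5, 0.5]: only two sums for three slots
A = np.zeros(3)
A[index[runs]] = a               # index[runs] == [0, 2]
print(A)                         # [0.5 0.  0.5]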
Algorithmic complexity analysis:
- ravel is O(1) in time and space if the data is in an array. If it's a list, this is O(MN) in time and space.
- argsort is O(MN log MN) in time and O(MN) in space.
- Applying sorter is O(MN) in time and space.
- Computing runs is O(MN) in time and O(MN + M) = O(MN) in space.
- reduceat is a single pass: O(MN) in time, O(M) in space.
- Filling A is O(M) in time and space.
- Total: O(MN log MN) time, O(MN) space.
TL;DR
def make_A(D, I, M):
    index = np.ravel(I)
    data = np.ravel(D)
    # Sort the data by target index so equal indices form contiguous runs
    sorter = np.argsort(index)
    index = index[sorter]
    if index[0] < 0 or index[-1] >= M:
        raise ValueError('Bad indices')
    data = data[sorter]
    # Start position of each run of equal indices
    runs = np.r_[0, np.flatnonzero(np.diff(index)) + 1]
    # Sum each run in a single vectorized pass
    a = np.add.reduceat(data, runs)
    if a.size == M:  # every index in [0, M) occurred at least once
        return a
    # Otherwise scatter the run sums into their target slots
    A = np.zeros(M)
    A[index[runs]] = a
    return A
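A quick sanity check against the question's example (my addition, assuming numpy is imported as np):
D = [[0.1, 0.1, 0.2], [0.2, 0.4, 0.1]]
I = [[0, 1, 0], [0, 1, 1]]
print(make_A(D, I, 2))  # [0.5 0.6]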
If you know the size of A beforehand, as it seems you do, you can simply use np.add.at:
import numpy as np
D = [[0.1, 0.1, 0.2], [0.2, 0.4, 0.1]]
I = [[0, 1, 0], [0, 1, 1]]
arr_D = np.array(D)
arr_I = np.array(I)
A = np.zeros(2)
np.add.at(A, arr_I, arr_D)
print(A)
Output
[0.5 0.6]
If you don't know the size of A, you can use max to compute it:
A = np.zeros(arr_I.max() + 1)
np.add.at(A, arr_I, arr_D)
print(A)
Output
[0.5 0.6]
The time complexity of this algorithm is O(N), and the space complexity is also O(N).
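Worth stressing (my note, not from the original answer): plain fancy-indexed in-place addition does not accumulate repeated indices, which is exactly why np.add.at exists:
A_wrong = np.zeros(2)
A_wrong[arr_I] += arr_D  # each slot receives only one of its contributions
print(A_wrong)           # not [0.5 0.6]; which duplicate wins is unspecified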
The expression:
arr_I.max() + 1
is what bincount does under the hood. From the documentation:
The result of binning the input array. The length of out is equal to np.amax(x)+1.
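For reference, the two functions timed below are not defined in the original; they are presumably thin wrappers along these lines (hypothetical reconstructions):
def make_A_with_at(I, D, M):
    A = np.zeros(M)
    np.add.at(A, I, D)
    return A

def make_A_with_bincount(I, D):
    return np.bincount(np.ravel(I), weights=np.ravel(D))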
That being said, bincount is at least one order of magnitude faster:
I = np.random.choice(1000, size=(1000, 3), replace=True)
D = np.random.random((1000, 3))
%timeit make_A_with_at(I, D, 1000)
213 µs ± 25 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit make_A_with_bincount(I, D)
11 µs ± 15.6 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)