Vectorizing a Nested Loop

Question

I am looking to vectorize a nested loop that works on a list of 300,000 lists, each containing 3 values. The nested loop compares the values of each list with the corresponding values in every other list, and appends the indices of a pair of lists only when all three corresponding values differ by less than 0.1. Thus, a list containing [0.234, 0.456, 0.567] and a list containing [0.246, 0.479, 0.580] would fall in this category, since their corresponding values (i.e. 0.234 and 0.246; 0.456 and 0.479; 0.567 and 0.580) each differ by less than 0.1.
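To make the criterion concrete, here is a minimal sketch that checks the two example lists above. It assumes the condition can be restated as "the largest per-coordinate difference is below 0.1"; the names a, b, max_diff and is_close are only illustrative.

```python
import numpy as np

# The two example lists from the question
a = np.array([0.234, 0.456, 0.567])
b = np.array([0.246, 0.479, 0.580])

# Largest absolute difference across the three corresponding values
max_diff = np.max(np.abs(a - b))

# The pair qualifies when every coordinate differs by less than 0.1,
# which is the same as the largest difference being below 0.1
is_close = bool(max_diff < 0.1)
print(max_diff, is_close)
```

The largest per-coordinate difference is also known as the Chebyshev (maximum-norm) distance between the two lists.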

I currently use the following nested loop to do this, but it would take approximately 58 hours to complete (300,000 × 300,000 = 90 billion iterations in total):

import numpy as np

variable = np.random.random((300000, 3)).tolist()
out1 = list()
out2 = list()
for i in range(300000):
    for j in range(300000):
        # Keep each pair once (i < j) and only if all three values differ by less than 0.1
        if ((i < j)
                and (abs(variable[i][0] - variable[j][0]) < 0.1)
                and (abs(variable[i][1] - variable[j][1]) < 0.1)
                and (abs(variable[i][2] - variable[j][2]) < 0.1)):
            out1.append(i)
            out2.append(j)

Eelco Hoogendoorn · Accepted Answer

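Look into scipy.spatial; it has a lot of functionality for solving such spatial queries efficiently, KD-trees in particular. With p set to infinity, the distance between two points is the largest difference between their corresponding values, which matches the condition in the question. For example:

import numpy as np
import scipy.spatial

# r is the distance threshold; p=np.inf selects the maximum (Chebyshev) norm
out = scipy.spatial.cKDTree(variable).query_pairs(r=0.1, p=np.inf)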
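As a rough sanity check, the sketch below runs the same query on a much smaller random sample and compares it with a brute-force NumPy comparison. The sample size, the seed, and the names points, pairs, i and j are assumptions made for illustration. It also assumes a SciPy version in which query_pairs accepts output_type="ndarray", which returns an array of index pairs rather than a set. Note that query_pairs reports pairs at distance at most r, whereas the question uses a strict "less than"; for random floating-point data the two should rarely differ.

```python
import numpy as np
from scipy.spatial import cKDTree

# Small illustrative sample; the question uses 300,000 rows
rng = np.random.default_rng(0)
points = rng.random((1000, 3))

# KD-tree query: all index pairs (i < j) whose max-norm distance is at most 0.1
tree = cKDTree(points)
pairs = tree.query_pairs(r=0.1, p=np.inf, output_type="ndarray")
out1, out2 = pairs[:, 0], pairs[:, 1]

# Brute-force reference: pairwise largest per-coordinate difference,
# feasible only because the sample is small (memory grows with n squared)
diff = np.abs(points[:, None, :] - points[None, :, :]).max(axis=2)
i, j = np.nonzero(np.triu(diff <= 0.1, k=1))

# Both approaches should report the same set of pairs
same = set(map(tuple, pairs.tolist())) == set(zip(i.tolist(), j.tolist()))
print(len(pairs), same)
```

For the full 300,000 rows the number of matching pairs is likely to be very large, so the array output may be considerably lighter on memory than the default set of tuples.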

Tags:

python

vectorization

numpy

JBorg