Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Efficient identification of adjacent elements in numpy matrix

Tags:

python

numpy

I have a 100 by 100 numpy matrix. The matrix is mostly filled with zeros, but also contains some number of ints. For example:

[0 0 0 0 0 0 0 1]
[0 2 2 0 0 0 0 0]
[0 0 2 0 0 0 0 0]  False
[0 0 0 0 0 0 0 0]
[0 3 3 0 0 0 0 0]

What is the most efficient way to identify whether the matrix contains any number of adjacent ints of distinct types?

The above example would return False. Here is a True example, with the row containing the adjacency indicated:

[0 0 0 0 0 0 0 1]
[0 2 2 1 1 0 0 0]   <----  True
[0 0 2 0 0 0 0 0]  
[0 0 0 0 0 0 0 0]
[0 3 3 0 0 0 0 0]

Diagonals do not count as being adjacent. So this example would also return False:

[0 0 0 1 1 1 1 1]
[0 2 2 0 1 0 0 0]
[0 0 2 0 0 0 0 0]   False
[0 3 0 0 0 0 0 0]
[3 3 3 0 0 0 0 0]

I do not need to identify the position of the adjacency, just whether it exists of not.

Currently, I cannot do better than finding each no-zero element in the matrix and then checking its 4 flanking elements.

Thanks for all the great answers.

like image 703
Chris Parry Avatar asked Dec 19 '16 02:12

Chris Parry


2 Answers

If you can use scipy this will be quite easy using ndimage.label and ndimage.labeled_comprehension:

import numpy as np
from scipy import ndimage

def multiple_unique_item(array):
    return len(np.unique(array)) > 1

def adjacent_diff(array):
    labeled_array, num_labels = ndimage.label(array)
    labels = np.arange(1, num_labels+1)
    any_multiple = ndimage.labeled_comprehension(array, labeled_array, labels, 
                                                 multiple_unique_item, bool, 0)
    return any_multiple.any()

label defaults to label neighboring values different from 0 without diagonals. Then the comprehension passes all values associated with a label to the helper function - which checks if more than one unique value is present. Finally it checks if any label had more than one value and returns this.

To test this on your test input arrays:

arr1 = np.array([[0,0,0,0,0,0,0,1],
                 [0,2,2,1,1,0,0,0],  
                 [0,0,2,0,0,0,0,0],
                 [0,0,0,0,0,0,0,0],
                 [0,3,3,0,0,0,0,0]])

arr2 = np.array([[0,0,0,1,1,1,1,1],
                 [0,2,2,0,1,0,0,0],
                 [0,0,2,0,0,0,0,0],  
                 [0,3,0,0,0,0,0,0],
                 [3,3,3,0,0,0,0,0]])

arr3 = np.array([[0,0,0,0,0,0,0,1],
                 [0,2,2,0,0,0,0,0],
                 [0,0,2,0,0,0,0,0],  
                 [0,0,0,0,0,0,0,0],
                 [0,3,3,0,0,0,0,0]])

>>> adjacent_diff(arr1)
True
>>> adjacent_diff(arr2)
False
>>> adjacent_diff(arr3)
False
like image 67
MSeifert Avatar answered Nov 03 '22 02:11

MSeifert


Looking at the description of your problem, it's probably not too much computational effort to check each possible nonzero integer value for its position in the array, and see if there are intersections. Now, this is generally overkill, but on your scale it might work: you can get the indices of each integer collection, and compute their distances using scipy.spatial.distance.cdist. I'm sure that some smart solution based on diff or something else is more efficient, but I had fun anyway:

import numpy as np
from scipy.spatial.distance import cdist
from itertools import combinations

M1 = np.array(
[[0,0,0,0,0,0,0,1],
 [0,2,2,1,1,0,0,0],  
 [0,0,2,0,0,0,0,0],
 [0,0,0,0,0,0,0,0],
 [0,3,3,0,0,0,0,0]])

M2 = np.array(
[[0,0,0,1,1,1,1,1],
 [0,2,2,0,1,0,0,0],
 [0,0,2,0,0,0,0,0],  
 [0,3,0,0,0,0,0,0],
 [3,3,3,0,0,0,0,0]])

def overlaps_eh(M):
    uniques = np.delete(np.unique(M),0) # get integers present
    unival_inds = [np.transpose(np.where(M==unival)) for unival in uniques]
    # unival_inds[k] contains the i,j indices of each element with the kth unique value

    for i1,i2 in combinations(range(len(unival_inds)),2):
        if np.any(cdist(unival_inds[i1],unival_inds[i2],'cityblock')==1):
            return True
    # if we're here: no adjacencies
    return False

First we collect the nonzero unique matrix elements into the array uniques. Then for each unique value we find the i,j indices of each element in the input array that has this value. Then we check each pair of unique values (generated using itertools.combinations), and use scipy.spatial.distance.cdist to measure the pair-wise distance of each pair of matrix elements. Using the Manhattan distance, if any pair of elements has distance 1, they are adjacent. So we only have to return True in case any of these distances is 1, otherwise we return False.

like image 29