Finding Patterns in a Numpy Array

Question

I am trying to find patterns in a numpy array called values, and I'd like to return the starting index of each occurrence of the pattern. I know I could iterate over each element and check whether that element and the next one match the pattern, but over a large dataset that is incredibly inefficient, so I am looking for a better alternative.

I've got a working solution using np.where for searching for a single value, but I can't get it to work for finding a pattern of two numbers.

Example:

import numpy as np
values = np.array([0,1,2,1,2,4,5,6,1,2,1])
searchval = [1,2]
print(np.where(values == searchval)[0])

Output:

[]

Expected Output:

[1, 3, 8]

hpaulj · Accepted Answer

Here's a straightforward approach using where. Start with a logical expression that finds the matches:

In [670]: values = np.array([0,1,2,1,2,4,5,6,1,2,1])
     ...: searchval = [1,2]
     ...: 
In [671]: (values[:-1]==searchval[0]) & (values[1:]==searchval[1])
Out[671]: array([False,  True, False,  True, False, False, False, False,  True, False], dtype=bool)
In [672]: np.where(_)
Out[672]: (array([1, 3, 8], dtype=int32),)

That could be generalized into a loop that handles a searchval of any length. Getting the slice ranges correct will take some fiddling. The roll suggested in another answer might be easier, but I suspect it is a bit slower.

As long as searchval is small compared to values, this general approach should be efficient. There is an np.in1d that does this sort of match, but with an or test, so it isn't applicable here. It too uses this iterative approach if the searchval list is small enough.
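To make the "or test" point concrete, here is a minimal sketch using np.isin, the newer spelling of np.in1d. The names mask and or_hits are only illustrative. It flags every element equal to any search value, so order and adjacency are ignored:

```python
import numpy as np

values = np.array([0, 1, 2, 1, 2, 4, 5, 6, 1, 2, 1])
searchval = [1, 2]

# np.isin marks every element that equals 1 OR 2, wherever it appears.
mask = np.isin(values, searchval)
or_hits = np.flatnonzero(mask)
print(or_hits)  # [ 1  2  3  4  8  9 10]
```

That result is a superset of the wanted [1, 3, 8], which is why a membership test alone cannot locate the sequence.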

Generalized slicing

In [716]: values
Out[716]: array([0, 1, 2, 1, 2, 4, 5, 6, 1, 2, 1])
In [717]: searchvals=[1,2,1]
     ...: m, n = len(values), len(searchvals)
In [718]: idx = [np.s_[i:m-n+1+i] for i in range(n)]
In [719]: idx
Out[719]: [slice(0, 9, None), slice(1, 10, None), slice(2, 11, None)]
In [720]: [values[idx[i]] == searchvals[i] for i in range(n)]
Out[720]: 
[array([False,  True, False,  True, False, False, False, False,  True], dtype=bool),
 array([False,  True, False,  True, False, False, False, False,  True], dtype=bool),
 array([False,  True, False, False, False, False,  True, False,  True], dtype=bool)]
In [721]: np.all(_, axis=0)
Out[721]: array([False,  True, False, False, False, False, False, False,  True], dtype=bool)
In [722]: np.where(_)
Out[722]: (array([1, 8], dtype=int32),)

I used the intermediate np.s_ to look at the slices and make sure they look reasonable.
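The slicing steps above could be wrapped into a small function along these lines. This is a sketch, not a tuned implementation; the name find_pattern and its handling of an empty or over-long pattern are my own assumptions:

```python
import numpy as np

def find_pattern(values, pattern):
    """Return start indices where `pattern` occurs in 1-D `values` (hypothetical helper)."""
    values = np.asarray(values)
    pattern = np.asarray(pattern)
    m, n = len(values), len(pattern)
    # Assumption: an empty or over-long pattern yields no matches.
    if n == 0 or n > m:
        return np.array([], dtype=int)
    # Start with every candidate start position marked as a match ...
    match = np.ones(m - n + 1, dtype=bool)
    # ... then AND in one shifted comparison per pattern element.
    for i in range(n):
        match &= values[i:m - n + 1 + i] == pattern[i]
    return np.flatnonzero(match)

values = np.array([0, 1, 2, 1, 2, 4, 5, 6, 1, 2, 1])
print(find_pattern(values, [1, 2]))      # [1 3 8]
print(find_pattern(values, [1, 2, 1]))   # [1 8]
```

The loop runs once per pattern element, not once per array element, so it stays cheap while the pattern is short.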

as_strided

An advanced trick would be to use as_strided to construct the 'rolled' array and perform a 2d == test on that. as_strided is neat but tricky. To use it correctly you have to understand strides, and get the shape correct.

In [740]: m,n = len(values), len(searchvals)
In [741]: values.shape
Out[741]: (11,)
In [742]: values.strides
Out[742]: (4,)
In [743]: from numpy.lib.stride_tricks import as_strided
     ...: M = as_strided(values, shape=(n,m-n+1),strides=(4,4))
In [744]: M
Out[744]: 
array([[0, 1, 2, 1, 2, 4, 5, 6, 1],
       [1, 2, 1, 2, 4, 5, 6, 1, 2],
       [2, 1, 2, 4, 5, 6, 1, 2, 1]])
In [745]: M == np.array(searchvals)[:,None]
Out[745]: 
array([[False,  True, False,  True, False, False, False, False,  True],
       [False,  True, False,  True, False, False, False, False,  True],
       [False,  True, False, False, False, False,  True, False,  True]], dtype=bool)
In [746]: np.where(np.all(_,axis=0))
Out[746]: (array([1, 8], dtype=int32),)
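Note that the hard-coded strides=(4,4) only works when the array items are 4 bytes wide, as in the session above; values.strides[0] would be the portable choice. As an alternative to as_strided (not what the answer itself uses), newer NumPy releases (1.20 and later, as far as I know) provide sliding_window_view, which works out the shape and strides for you. A sketch, with windows and starts as illustrative names:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

values = np.array([0, 1, 2, 1, 2, 4, 5, 6, 1, 2, 1])
searchvals = np.array([1, 2, 1])

# Each row of `windows` is one length-n window, so the shape is (m - n + 1, n).
# This is the transpose of the M built with as_strided above.
windows = sliding_window_view(values, len(searchvals))

# Compare every window with the pattern and keep the rows where all elements match.
starts = np.flatnonzero((windows == searchvals).all(axis=1))
print(starts)  # [1 8]
```

The windows are a read-only view by default, so this avoids the risk of writing through overlapping memory that comes with as_strided.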

Ed Smith · Answer

Couldn't you simply use np.where (assuming this is the optimal way to find an element) and then only check patterns which satisfy the first condition?

import numpy as np
values = np.array([0,1,2,1,2,4,5,6,1,2,1])
searchval = [1,2]
N = len(searchval)
possibles = np.where(values == searchval[0])[0]

solns = []
for p in possibles:
    # A slice that runs past the end is shorter than N and cannot match.
    check = values[p:p+N]
    if np.array_equal(check, searchval):
        solns.append(p)

print(solns)

jnsod · Answer

I think this does the job:

np.where((values == 1) & (np.roll(values,-1) == 2))[0]
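One caveat with roll is that it wraps around: the first element is moved to the end, so an array that ends with 1 and starts with 2 reports a match at the last index. A small sketch of the effect and one possible guard; the array wrapped is a made-up input chosen to trigger the wrap:

```python
import numpy as np

# Hypothetical input: ends with 1 and starts with 2, so a "pair" spans the wrap.
wrapped = np.array([2, 0, 1, 2, 5, 1])

# np.roll(wrapped, -1) is [0, 1, 2, 5, 1, 2]: the first element moves to the end.
raw = np.where((wrapped == 1) & (np.roll(wrapped, -1) == 2))[0]

# Masking out the last position discards the comparison that wrapped around.
mask = (wrapped == 1) & (np.roll(wrapped, -1) == 2)
mask[-1] = False
fixed = np.flatnonzero(mask)

print(raw)    # [2 5]
print(fixed)  # [2]
```

For the values array in the question the wrap is harmless, because the first element is 0.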
