
Selecting elements in numpy array using regular expressions

Elements of a numpy array can be selected with a boolean mask, as follows:

import numpy as np

a = np.random.rand(100)
sel = a > 0.5  # boolean mask: True where the element is greater than 0.5
a[sel] = 0  # do something with the selection

b = np.array(list('abc abc abc'))
b[b == 'a'] = 'A'  # convert all the a's to A's

This property is used by the np.where function to retrieve indices:

indices = np.where(a>0.9)
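
As a minimal sketch of what np.where returns here (the small fixed array and the names mask, idx and selected are illustrative assumptions, not part of the original code):

```python
import numpy as np

# Small fixed array used instead of random data, so the result is predictable
a = np.array([0.95, 0.10, 0.99, 0.50])

mask = a > 0.9             # boolean mask, one entry per element
indices = np.where(mask)   # tuple holding one index array per dimension
idx = indices[0]           # index array for the only dimension of a
selected = a[indices]      # the tuple can be used directly for indexing
```

For a one-dimensional array the tuple has a single entry, so indices[0] is the array of matching positions.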

What I would like to do is use regular expressions in this kind of element selection. For example, to select the elements of b above that match the regexp [Ab], I have to write the following code:

import re

regexp = '[Ab]'
selection = np.array([bool(re.search(regexp, element)) for element in b])

This looks too verbose to me. Is there a shorter and more elegant way to do this?

Boris Gorelik asked Jul 06 '11 11:07





1 Answer

There's some setup involved here, but unless numpy has some kind of direct support for regular expressions that I don't know about, this is the most "numpytonic" solution. It tries to make iteration over the array more efficient than standard Python iteration.

import numpy as np
import re

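r = re.compile('[Ab]')  # compile the pattern once
# r.match anchors at the start of each element; r.search would match anywhere
vmatch = np.vectorize(lambda x: bool(r.match(x)))

A = np.array(list('abc abc abc'))
sel = vmatch(A)  # boolean mask, usable as A[sel]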
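
A rough sketch of how the resulting mask might be used, assuming the same sample data as above. The names pattern, matched, positions and sel_loop are illustrative, and the otypes argument and the use of search instead of match are choices made for this example, not part of the answer:

```python
import re
import numpy as np

pattern = re.compile('[Ab]')

# otypes is set explicitly so the output is a boolean array even for empty input
vmatch = np.vectorize(lambda s: bool(pattern.search(s)), otypes=[bool])

A = np.array(list('abc abc abc'))
sel = vmatch(A)

matched = A[sel]              # elements that match the pattern
positions = np.where(sel)[0]  # indices of those elements

# The same mask built with a plain list comprehension, for comparison
sel_loop = np.array([bool(pattern.search(s)) for s in A])
```

Both approaches should give the same mask. The numpy documentation describes np.vectorize as a convenience wrapper that is essentially a for loop, so a large speedup over the list comprehension is probably not to be expected.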
Paul answered Sep 18 '22 14:09
