Selecting elements in numpy array using regular expressions

Tags: python, regex, numpy

One may select elements in numpy arrays as follows:

import numpy as np

a = np.random.rand(100)
sel = a > 0.5  # select elements that are greater than 0.5
a[sel] = 0  # do something with the selection

b = np.array(list('abc abc abc'))
b[b == 'a'] = 'A'  # convert all the a's to A's

This property is used by the np.where function to retrieve indices:

indices = np.where(a>0.9)
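
For reference, a minimal sketch of what this call returns for a one-dimensional array. The fixed seed and the names mask and same are only illustrative assumptions, not part of the original snippet:

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
a = rng.random(100)

mask = a > 0.9              # boolean array with the same shape as a
indices = np.where(mask)    # tuple holding one index array per dimension

# Indexing with the tuple selects the same elements as the boolean mask.
same = np.array_equal(a[indices], a[mask])
```

With a single argument, np.where returns a tuple, so the index array itself is indices[0].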

What I would like to do is use regular expressions in such element selection. For example, to select the elements of b above that match the [Ab] regexp, I need to write the following code:

import re

regexp = '[Ab]'
selection = np.array([bool(re.search(regexp, element)) for element in b])

This looks too verbose to me. Is there a shorter and more elegant way to do this?

asked Jul 06 '11 11:07

Boris Gorelik

1 Answer

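There's some setup involved here, but unless numpy has some kind of direct support for regular expressions that I don't know about, this is the most "numpytonic" solution. It wraps the match in np.vectorize so that it can be applied to the whole array in one call. Note that np.vectorize is mainly a convenience: it still calls the Python function once per element, so the result is tidier than a list comprehension rather than substantially faster.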

import numpy as np
import re

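r = re.compile('[Ab]')
# r.match anchors at the start of the string; for single-character
# elements this gives the same result as re.search
vmatch = np.vectorize(lambda x: bool(r.match(x)))

A = np.array(list('abc abc abc'))
sel = vmatch(A)  # boolean mask, True where the element matches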
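
As a usage sketch, assuming the same pattern and input as above, the resulting mask behaves like any other boolean index. The otypes=[bool] argument is an optional addition here, not part of the snippet above; it pins the output dtype so that the call also works on an empty array. The names pattern, matched and positions are illustrative.

```python
import re
import numpy as np

pattern = re.compile('[Ab]')
# otypes fixes the result dtype instead of inferring it from the first element
vmatch = np.vectorize(lambda s: bool(pattern.match(s)), otypes=[bool])

A = np.array(list('abc abc abc'))
sel = vmatch(A)                 # True where the element is 'A' or 'b'

matched = A[sel]                # the selected elements
positions = np.where(sel)[0]    # their indices
A[sel] = 'X'                    # assign through the mask, as with a > 0.5
```

For a pattern that is only a character class, np.isin(A, ['A', 'b']) would give the same mask without any regular expression; the vectorized matcher is only needed for patterns that a plain membership test cannot express.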

answered Sep 18 '22 14:09

Paul
