
Check if substring is in a list of strings?

I have found some answers to this question before, but they seem to be obsolete for the current Python versions (or at least they don't work for me).

I want to check if a substring is contained in a list of strings. I only need the boolean result.

I found this solution:

word_to_check = 'or'
wordlist = ['yellow','orange','red']

result = any(word_to_check in word for word in worldlist)

From this code I would expect to get a True value. If the word was "der", then the output should be False.

However, what I get back is a generator object rather than True, and I can't find a way to get the boolean value.

Any idea?

Álvaro asked May 05 '13 00:05


3 Answers

Posted code

The OP's posted code using any() is correct and should work once the misspelled "worldlist" is corrected to "wordlist". As posted, that name is undefined, so the line would raise a NameError.
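For reference, here is a minimal sketch of the posted approach with the variable name corrected. The helper name contains_substring is only for illustration and is not from the question.

```python
def contains_substring(needle, words):
    """Return True if needle occurs inside at least one string in words."""
    # any() consumes the generator expression and returns a plain bool,
    # stopping at the first word that matches.
    return any(needle in word for word in words)

wordlist = ['yellow', 'orange', 'red']

print(contains_substring('or', wordlist))   # True
print(contains_substring('der', wordlist))  # False
```

If a call like this returns something other than True or False, the name any has probably been rebound to a different function.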

Alternate approach with str.join()

That said, there is a simple and fast solution to be had by using the substring search on a single combined string:

>>> wordlist = ['yellow','orange','red']
>>> combined = '\t'.join(wordlist)

>>> 'or' in combined
True
>>> 'der' in combined
False

For short wordlists, this is several times faster than the approach using any.

And if the combined string can be precomputed before the search, the in-operator search will always beat the any approach even for large wordlists.
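One thing to keep in mind with the joined string is that the separator does real work: it stops a match from spanning two neighbouring words. The sketch below assumes the tab character never appears in the words themselves; the helper names build_haystack and found are hypothetical.

```python
SEP = '\t'  # assumed separator; any character absent from the data would do

def build_haystack(words, sep=SEP):
    """Join words into one searchable string, refusing words that contain sep."""
    for word in words:
        if sep in word:
            raise ValueError(f"separator {sep!r} found in {word!r}")
    return sep.join(words)

def found(needle, haystack, sep=SEP):
    """True if needle lies entirely inside one of the joined words."""
    # A needle containing the separator could only match across a word
    # boundary, so it is rejected outright.
    return sep not in needle and needle in haystack

wordlist = ['yellow', 'orange', 'red']
haystack = build_haystack(wordlist)

print(found('or', haystack))       # True
print(found('der', haystack))      # False
print(found('wo', haystack))       # False: 'yellow' and 'orange' stay separate
print('wo' in ''.join(wordlist))   # True: joining without a separator leaks
```

Building the haystack once and reusing it is what makes the precomputed variant pay off.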

Alternate approach with sets

The O(n) search speed can be reduced to O(1) if a substring set is precomputed in advance and if we don't mind using more memory.

Precomputed step:

from itertools import combinations

def substrings(word):
    for i, j in combinations(range(len(word) + 1), 2):
        yield word[i : j]

wordlist = ['yellow','orange','red']
word_set = set().union(*map(substrings, wordlist))

Fast O(1) search step:

>>> 'or' in word_set
True
>>> 'der' in word_set
False
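To get a feel for the memory trade-off, note that a word of length n yields n * (n + 1) / 2 (start, end) pairs, so the set grows roughly quadratically with word length. The following sketch reuses the substrings generator from above and just counts things; the variable names total_chars and pair_count are made up for the example.

```python
from itertools import combinations

def substrings(word):
    # Every (i, j) pair with i < j selects one non-empty slice of word.
    for i, j in combinations(range(len(word) + 1), 2):
        yield word[i:j]

wordlist = ['yellow', 'orange', 'red']
word_set = set().union(*map(substrings, wordlist))

total_chars = sum(len(w) for w in wordlist)
pair_count = sum(len(w) * (len(w) + 1) // 2 for w in wordlist)

print(total_chars)    # characters stored in the original list
print(pair_count)     # slices generated before de-duplication
print(len(word_set))  # distinct substrings actually kept in the set
```

For short words this overhead is small, but for long strings the precomputed set can become far larger than the input.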
Raymond Hettinger answered Oct 28 '22 15:10


If the built-in any has been replaced by some other any (for example through a star import), you can import the original explicitly from __builtin__ (Python 2):

>>> from __builtin__ import any as b_any
>>> lst = ['yellow', 'orange', 'red']
>>> word = "or"
>>> b_any(word in x for x in lst)
True

Note that in Python 3 __builtin__ has been renamed to builtins.
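A Python 3 version of the same idea might look like this. It is only a sketch: it looks any up on the builtins module so that a shadowed name in the current namespace does not matter.

```python
import builtins

lst = ['yellow', 'orange', 'red']
word = 'or'

# builtins.any is always the original built-in, even if the bare name
# `any` has been rebound in this module.
result = builtins.any(word in x for x in lst)

print(result)  # True
```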

Ashwini Chaudhary answered Oct 28 '22 13:10


You could use next instead:

colors = ['yellow', 'orange', 'red'] 
search = "or"

result = next((True for color in colors if search in color), False)

print(result) # True

To get the strings that contain the substring:

colors = ['yellow', 'orange', 'red'] 
search = "or"

result = [color for color in colors if search in color]  

print(result) # ['orange']
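If only the first matching string is wanted, next can return it directly and fall back to a default when nothing matches. A small sketch, with first_match as a hypothetical helper name:

```python
def first_match(search, words):
    """Return the first string in words containing search, or None."""
    # The generator is consumed lazily, so iteration stops at the first hit.
    return next((word for word in words if search in word), None)

colors = ['yellow', 'orange', 'red']

print(first_match('or', colors))   # orange
print(first_match('der', colors))  # None
```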
stderr answered Oct 28 '22 14:10