
Python: Using list comprehensions to filter a list by a list of substrings

I think this is a simple question, so I'll just go straight to an example.

Given these two lists:

x = ['a', 'ab', 'abc', 'bc', 'c', 'ac']
y = ['a', 'b']

How do I write a list comprehension to filter list x in such a way that the result would be:

result = ['c']

I want a list comprehension that excludes every string in x that contains any string in y as a substring. For example, 'a' in y would match 'a', 'ab', 'abc', and 'ac' in x.

This comprehension only excludes exact matches of entire strings: result = [r for r in x if r not in y]

If this has already been asked I'll gladly accept a link to a previous answer. That said, I haven't found one on SO yet.

craignewkirk asked Dec 07 '16 22:12



2 Answers

Use all:

result = [r for r in x if all(z not in r for z in y)]

Or any:

result = [r for r in x if not any(z in r for z in y)]
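
The two forms are equivalent by De Morgan's law: "no substring is present" says the same thing as "every substring is absent". A minimal check, reusing x and y from the question (the names with_all and with_any are only for illustration):

```python
x = ['a', 'ab', 'abc', 'bc', 'c', 'ac']
y = ['a', 'b']

# Keep r only if every substring z is absent from r.
with_all = [r for r in x if all(z not in r for z in y)]

# Keep r only if no substring z is present in r.
with_any = [r for r in x if not any(z in r for z in y)]

print(with_all)              # ['c']
print(with_any)              # ['c']
print(with_all == with_any)  # True
```

Both all and any short-circuit, so either form stops scanning y as soon as the answer for a given r is known.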
Chris Martin answered Sep 30 '22 00:09



This is a job for the any built-in.

>>> x = ['a', 'ab', 'abc', 'bc', 'c', 'ac']
>>> y = ['a', 'b']
>>> [r for r in x if not any(s in r for s in y)]
['c']

s in r does the partial (substring) match you want, for s in y runs that test against every element of y, and any is true as soon as one of them matches. The not then inverts it, so only strings with no match are kept.
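
To see what the filter decides for each element, it can help to spell out the intermediate step. This is only an illustrative sketch: the names hits and matched are not part of the original answer, and building the full list of matches gives up the short-circuiting that any provides.

```python
x = ['a', 'ab', 'abc', 'bc', 'c', 'ac']
y = ['a', 'b']

# Pair each string in x with the substrings from y that it contains.
hits = [(r, [s for s in y if s in r]) for r in x]
for r, matched in hits:
    print(repr(r), matched)

# A string survives the filter exactly when its list of matches is empty.
result = [r for r, matched in hits if not matched]
print(result)  # ['c']
```

Only 'c' ends up with an empty list of matches, which is why it is the sole survivor.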

This performs O(len(x) * len(y)) substring tests. If y is long, it may be more efficient to synthesize a regexp:

>>> import re
>>> yy = re.compile("|".join(re.escape(s) for s in y))
>>> [r for r in x if not yy.search(r)]
['c']

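which in principle needs only one pass over each string, roughly O(len(x) + len(y)) when the strings themselves are short. Python's re module is a backtracking matcher, though, and may still try the alternatives one at a time, so measure before relying on the speedup.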
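
One caveat with the regexp approach: if y is empty, the joined pattern is the empty string, which matches every input, so every element of x would be dropped (the any version keeps them all). A minimal sketch of a guarded helper follows; the function name exclude_substrings and its parameter names are hypothetical, chosen here for illustration.

```python
import re


def exclude_substrings(strings, substrings):
    """Return the strings that contain none of the given substrings."""
    if not substrings:
        # An empty alternation compiles to the empty pattern, which matches
        # every string and would filter everything out, so return early.
        return list(strings)
    # re.escape keeps characters such as '.' or '*' literal.
    pattern = re.compile("|".join(re.escape(s) for s in substrings))
    return [r for r in strings if not pattern.search(r)]


x = ['a', 'ab', 'abc', 'bc', 'c', 'ac']
y = ['a', 'b']
print(exclude_substrings(x, y))   # ['c']
print(exclude_substrings(x, []))  # ['a', 'ab', 'abc', 'bc', 'c', 'ac']
```

The re.escape call matters whenever y can contain regexp metacharacters: without it, a substring such as '.' would match any character rather than a literal dot.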

zwol answered Sep 30 '22 01:09
