My very first post and question here...
So, let list_a be a list of lists:
list_a = [[2,7,8], [3,4,2], [5,10], [4], [2,3,5]...]
Let list_b be another list of integers: list_b = [5,7]
I need to exclude all lists in list_a whose items include at least one item from list_b. The result for the example above should look like list_c = [[3,4,2], [4]...]
If list_b were not a list but a single number b, then one could define list_c in one line as:
list_c = [x for x in list_a if not b in x]
I am wondering whether it is possible to write an elegant one-liner for a list_b with several values in it as well. Of course, I can just loop through all of list_b's values, but maybe there is a faster option?
Let's first consider the task of checking an individual element of list_a, such as [2,7,8], because no matter what, we're conceptually going to need a way to do that, and then we're going to apply it to the whole list with a list comprehension. I'll use a as the name for such a list, and b for an element of list_b.
The straightforward way to write this is using the any builtin, which works elegantly in combination with generator expressions: any(b in a for b in list_b).
The logic is simple: we create a generator expression (like a lazily-evaluated list comprehension) to represent the result of the b in a check applied to each b in list_b. A generator expression is written like a list comprehension with the [] replaced by (); due to a special syntax rule, we may drop those parentheses when the expression is the sole argument to a function. Then any does exactly what it sounds like: it checks (with early bail-out) whether any of the elements in the iterable (which includes generator expressions) is truthy.
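As a minimal sketch of that check on a single sublist, using the sample values from the question (the helper name shares_item is made up here for illustration):

```python
def shares_item(a, list_b):
    """Return True if at least one element of list_b occurs in a."""
    # Generator expression as the sole argument: no extra parentheses needed.
    return any(b in a for b in list_b)

list_b = [5, 7]
print(shares_item([2, 7, 8], list_b))  # True: 7 appears in both
print(shares_item([3, 4, 2], list_b))  # False: nothing in common
```

Because any stops at the first truthy result, the second b is never tested once a match has been found.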
However, we can likely do better by taking advantage of set intersection. The key insight is that the test we are trying to do is symmetric; considering the test between a and list_b (and coming up with another name, x, for elements of a), we could equally have written any(x in list_b for x in a), except that it's harder to understand that.
Now, it doesn't help to make a set from a, because we have to iterate over a anyway in order to do that. (The generator expression does that implicitly; the in operator, used for list membership, requires iteration.) However, if we make a set from list_b, then we can do that once, ahead of time, and just have any(x in set_b for x in a).
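A rough sketch of that variant, again with the question's sample values (the helper name overlaps is hypothetical):

```python
list_b = [5, 7]
set_b = set(list_b)  # built once, ahead of time, and reused for every sublist

def overlaps(a, lookup):
    """Return True if any element of a is a member of lookup."""
    # Membership tests against a set do not need to scan every element.
    return any(x in lookup for x in a)

print(overlaps([2, 7, 8], set_b))  # True
print(overlaps([4], set_b))        # False
```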
But that, in turn, is a) as described above, hard to understand; and b) overlooking the built-in machinery of sets. The operator & normally used for set intersection requires a set on both sides, but the named method .intersection does not. Thus, set_b.intersection(a) does the trick: it returns an empty set, which is falsy, exactly when a shares no item with list_b.
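To illustrate the difference between the operator and the method, here is a small sketch with one sublist from the question:

```python
set_b = {5, 7}
a = [2, 7, 8]

# The named method accepts any iterable, so a plain list works.
common = set_b.intersection(a)
print(common)  # {7}

# An empty result is falsy, which is what makes "not ..." usable as a filter.
print(bool(set_b.intersection([3, 4, 2])))  # False

# The & operator insists on a set on both sides.
try:
    set_b & a
    operator_worked = True
except TypeError:
    operator_worked = False
print(operator_worked)  # False
```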
Putting it all together, we get:
set_b = set(list_b)
list_c = [a for a in list_a if not set_b.intersection(a)]
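For completeness, a runnable version with the sample data from the question (the trailing ... in the original lists is left out, so the lists here are only the elements shown), cross-checked against the any-based approach:

```python
list_a = [[2, 7, 8], [3, 4, 2], [5, 10], [4], [2, 3, 5]]
list_b = [5, 7]

set_b = set(list_b)
# Keep a sublist only when its intersection with set_b is empty.
list_c = [a for a in list_a if not set_b.intersection(a)]
print(list_c)  # [[3, 4, 2], [4]]

# The any-based check should select the same sublists.
list_c_any = [a for a in list_a if not any(b in a for b in list_b)]
print(list_c == list_c_any)  # True
```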