Check if list1 contains any elements of list2 using any(): Python's any() function returns True if any element of the given iterable is truthy. So, for each element of list2, test whether it is in list1, and pass those results to any().
Using count(): the Python list method count() returns how many times an element occurs in a list. So if every element of the list is the same, the length of the list from len() equals the count() of that element.
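As a minimal sketch of the count() idea, the check can be wrapped in a small helper. The name all_identical is hypothetical, and treating an empty list as "all identical" is an assumption rather than something the text above specifies.

```python
def all_identical(values):
    """True if the list is empty or every element equals the first one."""
    # Assumption: an empty list counts as "all identical".
    return not values or values.count(values[0]) == len(values)


print(all_identical(['a', 'a', 'a']))  # True
print(all_identical(['a', 'b', 'a']))  # False
```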
There are two ways to interpret "check if the list contains elements of another list". First, use the all() function to check whether a Python list contains all the elements of another list. Second, use the any() function to check whether the list contains any element of the other one.
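The two readings can give different answers for the same input. A short illustration, using made-up sample lists (list1, list2, has_all and has_any are names chosen here for the example):

```python
list1 = ['b', 'a', 'foo', 'bar']
list2 = ['a', 'x']  # 'a' is in list1, 'x' is not

# all(): every element of list2 must be found in list1.
has_all = all(item in list1 for item in list2)

# any(): one matching element is enough.
has_any = any(item in list1 for item in list2)

print(has_all)  # False
print(has_any)  # True
```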
Using list.sort(): the method sorts each of the two lists in place, and the == operator then compares the two lists item by item, which means they must have equal items at equal positions. This checks whether the lists contain the same item values, without taking into account the original order of the elements.
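A possible sketch of the sort-and-compare approach. It swaps in sorted() for list.sort() so that the input lists are not modified, and it assumes the elements can be ordered relative to each other; same_items is a hypothetical name.

```python
def same_items(list_a, list_b):
    """True if both lists hold the same values with the same multiplicities."""
    # sorted() returns new lists, so the arguments are left untouched.
    return sorted(list_a) == sorted(list_b)


print(same_items([3, 1, 2], [2, 3, 1]))  # True
print(same_items([1, 1, 2], [1, 2, 2]))  # False
```

Note that this is an equality test on contents, not a containment test: duplicates matter, and both lists must have the same length.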
This does what you want, and will work in nearly all cases:
>>> all(x in ['b', 'a', 'foo', 'bar'] for x in ['a', 'b'])
True
The expression 'a','b' in ['b', 'a', 'foo', 'bar']
doesn't work as expected because Python interprets it as a tuple:
>>> 'a', 'b'
('a', 'b')
>>> 'a', 5 + 2
('a', 7)
>>> 'a', 'x' in 'xerxes'
('a', True)
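Adding parentheses does not turn the expression into an "all of these" test either. A small sketch (haystack and the result names are chosen for this example) showing that a parenthesized tuple is looked up as a single element:

```python
haystack = ['b', 'a', 'foo', 'bar']

# Without parentheses: a tuple whose second member is the result of `in`.
unparenthesized = 'a', 'b' in haystack
print(unparenthesized)  # ('a', True)

# With parentheses: asks whether the tuple ('a', 'b') is itself an element.
tuple_lookup = ('a', 'b') in haystack
print(tuple_lookup)  # False

tuple_found = ('a', 'b') in [('a', 'b'), 'foo']
print(tuple_found)  # True
```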
There are other ways to execute this test, but they won't work for as many different kinds of inputs. As Kabie points out, you can solve this problem using sets...
>>> set(['a', 'b']).issubset(set(['a', 'b', 'foo', 'bar']))
True
>>> {'a', 'b'} <= {'a', 'b', 'foo', 'bar'}
True
...sometimes:
>>> {'a', ['b']} <= {'a', ['b'], 'foo', 'bar'}
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'
Sets can only be created with hashable elements. But the generator expression all(x in container for x in items) can handle almost any container type. The only requirement is that container be re-iterable (i.e. not a generator). items can be any iterable at all.
>>> container = [['b'], 'a', 'foo', 'bar']
>>> items = (i for i in ('a', ['b']))
>>> all(x in container for x in items)
True
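To see why the container has to be re-iterable, here is a small sketch (the variable names are chosen for this example). Each `in` test against a generator consumes it up to the match, so later tests only see what is left:

```python
items = ['a', 'b']

as_list = ['b', 'a', 'foo', 'bar']
with_list = all(x in as_list for x in items)
print(with_list)  # True

# The search for 'a' consumes 'b' and 'a'; the search for 'b' then
# only sees 'foo' and 'bar', so it fails.
as_generator = (x for x in ['b', 'a', 'foo', 'bar'])
with_generator = all(x in as_generator for x in items)
print(with_generator)  # False
```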
In many cases, the subset test will be faster than all(), but the difference isn't shocking, except when the question is irrelevant because sets aren't an option. Converting lists to sets just for the purpose of a test like this won't always be worth the trouble. And converting generators to sets can sometimes be incredibly wasteful, slowing programs down by many orders of magnitude.
Here are a few benchmarks for illustration. The biggest difference comes when both container and items are relatively small. In that case, the subset approach is about an order of magnitude faster:
>>> smallset = set(range(10))
>>> smallsubset = set(range(5))
>>> %timeit smallset >= smallsubset
110 ns ± 0.702 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
>>> %timeit all(x in smallset for x in smallsubset)
951 ns ± 11.5 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
This looks like a big difference. But as long as container is a set, all() is still perfectly usable at vastly larger scales:
>>> bigset = set(range(100000))
>>> bigsubset = set(range(50000))
>>> %timeit bigset >= bigsubset
1.14 ms ± 13.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
>>> %timeit all(x in bigset for x in bigsubset)
5.96 ms ± 37 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
Using subset testing is still faster, but only by about 5x at this scale. The speed boost is due to Python's fast C-backed implementation of set, but the fundamental algorithm is the same in both cases.
If your items are already stored in a list for other reasons, then you'll have to convert them to a set before using the subset test approach. Then the speedup drops to about 2.5x:
>>> %timeit bigset >= set(bigsubseq)
2.1 ms ± 49.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
And if your container is a sequence, and needs to be converted first, then the speedup is even smaller:
>>> %timeit set(bigseq) >= set(bigsubseq)
4.36 ms ± 31.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
The only time we get disastrously slow results is when we leave container as a sequence:
>>> %timeit all(x in bigseq for x in bigsubseq)
184 ms ± 994 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
And of course, we'll only do that if we must. If all the items in bigseq are hashable, then we'll do this instead:
>>> %timeit bigset = set(bigseq); all(x in bigset for x in bigsubseq)
7.24 ms ± 78 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
The alternative (set(bigseq) >= set(bigsubseq), timed above at 4.36 ms) is just 1.66x faster than that.
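The benchmark snippets use bigseq and bigsubseq without showing how they were built. For anyone who wants to rerun the comparison outside IPython, here is a scaled-down sketch using the standard timeit module. The names seq and subseq, their sizes, and the three function names are assumptions made for this example, so the absolute numbers will not match the ones above.

```python
import timeit

# Assumption: plain lists standing in for the undefined bigseq / bigsubseq,
# kept small so that the slow list-scanning variant finishes quickly.
seq = list(range(2000))
subseq = list(range(1000))


def subset_test():
    return set(seq) >= set(subseq)


def set_then_all():
    container = set(seq)
    return all(x in container for x in subseq)


def all_on_list():
    return all(x in seq for x in subseq)


def time_ms(func, number=3):
    """Average wall-clock time per call, in milliseconds."""
    return timeit.timeit(func, number=number) / number * 1000


for func in (subset_test, set_then_all, all_on_list):
    print(f"{func.__name__}: {time_ms(func):.3f} ms")
```

All three functions return the same answer; only the running time differs, so it is worth timing them on data shaped like your own.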
So subset testing is generally faster, but not by an incredible margin. On the other hand, let's look at when all() is faster. What if items is ten million values long, and is likely to have values that aren't in container?
>>> %timeit hugeiter = (x * 10 for bss in [bigsubseq] * 2000 for x in bss); set(bigset) >= set(hugeiter)
13.1 s ± 167 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
>>> %timeit hugeiter = (x * 10 for bss in [bigsubseq] * 2000 for x in bss); all(x in bigset for x in hugeiter)
2.33 ms ± 65.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
Converting the generator into a set turns out to be incredibly wasteful in this case. The set constructor has to consume the entire generator. But the short-circuiting behavior of all() ensures that only a small portion of the generator needs to be consumed, so it's faster than the subset test by a factor of more than 5,000 (13.1 s versus 2.33 ms).
This is an extreme example, admittedly. But as it shows, you can't assume that one approach or the other will be faster in all cases.
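The short-circuiting effect can be made visible by counting how many items each approach pulls from a generator. This is a small illustrative sketch; counted() is a hypothetical helper and the data is made up for the example.

```python
def counted(iterable, counter):
    """Yield items unchanged while tallying how many were consumed."""
    for item in iterable:
        counter["n"] += 1
        yield item


container = set(range(100))
items = range(0, 1000, 10)  # 100 values; the 11th one (100) is not in container

seen_by_all = {"n": 0}
result_all = all(x in container for x in counted(items, seen_by_all))

seen_by_set = {"n": 0}
result_subset = container >= set(counted(items, seen_by_set))

print(result_all, seen_by_all["n"])     # False 11
print(result_subset, seen_by_set["n"])  # False 100
```

all() stops at the first missing value, while set() has to read every item before the comparison can even start.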
Most of the time, converting container to a set is worth it, at least if all its elements are hashable. That's because membership testing with in is O(1) on average for sets, while it is O(n) for sequences.
On the other hand, using subset testing is probably only worth it sometimes. Definitely do it if your test items are already stored in a set. Otherwise, all() is only a little slower, and doesn't require any additional storage. It can also be used with large generators of items, and sometimes provides a massive speedup in that case.
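One way to combine this advice is a helper that tries the set conversion and falls back to scanning the sequence when hashing is not possible. This is only a sketch: contains_all_items is a hypothetical name, and it assumes container is a re-iterable sequence such as a list or tuple.

```python
def contains_all_items(container, items):
    """True if every item of `items` occurs in `container`.

    Assumes `container` is a re-iterable sequence (list, tuple, ...).
    """
    try:
        lookup = set(container)  # fast membership tests when elements are hashable
    except TypeError:
        lookup = container       # unhashable elements: fall back to linear scans

    def found(x):
        try:
            return x in lookup
        except TypeError:
            # x itself is unhashable, so a set cannot be probed with it.
            return x in container

    # all() short-circuits, so `items` may be a large generator.
    return all(found(x) for x in items)


print(contains_all_items(['b', 'a', 'foo', 'bar'], ['a', 'b']))       # True
print(contains_all_items([['b'], 'a', 'foo', 'bar'], ('a', ['b'])))  # True
print(contains_all_items(['a'], ['a', 'x']))                          # False
```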
Another way to do it:
>>> set(['a','b']).issubset( ['b','a','foo','bar'] )
True
If you want to check that all of your inputs match:
>>> all(x in ['b', 'a', 'foo', 'bar'] for x in ['a', 'b'])
If you want to check for at least one match:
>>> any(x in ['b', 'a', 'foo', 'bar'] for x in ['a', 'b'])
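For the "at least one match" case there is also a set-based counterpart to issubset(): set.isdisjoint(), which accepts any iterable. A brief sketch with sample data, assuming the items are hashable:

```python
container = ['b', 'a', 'foo', 'bar']

# isdisjoint() is True when the two collections share no element,
# so its negation means "at least one match".
one_shared = not set(['a', 'x']).isdisjoint(container)
none_shared = not set(['x', 'y']).isdisjoint(container)

print(one_shared)   # True
print(none_shared)  # False
```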
I'm pretty sure in has higher precedence than the comma, so your statement is being interpreted as 'a', ('b' in ['b' ...]), which then evaluates to 'a', True since 'b' is in the list.
See previous answer for how to do what you want.
Python parses that statement as a tuple, where the first value is 'a' and the second value is the expression 'b' in ['b', 'a', 'foo', 'bar'] (which evaluates to True).
You can write a simple function to do what you want, though:
def all_in(candidates, sequence):
    # Return False as soon as one candidate is missing from sequence.
    for element in candidates:
        if element not in sequence:
            return False
    return True
And call it like:
>>> all_in(('a', 'b'), ['b', 'a', 'foo', 'bar'])
True
I would say we can even leave those square brackets out (in Python 3, the tuple then needs parentheses instead).
array = ['b', 'a', 'foo', 'bar']
all([i in array for i in ('a', 'b')])
[x for x in ['a','b'] if x in ['b', 'a', 'foo', 'bar']]
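The reason I think this is better than the chosen answer is that you really don't need to call the all() function. An empty list evaluates to False in if statements, and a non-empty list evaluates to True. Note, however, that the list is non-empty as soon as one item matches, so this tests for any match rather than for all of them.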
if [x for x in ['a','b'] if x in ['b', 'a', 'foo', 'bar']]:
    ...  # Do something
Example:
>>> [x for x in ['a','b'] if x in ['b', 'a', 'foo', 'bar']]
['a', 'b']
>>> [x for x in ['G','F'] if x in ['b', 'a', 'foo', 'bar']]
[]
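A mixed case is worth checking before relying on the truthiness of the filtered list. In this sketch (sample data and variable names chosen for the example), one item matches and one does not:

```python
container = ['b', 'a', 'foo', 'bar']
items = ['a', 'G']  # 'a' is present, 'G' is not

matches = [x for x in items if x in container]
print(matches)        # ['a']
print(bool(matches))  # True: behaves like any(), not all()

any_match = any(x in container for x in items)
all_match = all(x in container for x in items)
print(any_match)  # True
print(all_match)  # False

# To get an "all" test out of the filtered list, compare lengths.
all_via_filter = len(matches) == len(items)
print(all_via_filter)  # False
```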