How can I verify if one list is a subset of another?

People also ask

How do you check if an item in one list is in another list Python?

There are 2 ways to understand check if the list contains elements of another list. First, use all() functions to check if a Python list contains all the elements of another list. And second, use any() function to check if the list contains any elements of another one.

How do you check if a set is a subset in Python?

Python Set issubset() The issubset() method returns True if set A is the subset of B , i.e. if all the elements of set A are present in set B . Else, it returns False .

>>> a = [1, 3, 5]
>>> b = [1, 3, 5, 8]
>>> c = [3, 5, 9]
>>> set(a) <= set(b)
True
>>> set(c) <= set(b)
False

>>> a = ['yes', 'no', 'hmm']
>>> b = ['yes', 'no', 'hmm', 'well']
>>> c = ['sorry', 'no', 'hmm']
>>> 
>>> set(a) <= set(b)
True
>>> set(c) <= set(b)
False

Use set.issubset

Example:

a = {1,2}
b = {1,2,3}
a.issubset(b) # True

a = {1,2,4}
b = {1,2,3}
a.issubset(b) # False

The performant function Python provides for this is set.issubset. It does have a few restrictions that make it unclear if it's the answer to your question, however.

A list may contain items multiple times and has a specific order. A set does not. Additionally, sets only work on hashable objects.

Are you asking about subset or subsequence (which means you'll want a string search algorithm)? Will either of the lists be the same for many tests? What are the datatypes contained in the list? And for that matter, does it need to be a list?

Your other post intersect a dict and list made the types clearer and did get a recommendation to use dictionary key views for their set-like functionality. In that case it was known to work because dictionary keys behave like a set (so much so that before we had sets in Python we used dictionaries). One wonders how the issue got less specific in three hours.

one = [1, 2, 3]
two = [9, 8, 5, 3, 2, 1]

all(x in two for x in one)

Explanation: Generator creating booleans by looping through list one checking if that item is in list two. all() returns True if every item is truthy, else False.

There is also an advantage that all return False on the first instance of a missing element rather than having to process every item.

Assuming the items are hashable

>>> from collections import Counter
>>> not Counter([1, 2]) - Counter([1])
False
>>> not Counter([1, 2]) - Counter([1, 2])
True
>>> not Counter([1, 2, 2]) - Counter([1, 2])
False

If you don't care about duplicate items eg. [1, 2, 2] and [1, 2] then just use:

>>> set([1, 2, 2]).issubset([1, 2])
True

Is testing equality on the smaller list after an intersection the fastest way to do this?

.issubset will be the fastest way to do it. Checking the length before testing issubset will not improve speed because you still have O(N + M) items to iterate through and check.

One more solution would be to use a intersection.

one = [1, 2, 3]
two = [9, 8, 5, 3, 2, 1]

set(one).intersection(set(two)) == set(one)

The intersection of the sets would contain of set one

(OR)

one = [1, 2, 3]
two = [9, 8, 5, 3, 2, 1]

set(one) & (set(two)) == set(one)

Set theory is inappropriate for lists since duplicates will result in wrong answers using set theory.

For example:

a = [1, 3, 3, 3, 5]
b = [1, 3, 3, 4, 5]
set(b) > set(a)

has no meaning. Yes, it gives a false answer but this is not correct since set theory is just comparing: 1,3,5 versus 1,3,4,5. You must include all duplicates.

Instead you must count each occurrence of each item and do a greater than equal to check. This is not very expensive, because it is not using O(N^2) operations and does not require quick sort.

#!/usr/bin/env python

from collections import Counter

def containedInFirst(a, b):
  a_count = Counter(a)
  b_count = Counter(b)
  for key in b_count:
    if a_count.has_key(key) == False:
      return False
    if b_count[key] > a_count[key]:
      return False
  return True


a = [1, 3, 3, 3, 5]
b = [1, 3, 3, 4, 5]
print "b in a: ", containedInFirst(a, b)

a = [1, 3, 3, 3, 4, 4, 5]
b = [1, 3, 3, 4, 5]
print "b in a: ", containedInFirst(a, b)

Then running this you get:

$ python contained.py 
b in a:  False
b in a:  True

one = [1, 2, 3]
two = [9, 8, 5, 3, 2, 1]

set(x in two for x in one) == set([True])

If list1 is in list 2:

(x in two for x in one) generates a list of True.
when we do a set(x in two for x in one) has only one element (True).

Related questions
                            
                                PyLint message: logging-format-interpolation
                            
                                live output from subprocess command
                            
                                How to remove specific substrings from a set of strings in Python?
                            
                                How to get rid of "Unnamed: 0" column in a pandas DataFrame read in from CSV file?
                            
                                What's the function like sum() but for multiplication? product()?
                            
                                error: command 'gcc' failed with exit status 1 while installing eventlet
                            
                                Load data from txt with pandas
                            
                                How to loop backwards in python? [duplicate]
                            
                                Compute list difference
                            
                                TransactionManagementError "You can't execute queries until the end of the 'atomic' block" while using signals, but only during Unit Testing
                            
                                Using headers with the Python requests library's get method
                            
                                Initialising an array of fixed size in Python [duplicate]
                            
                                How to switch position of two items in a Python list?
                            
                                How do I create a slug in Django?
                            
                                How to increment datetime by custom months in python without using library [duplicate]
                            
                                Cannot install Lxml on Mac OS X 10.9
                            
                                Pandas create empty DataFrame with only column names
                            
                                How to create a custom string representation for a class object?
                            
                                TypeError: not all arguments converted during string formatting python
                            
                                Can I serve multiple clients using just Flask app.run() as standalone?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

How can I verify if one list is a subset of another?

Tags:

python

list

People also ask

Recent Activity

Donate For Us