Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python Counter Comparison as Bag-type

I need a bag/multiset-like data type in Python. I understand collections.Counter is often used for this purpose. But the comparison operators don't seem to work:

In [1]: from collections import Counter

In [2]: bag1 = Counter(a=1, b=2, c=3)

In [3]: bag2 = Counter(a=2, b=2)

In [4]: bag1 > bag2
Out[4]: True

This seems like a bug to me. I expected the less-than and greater-than operators to perform set-like subset and superset comparisons. But if that were the case then bag1 > bag2 would be false because bag2 contains an extra 'a'. There also don't seem to be subset/superset methods on Counter objects. So I have two questions:

  1. What comparison logic is used for Counter objects?
  2. How can I compare Counter objects for subset, superset, proper-subset, and proper-superset?
like image 707
William Reed Avatar asked Oct 20 '22 02:10

William Reed


2 Answers

On Python 2, the comparison falls back to the default sort order for dictionaries (Counter is a subclass of dict).

Mappings (dictionaries) compare equal if and only if their sorted (key, value) lists compare equal. [5] Outcomes other than equality are resolved consistently, but are not otherwise defined. [6]

On Python 3, the comparison raises a TypeError:

Mappings (dictionaries) compare equal if and only if they have the same (key, value) pairs. Order comparisons ('<', '<=', '>=', '>') raise TypeError.

like image 76
Steven Rumbalski Avatar answered Nov 03 '22 06:11

Steven Rumbalski


This unanswered question is of interest:

How can I compare Counter objects for subset, superset, proper-subset, and proper-superset?

By defining the missing “rich comparison methods". You could also use free functions instead, which will make client code more explicit.

from collections import Counter

class PartiallyOrderedCounter(Counter):

    def __le__(self, other):
        """ Multiset inclusion """
        return all( v <= other[k] for k,v in self.items() )


    def __lt__(self, other):
        """ Multiset strict inclusion """
        return self <= other and self != other


    # TODO : __ge__ and __gt__
    # Beware : they CANNOT be written in terms of __le__ or __lt__


a = PartiallyOrderedCounter('abc')
b = PartiallyOrderedCounter('ab')
c = PartiallyOrderedCounter('abe')

assert a <= a
assert not a < a    
assert b <= a
assert b < a
assert not a < b    
assert not c <= a
assert not a <= c
like image 29
YvesgereY Avatar answered Nov 03 '22 06:11

YvesgereY