custom comparison for built-in containers

Question

In my code there are numerous equality comparisons between various containers (list, dict, etc.). The keys and values of the containers are of types float, bool, int, and str. The built-in == and != worked perfectly fine.

I just learned that the floats used in the values of the containers must be compared using a custom comparison function. I've written that function already (let's call it approxEqual(), and assume that it takes two floats and returns True if they are judged to be equal and False otherwise).

I prefer that the changes to the existing code are kept to a minimum. (New classes/functions/etc can be as complicated as necessary.)

Example:

if dict1 != dict2:
  raise DataMismatch

The dict1 != dict2 condition needs to be rewritten so that any floats used in values of dict1 and dict2 are compared using the approxEqual function instead of __eq__.

The actual contents of the dictionaries come from various sources (parsing files, calculations, etc.).

Note: I asked a question earlier about how to override the built-in float's __eq__. That would have been an easy solution, but I learned that Python doesn't allow overriding built-in types' __eq__ operator. Hence this new question.

Alex Martelli · Accepted Answer

The only route to altering the way built-in containers check equality is to make them contain as values, instead of the "originals", wrapped values (wrapped in a class that overrides __eq__ and __ne__). This is if you need to alter the way the containers themselves use equality checking, e.g. for the purpose of the in operator where the right-hand side operand is a list -- as well as in the containers' own methods such as __eq__ (type(x).__eq__(x, y) is the typical way Python will perform internally what you code as x == y).
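To make the wrapping idea concrete, here is a minimal sketch. The class name ApproxFloat and its tolerances are assumptions made for illustration, and math.isclose (Python 3.5+) merely stands in for the question's approxEqual():

```python
import math


class ApproxFloat(object):
    """Hypothetical wrapper: holds a float and compares it approximately."""

    def __init__(self, value, rel_tol=1e-9, abs_tol=1e-12):
        self.value = float(value)
        self.rel_tol = rel_tol
        self.abs_tol = abs_tol

    def __eq__(self, other):
        # Unwrap the other operand if it is wrapped too.
        other_value = getattr(other, "value", other)
        # Only plain numbers are comparable; bools and other types are not.
        if isinstance(other_value, bool) or not isinstance(other_value, (int, float)):
            return NotImplemented
        return math.isclose(self.value, other_value,
                            rel_tol=self.rel_tol, abs_tol=self.abs_tol)

    def __ne__(self, other):
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    # Deliberately unhashable: fuzzy equality cannot back dict keys or set members.
    __hash__ = None

    def __repr__(self):
        return "ApproxFloat(%r)" % self.value


# The built-in containers now use the wrapper's __eq__ for ==, != and "in".
list1 = [ApproxFloat(0.1 + 0.2), ApproxFloat(1.0)]
list2 = [ApproxFloat(0.3), ApproxFloat(1.0)]
print(list1 == list2)
print(ApproxFloat(0.3) in list1)
print({"x": ApproxFloat(0.1 + 0.2)} == {"x": ApproxFloat(0.3)})
```

The price is that every float has to be wrapped at the point where it enters a container, which may well be more invasive than the alternatives.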

If what you're talking about is performing your own equality checks (without altering the checks performed internally by the containers themselves), then the only way is to change every cont1 == cont2 into (e.g.) same(cont1, cont2, value_same) where value_same is a function accepting two values and returning True or False like == would. That's probably too invasive WRT the criterion you specify.

If you can change the containers themselves (i.e., the number of places where container objects are created is much smaller than the number of places where two containers are checked for equality), then using a container subclass which overrides __eq__ is best.

E.g.:

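class EqMixin(object):
  def __eq__(self, other):
    # compare this container with the other one, value by value
    return same(self, other, value_same)
  def __ne__(self, other):
    return not self.__eq__(other)

(with same being the function I mentioned in this answer's second paragraph) and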

class EqM_list(EqMixin, list): pass

(and so forth for other container types you need), then wherever you have (e.g.)

x = list(someiter)

change it into

x = EqM_list(someiter)

and be sure to also catch other ways to create list objects, e.g. replace

x = [bah*2 for bah in buh]

with

x = EqM_list(bah*2 for bah in buh)

and

x = d.keys()

with

x = EqM_list(d.keys())

and so forth.
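Putting the pieces together, a self-contained sketch might look like the following. The bodies of same and value_same are assumptions (the answer only names them), approx_equal stands in for the question's approxEqual(), and this version only handles flat lists and dicts:

```python
import math


def approx_equal(x, y):
    # Stand-in for the question's approxEqual(); the tolerances are assumptions.
    return math.isclose(x, y, rel_tol=1e-9, abs_tol=1e-12)


def value_same(x, y):
    # Floats are compared approximately; everything else falls back to ==.
    if isinstance(x, float) and isinstance(y, float):
        return approx_equal(x, y)
    return x == y


def same(cont1, cont2, samevalue):
    # Flat comparison of two dicts or two lists; anything else is unequal.
    if isinstance(cont1, dict) and isinstance(cont2, dict):
        if set(cont1) != set(cont2):
            return False
        return all(samevalue(cont1[k], cont2[k]) for k in cont1)
    if isinstance(cont1, list) and isinstance(cont2, list):
        if len(cont1) != len(cont2):
            return False
        return all(samevalue(x, y) for x, y in zip(cont1, cont2))
    return False


class EqMixin(object):
    def __eq__(self, other):
        return same(self, other, value_same)

    def __ne__(self, other):
        return not self.__eq__(other)


class EqM_list(EqMixin, list):
    pass


class EqM_dict(EqMixin, dict):
    pass


# Existing comparisons stay as they are; only the construction sites change.
x = EqM_list([0.1 + 0.2, "a", 3])
y = EqM_list([0.3, "a", 3])
print(x == y)
print(EqM_dict(a=0.1 + 0.2, b=True) != EqM_dict(a=0.4, b=True))
```

Because value_same falls back to == for non-floats, nested values that are themselves EqM_list or EqM_dict instances are compared approximately as well, while nested plain lists or dicts are not.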

Yeah, I know, what a bother -- but it's a core principle (and practice;-) of Python that builtin types (be they containers, or value types like e.g. float) themselves cannot be changed. That's a very different philosophy from e.g. Ruby's and Javascript's (and I personally prefer it but I do see how it can seem limiting at times!).

Edit: the OP's specific request seems to be (in terms of this answer) "how do I implement same" for the various container types, not how to apply it without changing the == into a function call. If that's correct, then (e.g.) without using iterators for simplicity:

def samelist(a, b, samevalue):
    if len(a) != len(b): return False
    return all(samevalue(x, y) for x, y in zip(a, b))

def samedict(a, b, samevalue):
    if set(a) != set(b): return False
    return all(samevalue(a[x], b[x]) for x in a)
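If the values can themselves be lists or dicts (an assumption; the question doesn't say), the two helpers can be folded into one recursive function. This is only a sketch: approx_equal stands in for the question's approxEqual(), and DataMismatch is defined here just so the example runs:

```python
import math


def approx_equal(x, y):
    # Stand-in for the question's approxEqual(); the tolerances are assumptions.
    return math.isclose(x, y, rel_tol=1e-9, abs_tol=1e-12)


def same(a, b, samefloat=approx_equal):
    # Dicts: identical key sets, then compare the values recursively.
    if isinstance(a, dict) and isinstance(b, dict):
        if set(a) != set(b):
            return False
        return all(same(a[k], b[k], samefloat) for k in a)
    # Lists and tuples: same type, same length, items compared recursively.
    if isinstance(a, (list, tuple)) and type(a) == type(b):
        if len(a) != len(b):
            return False
        return all(same(x, y, samefloat) for x, y in zip(a, b))
    # Leaf floats use the fuzzy check; everything else uses plain ==.
    if isinstance(a, float) and isinstance(b, float):
        return samefloat(a, b)
    return a == b


class DataMismatch(Exception):
    pass


dict1 = {"name": "run1", "ok": True, "values": [0.1 + 0.2, 1.5], "meta": {"scale": 1e-3}}
dict2 = {"name": "run1", "ok": True, "values": [0.3, 1.5], "meta": {"scale": 0.001}}

# The call site from the question, rewritten to use same().
if not same(dict1, dict2):
    raise DataMismatch
```

Keys are still compared exactly (through set equality); only the values go through the fuzzy check.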

Note that this applies to values, as requested, NOT to keys. "Fuzzying up" the equality comparison of a dict's keys (or a set's members) is a REAL problem. Look at it this way: first, how do you guarantee with absolute certainty that samevalue(a, b) and samevalue(b, c) totally implies and ensures samevalue(a, c)? This transitivity condition does not apply to most semi-sensible "fuzzy comparisons" I've ever seen, and yet it's completely indispensable for the hash-table based containers (such as dicts and sets). If you pass that hurdle, then the nightmare of making the hash values somehow "magically" consistent arises -- and what if two actually different keys in one dict "map to" equality in this sense with the same key in the other dict, which of the two corresponding values should be used then...? This way madness lies, if you ask me, so I hope that when you say values you do mean, exactly, values, and not keys!-)
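A tiny illustration of the transitivity problem, using a hypothetical absolute-tolerance check (the tolerance and the three numbers are chosen purely for the example):

```python
def approx_equal(x, y, tol=1.0):
    # Hypothetical fuzzy comparison: equal if within an absolute tolerance.
    return abs(x - y) <= tol


a, b, c = 0.0, 0.75, 1.5

print(approx_equal(a, b))  # a and b are within the tolerance
print(approx_equal(b, c))  # b and c are within the tolerance
print(approx_equal(a, c))  # a and c are not, so the relation is not transitive

# A dict or set would also need hash(a) == hash(b) whenever a "equals" b,
# which the ordinary float hashes of these distinct values do not provide.
print(hash(a) == hash(b))
```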
