
Python: Is this an ok way of overriding __eq__ and __hash__?

I'm new to Python, and I wanted to make sure that I overrode __eq__ and __hash__ correctly, so as not to cause painful errors later:

(I'm using Google App Engine.)

    import pickle

    from google.appengine.ext import db


    class Course(db.Model):
        dept_code = db.StringProperty()
        number = db.IntegerProperty()
        title = db.StringProperty()
        raw_pre_reqs = db.StringProperty(multiline=True)
        original_description = db.StringProperty()

        def getPreReqs(self):
            return pickle.loads(str(self.raw_pre_reqs))

        def __repr__(self):
            title_msg = self.title if self.title else "Untitled"
            return "%s %s: %s" % (self.dept_code, self.number, title_msg)

        def __attrs(self):
            # Single source of truth for both equality and hashing.
            return (self.dept_code, self.number, self.title,
                    self.raw_pre_reqs, self.original_description)

        def __eq__(self, other):
            return isinstance(other, Course) and self.__attrs() == other.__attrs()

        def __hash__(self):
            return hash(self.__attrs())

A slightly more complicated type:

    class DependencyArcTail(db.Model):
        ''' A list of courses that is a pre-req for something else '''
        courses = db.ListProperty(db.Key)

        ''' a list of heads that reference this one '''
        forwardLinks = db.ListProperty(db.Key)

        def __repr__(self):
            return "DepArcTail %d: courses='%s' forwardLinks='%s'" % (
                id(self), getReprOfKeys(self.courses), getIdOfKeys(self.forwardLinks))

        def __eq__(self, other):
            if not isinstance(other, DependencyArcTail):
                return False

            for this_course in self.courses:
                if not (this_course in other.courses):
                    return False

            for other_course in other.courses:
                if not (other_course in self.courses):
                    return False

            return True

        def __hash__(self):
            return hash((tuple(self.courses), tuple(self.forwardLinks)))

Everything look good?

Updated to reflect @Alex's comments

    class DependencyArcTail(db.Model):
        ''' A list of courses that is a pre-req for something else '''
        courses = db.ListProperty(db.Key)

        ''' a list of heads that reference this one '''
        forwardLinks = db.ListProperty(db.Key)

        def __repr__(self):
            return "DepArcTail %d: courses='%s' forwardLinks='%s'" % (
                id(self), getReprOfKeys(self.courses), getIdOfKeys(self.forwardLinks))

        def __eq__(self, other):
            return (isinstance(other, DependencyArcTail)
                    and set(self.courses) == set(other.courses)
                    and set(self.forwardLinks) == set(other.forwardLinks))

        def __hash__(self):
            return hash((tuple(self.courses), tuple(self.forwardLinks)))
Nick Heiner asked Jun 19 '10 19:06




1 Answer

The first one is fine. The second one is problematic for two reasons:

  1. there might be duplicates in .courses
  2. two entities with identical .courses but different .forwardLinks would compare equal but have different hashes
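
To make the second problem concrete, here is a minimal sketch that runs without App Engine. `Tail` and its constructor are hypothetical stand-ins for `DependencyArcTail`, and the string values replace real `db.Key` objects; the `__eq__` is a simplified version of the question's loop-based check, which also ignores forward links.

```python
class Tail(object):
    """Hypothetical plain-Python stand-in for DependencyArcTail."""

    def __init__(self, courses, forward_links):
        self.courses = list(courses)
        self.forward_links = list(forward_links)

    def __eq__(self, other):
        # Like the question's first version: only courses are compared.
        return isinstance(other, Tail) and set(self.courses) == set(other.courses)

    def __ne__(self, other):
        return not self == other

    def __hash__(self):
        # Like the question's first version: forward links affect the hash.
        return hash((tuple(self.courses), tuple(self.forward_links)))


a = Tail(["CS101", "CS102"], ["head-1"])
b = Tail(["CS101", "CS102"], ["head-2"])

equal = (a == b)                   # equality ignores forward links
same_hash = (hash(a) == hash(b))   # in practice the hashes differ
found_in_set = b in {a}            # lookup is by hash first, then by ==
```

Because `a == b` holds while the hashes almost certainly differ, `b in {a}` will normally come out false even though the two objects claim to be equal. That is the kind of hard-to-trace bug the eq/hash contract exists to prevent.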

I would fix the second one by making equality depend on both courses and forward links, with both converted to sets (so duplicates and ordering no longer matter), and by hashing the same way. I.e.:

    def __eq__(self, other):
        if not isinstance(other, DependencyArcTail):
            return False

        return (set(self.courses) == set(other.courses) and
                set(self.forwardLinks) == set(other.forwardLinks))

    def __hash__(self):
        return hash((frozenset(self.courses), frozenset(self.forwardLinks)))

This of course is assuming that the forward links are crucial to an object's "real value", otherwise they should be omitted from both __eq__ and __hash__.
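
As a rough check of the fixed version, the sketch below applies the same idea to a plain class. `Tail`, `_key`, and the string values are assumptions made for illustration (the real model uses `db.ListProperty(db.Key)`); building one key and using it for both methods is just one way to keep `__eq__` and `__hash__` in step.

```python
class Tail(object):
    """Hypothetical plain-Python stand-in for DependencyArcTail."""

    def __init__(self, courses, forward_links):
        self.courses = list(courses)
        self.forward_links = list(forward_links)

    def _key(self):
        # One key drives both equality and hashing, so they cannot disagree.
        return (frozenset(self.courses), frozenset(self.forward_links))

    def __eq__(self, other):
        return isinstance(other, Tail) and self._key() == other._key()

    def __ne__(self, other):
        return not self == other

    def __hash__(self):
        return hash(self._key())


a = Tail(["CS101", "CS102", "CS101"], ["head-1"])  # duplicate, one order
b = Tail(["CS102", "CS101"], ["head-1"])           # no duplicate, other order
c = Tail(["CS102", "CS101"], ["head-2"])           # different forward links

distinct = {a, b, c}  # a and b collapse into one entry; c stays separate
```

Here `a` and `b` compare equal and hash alike despite the duplicate and the different ordering, while `c` remains distinct because its forward links differ. If forward links are not part of the object's "real value", dropping them from `_key` would remove them from both methods at once.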

Edit: removed from __hash__ calls to tuple which were at best redundant (and possibly damaging, as suggested by a comment by @Mark [[tx!!!]]); changed set to frozenset in the hashing, as suggested by a comment by @Phillips [[tx!!!]].
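
The two changes described in that edit can be illustrated with built-in types alone. This is a small sketch with made-up course names, not code from the original post: a tuple keeps order and duplicates, so it can hash differently for lists that the set-based `__eq__` treats as equal, and a plain `set` cannot be hashed at all.

```python
courses_a = ["CS101", "CS102"]
courses_b = ["CS102", "CS101", "CS101"]  # same members, other order, one duplicate

# The set-based __eq__ treats these two lists as equal...
sets_equal = set(courses_a) == set(courses_b)

# ...but tuples keep order and duplicates, so they are not equal
# and there is no reason for their hashes to match.
tuples_equal = tuple(courses_a) == tuple(courses_b)

# frozenset ignores order and duplicates, so equal sets hash alike.
frozenset_hashes_match = (
    hash(frozenset(courses_a)) == hash(frozenset(courses_b))
)

# A mutable set is unhashable, which is why frozenset is needed here.
try:
    hash(set(courses_a))
    set_is_hashable = True
except TypeError:
    set_is_hashable = False
```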

Alex Martelli answered Sep 20 '22 13:09
