Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Does NSSet use hash to define uniqueness?

I've been working under the assumption that NSSet used hash to look up potential matches, and then called isEqual on each of those to check for real collisions, but I realized that I can't find any evidence to back this up.

The reason I bring it up is the existence of the "member:" method in NSSet. Why does the documentation for member: go out of its way to specify that isEqual: is used to find your object when nothing else in NSSet does? Does containsObject: only use the hash or something?

Can anyone confirm this behavior? And ideally, reference documentation on it?

like image 241
DougW Avatar asked Mar 02 '11 01:03

DougW


2 Answers

I would suggest reading Collections Programming Topics, specifically the 'Sets: Unordered Collections of Objects' section. In there you will find the following information:

This performance information assumes adequate implementations for the hash method defined for the objects. With a bad hash function, access and edits take linear time.

and

The objects in a set must respond to the NSObject protocol methods hash and isEqual: (see NSObject for more information). If mutable objects are stored in a set, either the hash method of the objects shouldn’t depend on the internal state of the mutable objects or the mutable objects shouldn’t be modified while they’re in the set. For example, a mutable dictionary can be put in a set, but you must not change it while it is in there. (Note that it can be difficult to know whether or not a given object is in a collection).

So, yes, hash and isEqual are used as you had assumed.

like image 84
ericg Avatar answered Nov 08 '22 21:11

ericg


I'd like to step in and give further information about NSSet's use of hash and isEqual:, since I have encountered weird bugs recently and found that they were due to me overlooking hash.

When you store custom objects in a set, NSSet uses the value returned by your hash method to group objects in distinct bins, in which they're compared to one another using their respective isEqual: method. So basically, hash will always be called on your object when inserting, building, testing membership and so on, and if this object falls into a bin where there are other objects, its isEqual method will be used to distinguish it from them.

That's why hash must always be the same for objects considered equal, and as much as possible, yield values that will spread objects evenly. The latter property ensures the bins are as small as possible, minimizing calls to isEqual:.

like image 15
matehat Avatar answered Nov 08 '22 20:11

matehat