<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0"><channel><title>mikeash.com pyblog/friday-qa-2010-06-18-implementing-equality-and-hashing.html comments</title><link>http://www.mikeash.com/?page=pyblog/friday-qa-2010-06-18-implementing-equality-and-hashing.html#comments</link><description>mikeash.com Recent Comments</description><lastBuildDate>Sun, 12 Apr 2026 02:07:22 GMT</lastBuildDate><generator>PyRSS2Gen-1.0.0</generator><docs>http://blogs.law.harvard.edu/tech/rss</docs><item><title>Ilya Kulakov - 2019-05-24 20:46:03</title><link>http://www.mikeash.com/?page=pyblog/friday-qa-2010-06-18-implementing-equality-and-hashing.html#comments</link><description>Sandeep,
&lt;br /&gt;
&lt;br /&gt;In the root class (or as an extension of NSObject) you could add the following method:
&lt;br /&gt;
&lt;br /&gt;&lt;code&gt;
&lt;br /&gt;- (BOOL)isMostSpecializedEqual:(NSObject *)anObject
&lt;br /&gt;{
&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;if (anObject == self)
&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;return YES;
&lt;br /&gt;
&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;NSObject *parent = nil;
&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;NSObject *child = nil;
&lt;br /&gt;
&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;if ([self isKindOfClass:anObject.class])
&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;{
&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;parent = anObject;
&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;child = self;
&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;}
&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;else if ([anObject isKindOfClass:self.class])
&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;{
&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;parent = self;
&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;child = anObject;
&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;}
&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;else
&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;return NO;
&lt;br /&gt;
&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;return [child isEqual:parent];
&lt;br /&gt;}
&lt;br /&gt;&lt;/code&gt;
&lt;br /&gt;
&lt;br /&gt;then implement -isEqual:
&lt;br /&gt;
&lt;br /&gt;&lt;code&gt;
&lt;br /&gt;- (BOOL)isEqual:(id)anObject
&lt;br /&gt;{
&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;if (anObject == self)
&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;return YES;
&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;else if ([self isKindOfClass:anObject.class])
&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;// perform memberwise-equality checks
&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;else
&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;return [self SR_isMostSpecializedEqual:anObject]
&lt;br /&gt;}
&lt;br /&gt;&lt;/code&gt;
&lt;br /&gt;
&lt;br /&gt;This way of isEqual: of the most specialized class in the hierarchy will be consistently used.
&lt;br /&gt;So if at some point one of the subclasses will introduce a state change (e.g. additional property)
&lt;br /&gt;it's own isEqual: will consistently judge.
&lt;br /&gt;</description><guid isPermaLink="true">73167a48fee0f28c31fb2dc1f7f43c44</guid><pubDate>Fri, 24 May 2019 20:46:03 GMT</pubDate></item><item><title>Sandeep Aggarwal - 2016-08-25 11:51:12</title><link>http://www.mikeash.com/?page=pyblog/friday-qa-2010-06-18-implementing-equality-and-hashing.html#comments</link><description>In the paragraph &lt;b&gt;A Note on Subclassing&lt;/b&gt; you wrote at the end &lt;i&gt;Rather than working around it with weird implementations of isEqual:, consider redesigning your class hierarchy.&lt;/i&gt;
&lt;br /&gt;
&lt;br /&gt;What if, I want to keep the similar hierarchy as in this case which is not bad at all  then how to deal with this transitivity problem? </description><guid isPermaLink="true">4b56951a003630739e21f4403d490bef</guid><pubDate>Thu, 25 Aug 2016 11:51:12 GMT</pubDate></item><item><title>Vince - 2014-04-27 14:00:57</title><link>http://www.mikeash.com/?page=pyblog/friday-qa-2010-06-18-implementing-equality-and-hashing.html#comments</link><description>For several years I've had my doubts about how others have explained or justified non-transitivity of equality with respect to sub-classes.  The first paragraph of "A note on subclassing" has finally put that to rest.  Thanks.</description><guid isPermaLink="true">759c7f151f9e3e838d58f1e17e99670b</guid><pubDate>Sun, 27 Apr 2014 14:00:57 GMT</pubDate></item><item><title>Chris - 2013-01-11 07:23:56</title><link>http://www.mikeash.com/?page=pyblog/friday-qa-2010-06-18-implementing-equality-and-hashing.html#comments</link><description>In the case one of the properties is allowed to be nil, this method is incomplete: when you have two objects of class A with a property p that is nil on both, [x.p isEqual:y.p] will return false. I also wrote about this here: &lt;a href="http://chris.eidhof.nl/post/38065014421/implementing-value-objects-in-objective-c"&gt;http://chris.eidhof.nl/post/38065014421/implementing-value-objects-in-objective-c&lt;/a&gt;</description><guid isPermaLink="true">dc6bce99ee01fa07239a4fb79a042ef4</guid><pubDate>Fri, 11 Jan 2013 07:23:56 GMT</pubDate></item><item><title>mikeash - 2010-06-21 16:32:02</title><link>http://www.mikeash.com/?page=pyblog/friday-qa-2010-06-18-implementing-equality-and-hashing.html#comments</link><description>Note that it is impossible to have a hash table with 2^64 distinct values, because you'd run out of address space long beforehand. It's not even possible in theory, as long as your hash value integer is the same size as your pointers.
&lt;br /&gt;
&lt;br /&gt;The problem is not when you &lt;i&gt;actually&lt;/i&gt; have more than 2^64 live values, but when you have more than 2^64 &lt;i&gt;possible&lt;/i&gt; values. Since it's impractical to alter your hash function in the middle of running your program, it's possible to find two objects with the same hash and have to deal with the collision, even when you have far fewer than 2^64 live values.</description><guid isPermaLink="true">1cdbc1f91caed1f98076d0c757c77d08</guid><pubDate>Mon, 21 Jun 2010 16:32:02 GMT</pubDate></item><item><title>Derek - 2010-06-21 14:24:58</title><link>http://www.mikeash.com/?page=pyblog/friday-qa-2010-06-18-implementing-equality-and-hashing.html#comments</link><description>&lt;b&gt;Trevor:&lt;/b&gt;  The hash function only returns a NSUInteger, that means there are at max 64 bits of unique numbers there.  A NSString can have more possible combinations than that (think of a hash table of 100 character strings).  There MUST be some collisions if you have more than 2^64 distinct values stored in a hash table.</description><guid isPermaLink="true">14e25898c3c3b676186025183131e44f</guid><pubDate>Mon, 21 Jun 2010 14:24:58 GMT</pubDate></item><item><title>mikeash - 2010-06-20 14:38:12</title><link>http://www.mikeash.com/?page=pyblog/friday-qa-2010-06-18-implementing-equality-and-hashing.html#comments</link><description>Actually it's "completely" that applies here. You can avoid hash collisions for special cases, but you can't avoid them in all cases all of the time. Your hash table either has to be very special-purpose (as in the case of using a perfect hash because you know all of the keys in advance) or it needs to be able to handle collisions.</description><guid isPermaLink="true">106ce9fd29d4d800d4aa089d4afa5b47</guid><pubDate>Sun, 20 Jun 2010 14:38:12 GMT</pubDate></item><item><title>Trevor - 2010-06-20 05:08:10</title><link>http://www.mikeash.com/?page=pyblog/friday-qa-2010-06-18-implementing-equality-and-hashing.html#comments</link><description>&lt;div class="blogcommentquote"&gt;&lt;div class="blogcommentquoteinner"&gt;it's provably impossible to avoid it completely&lt;/div&gt;&lt;/div&gt;
&lt;br /&gt;
&lt;br /&gt;Not sure what you mean by "impossible". If the keys are known in advance, a perfect hash avoids all collisions:
&lt;br /&gt;
&lt;br /&gt;&lt;a href="http://en.wikipedia.org/wiki/Perfect_hash_function"&gt;http://en.wikipedia.org/wiki/Perfect_hash_function&lt;/a&gt;
&lt;br /&gt;</description><guid isPermaLink="true">1c7c961aef5aa86c5abc955e5a5b4741</guid><pubDate>Sun, 20 Jun 2010 05:08:10 GMT</pubDate></item><item><title>charles - 2010-06-19 18:45:30</title><link>http://www.mikeash.com/?page=pyblog/friday-qa-2010-06-18-implementing-equality-and-hashing.html#comments</link><description>Great point on NSSet: using mutable objects in an NSSet is probably the most likely thing that one could do without realizing (at least for NSDictionary, the key is copied, so even mutable objects will be made immutable). At least I had never thought of that issue.</description><guid isPermaLink="true">8ac4199df889f3795d6f525ef3ab6905</guid><pubDate>Sat, 19 Jun 2010 18:45:30 GMT</pubDate></item><item><title>Erik - 2010-06-19 18:16:59</title><link>http://www.mikeash.com/?page=pyblog/friday-qa-2010-06-18-implementing-equality-and-hashing.html#comments</link><description>You can use mutable objects in an NSSet as long as you implement the hash method such that the value it returns is not based on the mutable properties. An example of this would be if you had mutable objects that have an immutable primary key such as a GUID or sone other identifier. You can even just use the pointer value if instance identity is sufficient. Obviously this is something that you have to decide whether or not is appropriate for your your design; it would allow you to insert another "equivalent" object into the set which may or may not be "correct".</description><guid isPermaLink="true">677293383ce5a3607e701cea59966bba</guid><pubDate>Sat, 19 Jun 2010 18:16:59 GMT</pubDate></item><item><title>mikeash - 2010-06-19 17:58:04</title><link>http://www.mikeash.com/?page=pyblog/friday-qa-2010-06-18-implementing-equality-and-hashing.html#comments</link><description>For holding mutable, mutated objects, use an NSArray. Performance will suffer, but that's probably true of any collection holding changing objects.</description><guid isPermaLink="true">903ee22d1df3c6345832599cda88e63b</guid><pubDate>Sat, 19 Jun 2010 17:58:04 GMT</pubDate></item><item><title>Jeff Johnson - 2010-06-19 17:25:40</title><link>http://www.mikeash.com/?page=pyblog/friday-qa-2010-06-18-implementing-equality-and-hashing.html#comments</link><description>Mike, what about mutable objects in an NSSet?</description><guid isPermaLink="true">fc96115ba8443ed4ef7c55e6e6198481</guid><pubDate>Sat, 19 Jun 2010 17:25:40 GMT</pubDate></item><item><title>mikeash - 2010-06-19 15:00:57</title><link>http://www.mikeash.com/?page=pyblog/friday-qa-2010-06-18-implementing-equality-and-hashing.html#comments</link><description>&lt;b&gt;Frederik:&lt;/b&gt; That is probably a good technique. Overall it can be hard to measure the effectiveness of this stuff. I tend to go with a good balance between ease of coding and whatever looks like it will do a good job of mixing up the bits.
&lt;br /&gt;
&lt;br /&gt;&lt;b&gt;André Pang:&lt;/b&gt; The part that says "or you must make sure the object's internal state information does not change while the object is in the collection" is key. It's usually much easier, and much better, to simply not mutate your mutable keys while they're being used as keys, than to try to figure out how to make their hash not depend on their mutable value. Mutating a key while it's in a collection is probably another code smell.
&lt;br /&gt;
&lt;br /&gt;Regarding Unicode, looks like I broke it when I added comment previews. Should be fixed now.</description><guid isPermaLink="true">71278766d35029732573ef6fb4374517</guid><pubDate>Sat, 19 Jun 2010 15:00:57 GMT</pubDate></item><item><title>Erik Price - 2010-06-19 13:29:01</title><link>http://www.mikeash.com/?page=pyblog/friday-qa-2010-06-18-implementing-equality-and-hashing.html#comments</link><description>Andre, yes - using mutable objects as dictionary keys is the path to madness (if the object can mutate while active as a dict key). That's one reason why people commonly use NSStrings or some other key that reasonably uniquely identifies an object as dictionary keys. </description><guid isPermaLink="true">7433fa9038591915dabb27fe0b27fe01</guid><pubDate>Sat, 19 Jun 2010 13:29:01 GMT</pubDate></item><item><title>Andre Pang - 2010-06-19 08:38:51</title><link>http://www.mikeash.com/?page=pyblog/friday-qa-2010-06-18-implementing-equality-and-hashing.html#comments</link><description>
&lt;br /&gt;Good article as always Mike, though you didn't answer one question that's always bugged me.  The Cocoa docs on -[NSObject hash] says this:
&lt;br /&gt;
&lt;br /&gt;&lt;a href="http://developer.apple.com/mac/library/documentation/cocoa/reference/foundation/Protocols/NSObject_Protocol/Reference/NSObject.html#//apple_ref/occ/intfm/NSObject/hash"&gt;http://developer.apple.com/mac/library/documentation/cocoa/reference/foundation/Protocols/NSObject_Protocol/Reference/NSObject.html#//apple_ref/occ/intfm/NSObject/hash&lt;/a&gt;
&lt;br /&gt;
&lt;br /&gt;"If a mutable object is added to a collection that uses hash values to determine the object's position in the collection, the value returned by the hash method of the object must not change while the object is in the collection. Therefore, either the hash method must not rely on any of the object's internal state information or you must make sure the object's internal state information does not change while the object is in the collection. Thus, for example, a mutable dictionary can be put in a hash table but you must not change it while it is in there. (Note that it can be difficult to know whether or not a given object is in a collection.)"
&lt;br /&gt;
&lt;br /&gt;Well, umm.  If the hash method cannot rely on any of the object's internal state that _may_ change throughout it's lifetime, this means that writing -hash for mutable objects becomes far more intricate.  (I know why the restriction's there; presumably it's one of the reasons why NSDictionary copies its keys rather than simply retaining them.  Dunno about NSMapTable though.)
&lt;br /&gt;
&lt;br /&gt;I took a very braindead approach for RMModelObject's -hash method (sum the pointers for all of the object's ivars) because I felt it was the only safe way to keep within the Cocoa spec that'd still generate somewhat useful hash values.
&lt;br /&gt;
&lt;br /&gt;Thoughts?
&lt;br /&gt;
&lt;br /&gt;P.S. Your commenting doesn't support unicode?  It doesn't like the e-acute in my name :).
&lt;br /&gt;</description><guid isPermaLink="true">fc82a99d642917ede638b5a512ee78b3</guid><pubDate>Sat, 19 Jun 2010 08:38:51 GMT</pubDate></item><item><title>Frederik - 2010-06-19 06:50:11</title><link>http://www.mikeash.com/?page=pyblog/friday-qa-2010-06-18-implementing-equality-and-hashing.html#comments</link><description>Thanks for choosing a great topic, and a great article! I followed the advice at &lt;a href="http://stackoverflow.com/questions/254281/"&gt;http://stackoverflow.com/questions/254281/&lt;/a&gt; to implement my hash methods, using a prime number (31) to multiply each component of the hash before addition. What are your thoughts on that?
&lt;br /&gt;
&lt;br /&gt;Also, adding boolean variables to a hash is needed quite often. The article suggests &lt;code&gt;(var) ? 1231 : 1237&lt;/code&gt; for those.</description><guid isPermaLink="true">fef1c5e5e3abfe570cfaf27837213345</guid><pubDate>Sat, 19 Jun 2010 06:50:11 GMT</pubDate></item><item><title>Adam Maxwell - 2010-06-19 05:02:35</title><link>http://www.mikeash.com/?page=pyblog/friday-qa-2010-06-18-implementing-equality-and-hashing.html#comments</link><description>Nice article, and one that I wish I'd had a few years ago.  Another useful article is at &lt;a href="http://www.mulle-kybernetik.com/artikel/Optimization/opti-7.html"&gt;http://www.mulle-kybernetik.com/artikel/Optimization/opti-7.html&lt;/a&gt; (although slightly dated).  I implemented -hash and -isEqual: improperly once upon a time, and it led to random crashes under very hard-to-reproduce circumstances.  These are definitely worth getting right!  
&lt;br /&gt;
&lt;br /&gt;Prior to Snow Leopard, CFURL was also terrible in a hashing collection because it would end up copying and manipulating strings each time the hash function was called.  It looks like it's computed once and stored inline, now, which should be much better for performance.</description><guid isPermaLink="true">7af1c3a2c795986d002b1136b4549a72</guid><pubDate>Sat, 19 Jun 2010 05:02:35 GMT</pubDate></item><item><title>mikeash - 2010-06-19 03:37:14</title><link>http://www.mikeash.com/?page=pyblog/friday-qa-2010-06-18-implementing-equality-and-hashing.html#comments</link><description>&lt;b&gt;Ahruman:&lt;/b&gt; A fine point. And if you need a better hashing algorithm but don't feel like overriding what Cocoa does universally, you may benefit from custom callbacks in an &lt;code&gt;NSMapTable&lt;/code&gt; or &lt;code&gt;NSHashTable&lt;/code&gt; as detailed in my previous post:
&lt;br /&gt;
&lt;br /&gt;&lt;a href="http://www.mikeash.com/pyblog/friday-qa-2010-05-28-leopard-collection-classes.html"&gt;http://www.mikeash.com/pyblog/friday-qa-2010-05-28-leopard-collection-classes.html&lt;/a&gt;
&lt;br /&gt;
&lt;br /&gt;&lt;b&gt;Anonymous:&lt;/b&gt; In general, my trust of a system is inversely proportional to my knowledge of a system. Some key exceptions are GCD and libc, but even those aren't as solid as I'd like.</description><guid isPermaLink="true">9c282174ade1722891ed40f52aae0689</guid><pubDate>Sat, 19 Jun 2010 03:37:14 GMT</pubDate></item><item><title>Anonymous - 2010-06-18 16:05:25</title><link>http://www.mikeash.com/?page=pyblog/friday-qa-2010-06-18-implementing-equality-and-hashing.html#comments</link><description>Ahruman: this is some fascinating info, and this combined with a growing collection of other observations about Cocoa on Mike's blog is really starting to erode my confidence in the robustness and stability of the Cocoa frameworks... :/</description><guid isPermaLink="true">2df37be1141a1313845289eeee2815dc</guid><pubDate>Fri, 18 Jun 2010 16:05:25 GMT</pubDate></item><item><title>Ahruman - 2010-06-18 15:52:29</title><link>http://www.mikeash.com/?page=pyblog/friday-qa-2010-06-18-implementing-equality-and-hashing.html#comments</link><description>A related pitfall: it's easy to assume that the basic Foundation types will have good, general -hash implementations, but in fact they don't. For the collection types, -hash is equal to -count.
&lt;br /&gt;
&lt;br /&gt;There are good reasons for this, but it's a potentially major pitfall. Concretely, in a recent case that involved uniquing several million short arrays, I reduced runtime from over twelve hours to eleven seconds by noticing this and implementing custom hashing and comparison.
&lt;br /&gt;
&lt;br /&gt;Another fascinatin' anecdote: prior to Snow Leopard, the -hash value of boolean NSNumbers did not match that of integer NSNumbers with value 0 and 1, but -isEqual: claimed they were equal.</description><guid isPermaLink="true">586d512259e880ce66f2bb9b4a12f0db</guid><pubDate>Fri, 18 Jun 2010 15:52:29 GMT</pubDate></item></channel></rss>
