Based on various bad experiences, my rule of thumb as a Java programmer is to only implement equals() and hashCode() on immutable objects, where two instances of the object really are interchangeable.
Basically I want to avoid situations like the HashMap key problem in that link, or like the following:
And by and large, over the course of my Java career I haven't found much use for equals() except for (1) value objects and (2) putting things into collections. I've also found that immutability plus copy-and-modify constructors/builders is generally a much happier world than setters. Two objects might have the same ID and might represent the same logical entity, but if they have different data -- if they represent snapshots of the conceptual entity at different times -- then they're not equal().
Anyway, I'm now in a Hibernate shop, and my more Hibernate-savvy colleagues are telling me this approach isn't going to work. Specifically, the claim seems to be that in a scenario that ends up with two instances, h1 and h4, representing the same thing -- unless h1.equals(h4) (or perhaps h4.equals(h1); I'm not clear, but I would hope it's symmetric anyway, so whatever) -- Hibernate will not be able to tell that these are the same thing, and Bad Things Will Happen.
So, what I want to know:
What does Hibernate actually need equals() for?
If Hibernate needs h1 and h4 to be equal, how does it (and how do we) keep track of which one is the modified version?
Note: I've read "Implementing equals() and hashCode()" in the Hibernate docs, and it doesn't deal with the situation I'm worried about, at least directly, nor does it explain in any detail what Hibernate really needs out of equals() and hashCode(). Neither does the answer to "equals and hashcode in Hibernate", or I wouldn't have bothered to post this.
You only need to override equals() and hashCode() if the entity will be used in a Set (which is very common) AND the entity will be detached from, and subsequently re-attached to, Hibernate sessions (which is an uncommon usage of Hibernate).
You must override hashCode() in every class that overrides equals(). Failure to do so will result in a violation of the general contract for Object.hashCode(), which will prevent your class from functioning properly in conjunction with all hash-based collections, including HashMap, HashSet, and Hashtable.
An object's hash code may differ between executions of the same application. If two objects are equal according to equals(), their hash codes must be the same. If two objects are unequal according to equals(), their hash codes are not required to differ.
Java recommends overriding equals() and hashCode() when equality is defined logically or by business rules, and many classes in the Java standard library do so; e.g. String overrides equals(), whose implementation returns true if the content of two String objects is exactly ...
First of all, your original idea, that you should implement equals() and hashCode() only on immutable objects, certainly works, but it's stricter than it needs to be. You just need these two methods to rely on immutable fields. Any field whose value may change is unsuitable for use in those two methods, but the other fields need not be immutable.
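To make the point about relying only on immutable fields concrete, here is a minimal sketch in plain Java. The class Book and its fields isbn and shelf are hypothetical names made up for illustration; nothing here is Hibernate-specific.

```java
import java.util.Objects;

// Hypothetical class: only the final field `isbn` takes part in equality.
class Book {
    private final String isbn; // never changes after construction
    private String shelf;      // mutable, deliberately ignored by equals()/hashCode()

    Book(String isbn, String shelf) {
        this.isbn = Objects.requireNonNull(isbn);
        this.shelf = shelf;
    }

    void setShelf(String shelf) {
        this.shelf = shelf;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Book)) return false;
        return isbn.equals(((Book) o).isbn);
    }

    @Override
    public int hashCode() {
        // Built from the same immutable field as equals(), so it stays stable.
        return isbn.hashCode();
    }
}
```

Because shelf plays no part in either method, changing it while the object sits in a HashSet does not disturb the set.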
Having said that, Hibernate knows they're the same object by comparing their primary keys. This leads many people to write those two methods to rely on the primary key. Hibernate docs recommend you don't do it this way, but many people ignore this advice without much trouble. It means you can't add entities to a Set until after they've been persisted, which is a restriction that's not too hard to live with.
Hibernate docs recommend using a business key. But the business key should rely on fields that uniquely identify an object. The Hibernate docs say "use a business key that is a combination of unique, typically immutable, attributes." I use fields that have a unique constraint on them in the database. So, if your SQL CREATE TABLE statement specifies a constraint as
CONSTRAINT uc_order_num_item UNIQUE (order_num, order_item)
then those two fields can be your business key. That way, if you change one of them, both Hibernate and Java will treat the modified object as a different object. Of course, if you do change one of these "immutable" fields, you mess up any Set they belong to. So I guess you need to document clearly which fields comprise the business key, and write your application with the understanding that fields in the business key should never be changed for persisted objects. I can see why people ignore the advice and just use the primary key. But you could define the primary key like this:
CONSTRAINT pk_order_num_item PRIMARY KEY (order_num, order_item)
And you would still have the same problem.
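As a rough illustration of a business key built from order_num and order_item, and of how changing a key field breaks a Set, here is a plain-Java sketch. The class OrderLine, the quantity field, the setter, and the demo helper are all hypothetical; a real entity would carry mapping annotations that are left out here.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical entity whose business key is (orderNum, orderItem).
class OrderLine {
    private String orderNum;
    private int orderItem;
    private int quantity; // not part of the business key

    OrderLine(String orderNum, int orderItem, int quantity) {
        this.orderNum = orderNum;
        this.orderItem = orderItem;
        this.quantity = quantity;
    }

    // Exists only to demonstrate the hazard; key fields should not change.
    void setOrderItem(int orderItem) {
        this.orderItem = orderItem;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof OrderLine)) return false;
        OrderLine other = (OrderLine) o;
        return orderItem == other.orderItem && Objects.equals(orderNum, other.orderNum);
    }

    @Override
    public int hashCode() {
        return Objects.hash(orderNum, orderItem);
    }
}

class BusinessKeyDemo {
    // Reports whether the set can still find the line after its key changed.
    static boolean foundAfterKeyChange() {
        Set<OrderLine> lines = new HashSet<>();
        OrderLine line = new OrderLine("A-100", 1, 5);
        lines.add(line);
        line.setOrderItem(2);        // hash code changes while the object is in the set
        return lines.contains(line); // the set looks it up under the new hash and misses it
    }
}
```

The object is still physically in the set, but it was stored under its old hash code, so lookups with the new key values no longer reach it.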
Personally, I would like to see an annotation that specifies every field in the business key, and have an IDE inspection that checks if I modify it for persisted objects. Maybe that's asking too much.
Another approach, one that solves all of these problems, is to use a UUID for the primary key, which you generate on the client when you first construct an unpersisted entity. Since you never need to show it to the user, your code is not likely to change its value once you set it. This lets you write hashCode() and equals() methods that always work, and remain consistent with each other.
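A minimal sketch of the UUID idea, in plain Java and without any mapping annotations. The class Customer and its two constructors are hypothetical; the second constructor only stands in for an instance rebuilt from stored data with a known identifier.

```java
import java.util.UUID;

// Hypothetical entity; in a real mapping `id` would be the primary key (assumption).
class Customer {
    private final UUID id;
    private String name;

    // New, not-yet-persisted instance: the identifier exists from the start.
    Customer(String name) {
        this(UUID.randomUUID(), name);
    }

    // Stands in for an instance rebuilt from stored data with a known id.
    Customer(UUID id, String name) {
        this.id = id;
        this.name = name;
    }

    UUID getId() {
        return id;
    }

    void setName(String name) {
        this.name = name;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Customer)) return false;
        return id.equals(((Customer) o).id);
    }

    @Override
    public int hashCode() {
        // Depends only on the id, which is assigned once and never changed.
        return id.hashCode();
    }
}
```

Since the id is never null and never reassigned, the hash code is the same before and after the entity is saved, and two instances carrying the same id compare equal regardless of their other data.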
One more thing: If you want to avoid the problem of adding an object to a Set that already contains a different (modified) version of it, the only way is to always ask the set if it's already there before adding it. Then you can write code to handle that special case.
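One way to handle that special case might look like the sketch below. It relies on the documented behaviour of Set.add, which leaves the set unchanged when an equal element is already present, so the older version has to be removed explicitly. The class Versioned and the helper addOrReplace are hypothetical names, and replacing the old element is only one possible policy.

```java
import java.util.Set;

// Hypothetical element type: equality is by id only, payload may differ.
class Versioned {
    final long id;
    final String payload;

    Versioned(long id, String payload) {
        this.id = id;
        this.payload = payload;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof Versioned && ((Versioned) o).id == id;
    }

    @Override
    public int hashCode() {
        return Long.hashCode(id);
    }
}

class SetUpdates {
    // Returns true if an older element equal to `updated` was replaced.
    static <T> boolean addOrReplace(Set<T> set, T updated) {
        boolean alreadyThere = set.contains(updated);
        if (alreadyThere) {
            set.remove(updated); // drop the stale version first
        }
        set.add(updated);
        return alreadyThere;
    }
}
```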
What semantics does JPA/Hibernate impose?
The JPA specification says the following.
2.4 Primary Keys and Entity Identity
Every entity must have a primary key. ... The value of its primary key uniquely identifies an entity instance within a persistence context and to EntityManager operations
I interpret that as saying the semantics of equivalence for JPA entities is equivalence of primary keys. That suggests the equals() method should compare the primary keys for equivalence, and nothing else.
But the Hibernate advice you reference (and another article I've seen) say not to do that, and to use a "business key" rather than the primary key. The reason seems to be that we cannot guarantee that an entity object has a value for a generated primary key until the entity has been synchronized (using EntityManager.flush()) to the database.
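The generated-key concern can be simulated in plain Java, without a persistence provider. In this sketch the class Invoice, the method assignId (which stands in for whatever the provider does when it sets the generated key), and the demo helper are all hypothetical.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical entity whose equality is based purely on a generated id.
class Invoice {
    private Long id; // null until a key has been generated

    // Stand-in for the provider assigning the generated key (assumption).
    void assignId(long generated) {
        this.id = generated;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Invoice)) return false;
        Invoice other = (Invoice) o;
        // Two instances without an id are never considered equal.
        return id != null && id.equals(other.id);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(id); // 0 while id is null, changes once it is set
    }
}

class GeneratedIdDemo {
    // Reports whether the set can still find the invoice after it received its id.
    static boolean foundAfterIdAssigned() {
        Set<Invoice> invoices = new HashSet<>();
        Invoice invoice = new Invoice();
        invoices.add(invoice);    // stored under the hash of a null id
        invoice.assignId(42L);    // hash code changes here
        return invoices.contains(invoice);
    }
}
```

This is the same restriction mentioned earlier: with primary-key equality on a generated key, an entity should not go into a hash-based collection until after its key has been assigned.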