Joshua Bloch says in Effective Java:
You must override hashCode() in every class that overrides equals(). Failure to do so will result in a violation of the general contract for Object.hashCode(), which will prevent your class from functioning properly in conjunction with all hash-based collections, including HashMap, HashSet, and Hashtable.
Let's try to understand it with an example of what would happen if we override equals() without overriding hashCode() and attempt to use a Map.
Say we have a class like this, where two objects of MyClass are equal if their importantField is equal (with hashCode() and equals() generated by Eclipse):
public class MyClass {
private final String importantField;
private final String anotherField;
public MyClass(final String equalField, final String anotherField) {
this.importantField = equalField;
this.anotherField = anotherField;
}
@Override
public int hashCode() {
final int prime = 31;
int result = 1;
result = prime * result
+ ((importantField == null) ? 0 : importantField.hashCode());
return result;
}
@Override
public boolean equals(final Object obj) {
if (this == obj)
return true;
if (obj == null)
return false;
if (getClass() != obj.getClass())
return false;
final MyClass other = (MyClass) obj;
if (importantField == null) {
if (other.importantField != null)
return false;
} else if (!importantField.equals(other.importantField))
return false;
return true;
}
}
Imagine you have this
MyClass first = new MyClass("a","first");
MyClass second = new MyClass("a","second");
Override only equals
If only equals is overridden, then when you call myMap.put(first,someValue), first will hash to some bucket, and when you call myMap.put(second,someOtherValue), second will almost certainly hash to some other bucket (as the default hashCode gives distinct objects different hash codes). So, although they are equal, they don't hash to the same bucket, the map can't realize it, and both of them stay in the map.
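A minimal sketch of this case, assuming a hypothetical EqualsOnlyKey class (a cut-down MyClass that keeps equals() but drops hashCode()). The class and method names are illustrative and not part of the original answer.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical variant of MyClass: equals() is overridden, hashCode() is not.
final class EqualsOnlyKey {
    private final String importantField;

    EqualsOnlyKey(String importantField) {
        this.importantField = importantField;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (!(obj instanceof EqualsOnlyKey)) return false;
        return Objects.equals(importantField, ((EqualsOnlyKey) obj).importantField);
    }
    // hashCode() is deliberately missing, so Object's identity-based hash is used.
}

class EqualsOnlyDemo {
    // Puts two equal keys into a HashMap and returns the resulting size.
    static int sizeAfterTwoPuts() {
        Map<EqualsOnlyKey, String> myMap = new HashMap<>();
        myMap.put(new EqualsOnlyKey("a"), "someValue");
        myMap.put(new EqualsOnlyKey("a"), "someOtherValue");
        return myMap.size();
    }

    public static void main(String[] args) {
        // Expected to show two entries unless the identity hashes happen to collide.
        System.out.println(sizeAfterTwoPuts());
    }
}
```

The two keys compare equal, yet the map keeps both, because the lookup for the second key never reaches the entry of the first.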
Override only hashCode
Although it is not necessary to override equals() if we override hashCode(), let's see what would happen in this particular case where we know that two objects of MyClass are equal if their importantField is equal but we do not override equals().
If you only override hashCode then when you call myMap.put(first,someValue) it takes first, calculates its hashCode and stores it in a given bucket. Then when you call myMap.put(second,someOtherValue) it should replace first with second as per the Map Documentation because they are equal (according to the business requirement).
But the problem is that equals was not redefined, so when the map hashes second and iterates through the bucket looking if there is an object k such that second.equals(k) is true it won't find any as second.equals(first) will be false.
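The hashCode-only case can be sketched the same way, assuming a hypothetical HashCodeOnlyKey class that keeps hashCode() but inherits equals() from Object. The names are illustrative.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical variant of MyClass: hashCode() is overridden, equals() is not.
final class HashCodeOnlyKey {
    private final String importantField;

    HashCodeOnlyKey(String importantField) {
        this.importantField = importantField;
    }

    @Override
    public int hashCode() {
        return importantField == null ? 0 : importantField.hashCode();
    }
    // equals() is deliberately missing, so Object's reference comparison is used.
}

class HashCodeOnlyDemo {
    // Puts two keys with the same hash code into a HashMap and returns the size.
    static int sizeAfterTwoPuts() {
        Map<HashCodeOnlyKey, String> myMap = new HashMap<>();
        myMap.put(new HashCodeOnlyKey("a"), "someValue");
        myMap.put(new HashCodeOnlyKey("a"), "someOtherValue");
        return myMap.size();
    }

    public static void main(String[] args) {
        System.out.println(sizeAfterTwoPuts()); // prints 2
    }
}
```

Both keys land in the same bucket, but the inherited equals() compares references, so the second put adds a new entry instead of replacing the first.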
Hope it was clear
Answer from Lombo on Stack Overflow: Why do I need to override the equals and hashCode methods in Java?
Collections such as HashMap and HashSet use the hash code of an object to determine where it should be stored inside the collection, and the hash code is used again to locate the object in the collection.
Hashing retrieval is a two-step process:
- Find the right bucket (using hashCode())
- Search the bucket for the right element (using equals())
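The two steps can be sketched with a toy hash set. This is an illustration only, not how java.util.HashSet is actually implemented, and ToyHashSet with its methods is a made-up name. It also assumes non-null elements.

```java
import java.util.ArrayList;
import java.util.List;

// Toy hash set for illustration; assumes non-null elements and a fixed bucket count.
class ToyHashSet<E> {
    private final List<List<E>> buckets = new ArrayList<>();

    ToyHashSet(int bucketCount) {
        for (int i = 0; i < bucketCount; i++) {
            buckets.add(new ArrayList<>());
        }
    }

    // Step 1: hashCode() selects the bucket.
    private List<E> bucketFor(Object o) {
        int index = Math.floorMod(o.hashCode(), buckets.size());
        return buckets.get(index);
    }

    // Step 2: equals() searches inside that bucket only.
    boolean contains(Object o) {
        for (E candidate : bucketFor(o)) {
            if (candidate.equals(o)) {
                return true;
            }
        }
        return false;
    }

    // Adds the element unless an equal one is already in its bucket.
    boolean add(E e) {
        if (contains(e)) {
            return false;
        }
        bucketFor(e).add(e);
        return true;
    }
}
```

If two equal objects return different hash codes, step 1 sends the lookup to the wrong bucket and step 2 never gets a chance to compare them.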
Here is a small example of why we should override both equals() and hashCode().
Consider an Employee class which has two fields: age and name.
public class Employee {
String name;
int age;
public Employee(String name, int age) {
this.name = name;
this.age = age;
}
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
public int getAge() {
return age;
}
public void setAge(int age) {
this.age = age;
}
@Override
public boolean equals(Object obj) {
if (obj == this)
return true;
if (!(obj instanceof Employee))
return false;
Employee employee = (Employee) obj;
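// Compare names with equals(), not ==, which only compares references
return employee.getAge() == this.getAge()
&& (getName() == null ? employee.getName() == null
: getName().equals(employee.getName()));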
}
// commented
/* @Override
public int hashCode() {
int result=17;
result=31*result+age;
result=31*result+(name!=null ? name.hashCode():0);
return result;
}
*/
}
Now create a class, insert an Employee object into a HashSet and test whether an equal object is present or not.
import java.util.HashSet;

public class ClientTest {
public static void main(String[] args) {
Employee employee = new Employee("rajeev", 24);
Employee employee1 = new Employee("rajeev", 25);
Employee employee2 = new Employee("rajeev", 24);
HashSet<Employee> employees = new HashSet<Employee>();
employees.add(employee);
System.out.println(employees.contains(employee2));
System.out.println("employee.hashCode(): " + employee.hashCode()
+ " employee2.hashCode():" + employee2.hashCode());
}
}
It will print something like the following (the default hash codes differ from run to run):
false
employee.hashCode(): 321755204 employee2.hashCode():375890482
Now uncomment the hashCode() method, run the same code, and the output would be:
true
employee.hashCode(): -938387308 employee2.hashCode():-938387308
Now can you see why if two objects are considered equal, their hashcodes must
also be equal? Otherwise, you'd never be able to find the object since the default
hashcode method in class Object virtually always comes up with a unique number
for each object, even if the equals() method is overridden in such a way that two
or more objects are considered equal. It doesn't matter how equal the objects are if
their hashcodes don't reflect that. So one more time: If two objects are equal, their
hashcodes must be equal as well.
So I do some coding interviews at my company, and one question I ask nearly every person who applies for a Java developer job is what the equals and hashCode methods are. I just want to see if the person understands what they do, how you could implement both, what the contract between them is, and where the pitfalls are, like on JPA entities.
It is not about how to generate those methods. I want to see if they have thought about the concept.
One thing many people say is that it is a good idea to use hashCode to implement equals, so that you just compare the hashes.
I think it is clear that this is not a good solution, since unequal objects can share a hash code. But it is interesting that so many people say just that. And I think when I learned Java, I read this myself, but I am not sure where.
Does anyone have an idea where this misconception comes from? Maybe it was taught in a book or so?
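A short sketch of why comparing hashes is not a valid equals(): two different strings can have the same hash code. The class and method names below are illustrative.

```java
class HashIsNotEquality {
    // What a hash-comparing "equals" would effectively compute.
    static boolean sameHash(Object a, Object b) {
        return a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        // "Aa" and "BB" are different strings with the same String.hashCode() value.
        System.out.println(sameHash("Aa", "BB")); // prints true
        System.out.println("Aa".equals("BB"));    // prints false
    }
}
```

The contract only runs one way: equal objects must have equal hash codes, but equal hash codes say nothing about equality.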
Should we always override equals and hashCode even if we don't intend at that point to use the class with any Collection classes?
No and I would go even further as to say you probably don't need to override them even if you are going to put them into a collection. The default implementations are perfectly compatible with collections as is. For lists, it's irrelevant anyway.
You should override equals if and only if your objects have a logical identity that is independent of their physical identity. In other words, if you want to have multiple objects that represent the same entity. Integer is a good example of this. The integer 2 is always 2, regardless of whether there are 100 instances or 1 instance of the Integer object with the value 2.
You must override hashcode if you've overridden equals.
That's it. There are no other good reasons to modify these.
As a side note, the use of equals to implement algorithms is highly overused. I would only do this if your object has a true logical identity. In most cases, whether two objects represent the same thing is highly context dependent and it's easier and better to use a property of the object explicitly and leave equals alone. There are many pitfalls to overriding equals, especially if you are using inheritance.
An example:
Let's say you have an online store and you are writing a component that fulfills orders. You decide to override the equals() method of the Order object to be based on the order number. As far as you know, that's its identity representation. You query a DB every so often and create objects from the response. You take each Order object and keep it as a key in a set of all the orders in process. But there's a problem: orders can be modified within a certain time frame. You now might get a response from the DB that contains the same order number but with different properties.
You can't just write this object to your set because you'll lose track of what was in process. You might end up processing the same order twice. So, what are the options? You could update the equals() method to include a version number or something additional. But now you have to go through your code to figure out where the logic needs to be based on having the same order number and where it needs to be based on the new object identity. In other words, there's not just one answer to whether two objects represent the same thing. In some contexts it's the order, and in some contexts it's the order and the version.
Instead of the set, you should build a map and use the order number as the key. Now you can check for the existence of the key in the map and retrieve the other object for comparison. This is much more straightforward and easier to understand than trying to make sense of when the equals method works the way you need it to for different needs.
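A rough sketch of the map-based approach, under stated assumptions: the Order fields (orderNumber, version) and the OrderTracker class are hypothetical names chosen for illustration, and the version check stands in for whatever comparison the context needs.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical Order type; equals()/hashCode() are intentionally left at their defaults.
final class Order {
    final String orderNumber;
    final int version; // assumed field, standing in for "different properties"

    Order(String orderNumber, int version) {
        this.orderNumber = orderNumber;
        this.version = version;
    }
}

class OrderTracker {
    // Orders in process, keyed explicitly by order number.
    private final Map<String, Order> inProcess = new HashMap<>();

    // Returns true if the order is new or differs from the one already tracked.
    boolean track(Order incoming) {
        Order known = inProcess.get(incoming.orderNumber);
        if (known != null && known.version == incoming.version) {
            return false; // same order, same version: nothing to do
        }
        inProcess.put(incoming.orderNumber, incoming);
        return true;
    }

    int size() {
        return inProcess.size();
    }
}
```

The notion of "same order" lives in the map key, while the decision about what counts as a change stays visible in track() instead of being hidden inside equals().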
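A good example of this kind of complexity can be found in the BigDecimal class. You might think that new BigDecimal("2.0") and new BigDecimal("2.00") are equal. But according to equals() they are not, because equals() also compares the scale, whereas compareTo() treats them as equal.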
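The BigDecimal behaviour can be checked with a few lines. The wrapper class and method names are illustrative.

```java
import java.math.BigDecimal;

class BigDecimalEquality {
    // equals() compares both value and scale.
    static boolean equalByEquals(String a, String b) {
        return new BigDecimal(a).equals(new BigDecimal(b));
    }

    // compareTo() compares the numeric value only.
    static boolean equalByCompareTo(String a, String b) {
        return new BigDecimal(a).compareTo(new BigDecimal(b)) == 0;
    }

    public static void main(String[] args) {
        System.out.println(equalByEquals("2.0", "2.00"));    // prints false
        System.out.println(equalByCompareTo("2.0", "2.00")); // prints true
    }
}
```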
Conclusion:
It can make sense to implement equals() (make sure you update hashcode() too if you do) and doing so doesn't prevent the use of any of the techniques I describe here.
But if you are doing this:
- Make sure your object is immutable.
- Use every meaningful property of the object to define equals.
- If you are using inheritance, either:
  - make equals final on the base class, OR
  - check that the actual type matches in subclasses.
If any of these aren't followed, you are in for trouble.
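As one illustration of the trouble, here is a sketch of what happens when the immutability rule is broken. MutableName is a hypothetical class whose equals() and hashCode() depend on a field that changes after the object is added to a HashSet.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical mutable key; the class and field names are illustrative only.
final class MutableName {
    private String name;

    MutableName(String name) {
        this.name = name;
    }

    void setName(String name) {
        this.name = name;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (!(obj instanceof MutableName)) return false;
        return Objects.equals(name, ((MutableName) obj).name);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(name);
    }
}

class MutableKeyDemo {
    // Adds a key, mutates it, then asks the set for that very same object.
    static boolean foundAfterMutation() {
        Set<MutableName> names = new HashSet<>();
        MutableName key = new MutableName("Ann");
        names.add(key);
        key.setName("Bob"); // hash code changes while the object sits in the set
        return names.contains(key);
    }

    public static void main(String[] args) {
        System.out.println(foundAfterMutation()); // prints false
    }
}
```

The object is still inside the set, but it was filed under its old hash code, so a lookup with the new hash code cannot find it.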
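In Java, if you don't override anything, equality is defined as object references being equal, and the hash code is an identity-based value supplied by the JVM (often described as derived from the object's address, though that is an implementation detail). That works just fine to put objects into containers.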
And there are objects where this is exactly right. For example an object representing a window in the UI. Two windows are equal if and only if they are the same window.
PS. It makes perfect sense to have a set of windows, or a dictionary with windows as keys, or an array of windows in which you look up whether it contains some window. So you need to be able to compare windows for equality.
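A small sketch of the window case, assuming a hypothetical UiWindow class that overrides neither equals() nor hashCode().

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical window type relying on Object's default equals() and hashCode().
final class UiWindow {
    final String title;

    UiWindow(String title) {
        this.title = title;
    }
}

class WindowSetDemo {
    // The same window object is found in the set.
    static boolean containsSameWindow() {
        Set<UiWindow> open = new HashSet<>();
        UiWindow main = new UiWindow("Main");
        open.add(main);
        return open.contains(main);
    }

    // A different window with the same title is not considered equal.
    static boolean containsLookalike() {
        Set<UiWindow> open = new HashSet<>();
        open.add(new UiWindow("Main"));
        return open.contains(new UiWindow("Main"));
    }

    public static void main(String[] args) {
        System.out.println(containsSameWindow()); // prints true
        System.out.println(containsLookalike());  // prints false
    }
}
```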