How (and why) to cook equals() and hashCode() in Java

This collection uses hashCode() and equals() method to proceed put() and get() operations.

Warning: as HashMap is not the main topic of this article, I’ll provide you the simplified version of the processes under the hood, as for now only the main conceptions are important — so, if you are interested in the true flow, please, let me know, and I’ll create a separate post about HashMap.

HashMap contains several buckets (the number changes if the size is reached), that could be used to store the key-value elements (such pair called Entry) in them.

The bucket could contain zero, one or several elements.

If there are several elements, it is called a collision, and the elements are stored in a LinkedList (it is only partly true, but, again, we are focusing on the concept right now) inside the bucket.

So, when the put(Key key, Value value) method is called, the following steps are performed under the hood:If key is null, the key-value pair (Entry) is put into the 1st bucket;If not null, the hashCode() method is called on the key;HashMap takes the key hash value, processes it through the internal computations to get the number of the bucket to be used;If the chosen bucket is empty, the Entry is put there, and that’s it;If the bucket is not empty, it iterates through all the existing entries inside the bucket and compares their keys with the key from the put() method using equals();If equals() returns false for each iteration, the Entry will be stored in the current bucket;If equals() returns true for some case, so the Entry has the same key as from the put() method, this Entry value will be replaced by the new value (argument from the put() method).

Game over.

Now we can figure out the first possible issue with our hard-coded hashCode() implementation: each not-null instance of the Person class will return 42 on the 2nd step, so all these instances are put to the same bucket, as the 3rd step computations produce the same output for the same input.

We will face the collision.

What’s wrong with the collision?.Let me show you while explaining the mechanism of get(Key key) method steps:If key is null, we go to the 1st bucket and look for the Entry with the key == null — if exists, its value is returned;If key is not null, the hashCode() method is called on the key;HashMap takes the key hash value, processes it through the internal computations to get the number of the bucket to be used;We go to the bucket with the number providing on the 3rd step;If the bucket is empty — return null;If that bucket contains only one Entry, we compare (equals()) that Entry key with our key: if true — return the Entry value, if false — return null;If the bucket contains several elements, we iterate through each of them and compare (equals()) the key of each Entry with our key: if true — return matching Entry value, if false for each — return null.

In our hardcoded hashCode() case, we’ll go through the 7th step scenario.

 As you might have noticed, there is one important difference between scenarios from the steps 5/6 and the 7th one: the 5th and 6th contains no collision, so it takes only one operation (key.

equals(entry.

getKey()) to find the value, but on the 7th step due to the collision, we have to do those operation (key.

equals(entry.

getKey()) several times (in the worst casen times, where n is the quantity of the entries inside the bucket).

 So, the collisions destroy the main feature of HashMap – extremely fast get-operations (one operation O(1) vs n operations O(n)).

Another possible issue in HashMap could happen if the hashCode() returns new hash value on each call (or isn’t consistent with the equals() method).

In that case, the 2nd step will return new int data each time, so the 3rd step leads us to the wrong bucket and we will never find the needed Entry — the data will be lost, that is an extremely bad issue nowadays (imagine that your salary payment is lost, because the bank developer forgot to implement hashCode() method properly for the Payment class).

So, you have just seen, that the equals() and hashCode() methods work together inside the HashMap to get and put the data: hashCode() is used to compute the bucket number and equals() is used to find the Entry with the same key.

Now we know more than enough to implement the hashCode() method finally.

Hint: to be compliant with the contract between the hashCode() and equals() methods, it is considered to be a good practice to use the same fields in these 2 methods.

So, as we decided to use idNumber for comparison purposes, we could use it inside the hashCode() method also.

But, as you remember, idNumber is long type, but hashCode()returns int.

Piece of cake: we’ll use Objects.

hash() for this:Even if idNumber is int (or any other data type), it is a good practice to hash it using the mentioned method, as the hash algorithm will take care of providing such a data, that will be more appropriate to omitting the collisions.

Now, our Person class final version looks so:Good job, folk!Let me summarize all the stuff of this article:equals() method:Is used to compare 2 non-primitive objects;Rules of the method:- reflexive: a.

equals(a) must always return true;- symmetric: if a.

equals(b) returns true, then b.

equals(a) must always return true;- transitive: if a.

equals(b) returns true and a.

equals(c) returns true, it means, that b.

equals(c) must be true;- consistent: if a.

equals(b) returns true (or false) and neither a nor b has been changed, a.

equals(b) should always return the same result.

Steps to implement: 1) compare objects using == operator; 2) check, whether the Object o parameter is not null.

If yes, return false; 3) check, whether the Object o parameter has the same type, as our instance.

If no, return false; 4) cast the Object o to the current class type; 5) compare the chosen fields values.

hashCode() methods:Is used for the hashing purposes — to provide the int value for the object;Rules of the method:- each invocation of hashCode() method on the same object that hasn’t been changed must produce the same result each time; – if 2 objects are equal through the equals() method, then invoking the hashCode() method on each of these 2 objects must produce the same result; – but if invoking the hashCode() method on 2 objects produces the same result, it doesn’t mean that they are equal (through the equals() method).

The last 2 rules are also known as a contract between equals() and hashCode() methods.

HashMap and equals()/hashCode() method usage:HashMap is the collection for storing key-value pair elements, that uses hashing under the hood to provide extremely fast get operations;The elements of the HashMap are stored into buckets;hashCode() results are used to compute the number of the bucket, where the element will be stored;equals() is used to compare elements keys to find the needed element;If the hashCode() is implemented in a hardcoded way, all the instances of the same class will be stored in one bucket, that will lead to the performance issues (n operations instead of 1 for the get process);Also, the element could be lost, if the hashCode() returns new value each time or isn’t compliant with the contract between equals() and hashCode().

It was a long journey, but we have done it together.

I hope you have learned something new and didn’t fall asleep during the reading.

Anyway, thanks and see you in the comment section and in the future posts!Have fun!.

. More details

Leave a Reply