Case Study: Hashmap Performance Improvement in Java 8

Download as pdf or txt
Download as pdf or txt
You are on page 1of 2

Case

Study: HashMap in Java


www.ritambhara.in

Case Study: HashMap performance improvement in Java 8

Java language’s utils package has HasMap class that implements Map interface using a hash table. It
implements two constant-time functions put and get that stores and retrieves values in HashMap.
HashMap stores information in Key-Value pair as demonstrated in Code 3.1.
import java.util.*;
public class RTDemo
{
public static void main(String args[])
{
HashMap<Integer,String> hash = new HashMap<Integer,String>();

hash.put(1000, "Ritambhara");
hash.put(2000, "Moksha");
hash.put(3000, "Radha");

System.out.println("Value for 1000 is: " + hash.get(1000));


System.out.println("Value for 2000 is: " + hash.get(2000));
System.out.println("Value for 3000 is: " + hash.get(3000));
}
}

Code: 3.1

The output of Code 3.1 is


Value for 1000 is: Ritambhara
Value for 2000 is: Moksha
Value for 3000 is: Radha

HashMap<Integer,String> declares a HashMap with String values stored on Integer keys.


Obviously, the implementation cannot keep an array of size 3000 to store these values. The key is passed
to a hash function that generates a smaller value indicating index in the actual hash table, something like.
Index = hashFunction(1000);

let us assume that size of hash table is two1 and hashFunction2 is as below
int hashFunction(int n)
{
return n%2;
}

Index for 1000, 2000 and 3000 are 1, 2 and 1 respectively. Both 1000 and 3000 generate same index.
This is called Collision in hashing. There are multiple ways to handle collision, one of them is thru
chaining.

Hash table holds a pointer to the head of linked list of records having same hash value.

1
Size of hash table is usually higher so that each bucket holds (preferably) single entry.

2 HashMap in Java uses hashCode() and equals() method of key.

© Ritambhara Technologies. +91-8377803450 [email protected]


Case Study: HashMap in Java
www.ritambhara.in

Calling get function with key 3000, retrieves the value in two steps:

1. Find the index of hash table that stores key 3000 using same hash function. This value comes out
to be 1.
2. Traverse the list of index 1 in the hash table to find record corresponding to key 3000.

In worst case, all keys correspond to the same index. Searching in such a hash table takes O(n) time. But
this is an almost impossible situation for a good hash table implementation. However, there are possibilities
that individual linked list has many values.

As you can see each collision has a direct impact on performance. Since searching in a linked list is linear,
worst-case time is O(k), where k is number of nodes in the linked list. What java-8 has done is, when a
linked list has more nodes, they replace it with a balanced binary tree dynamically3 improving the search
time from O(k) to O(lg(k)).

This simple improvement makes HashMap.get()function call of Java 8, on an average, 20% faster than
Java 7.

3
Also see: http://openjdk.java.net/jeps/180

© Ritambhara Technologies. +91-8377803450 [email protected]

You might also like