Hashing
Hashing
Hashing
Hash Table
Hash table is one of the most important data structures that uses a special function known as
a hash function that maps a given value with a key to access the elements faster.
A Hash table is a data structure that stores some information, and the information has two
main components, i.e., key and value. The hash table can be implemented with the help of an
associative array. The efficiency of mapping depends upon the efficiency of the hash function
used for mapping.
For example, suppose the key value is John and the value is the phone number, so when we
pass the key value in the hash function shown as below:
Hash(key)= index;
When we pass the key in the hash function, then it gives the index.
Hash(john) = 3;
The above example adds the john at the index 3.
The main idea behind the hashing is to create the (key/value) pairs. If
the key is given, then the algorithm computes the index at which the
value would be stored. It can be written as:
Index = hash(key)
Hash Function
Hash Function is used to index the original value or key and then used later
each time the data associated with the value or key is to be retrieved. Thus,
hashing is always a one-way operation. There is no need to "reverse
engineer" the hash function by analyzing the hashed values.
1) Division method
2) Multiplication Method
3) Mid square method
4) Folding method
Collision
When the two different values have the same value, then the problem
occurs between the two values, known as a collision.
In the above example, the value is stored at index 6. If the key value is
26, then the index would be:
h(26) = 26%10 = 6
Therefore, two values are stored at the same index, i.e., 6, and this
leads to the collision problem. To resolve these collisions, we have
some techniques known as collision resolution techniques.
The value 11 would be stored at the index 5. Now, we have two values (6, 11) stored at the
same index, i.e., 5. This leads to the collision problem, so we will use the chaining method to
avoid the collision. We will create one more list and add the value 11 to this list. After the
creation of the new list, the newly created list will be linked to the list having value 6.
The value 7 would be stored at index 7. Now, we have two values (2, 7) stored at the same
index, i.e., 7. This leads to the collision problem, so we will use the chaining method to
avoid the collision. We will create one more list and add the value 7 to this list. After the
creation of the new list, the newly created list will be linked to the list having value 2.
a) Linear probing
b) Quadratic probing
c) Double Hashing technique
Linear Probing
Linear probing is one of the forms of open addressing. As we know
that each cell in the hash table contains a key-value pair, so when the
collision occurs by mapping a new key to the cell already occupied by
another key, then linear probing technique searches for the closest
free locations and adds a new key to that empty cell. In this case,
searching is performed sequentially, starting from the position where
the collision occurs till the empty cell is not found.
Let's understand the linear probing through an example.
The key values 3, 2, 9, 6 are stored at the indexes 9, 7, 1, 5 respectively. The calculated
index value of 11 is 5 which is already occupied by another key value, i.e., 6. When
linear probing is applied, the nearest empty cell to the index 5 is 6; therefore, the value
11 will be added at the index 6.
The next key value is 13. The index value associated with this key value is 9 when hash
function is applied. The cell is already filled at index 9. When linear probing is applied,
the nearest empty cell to the index 9 is 0; therefore, the value 13 will be added at the
index 0.
The next key value is 7. The index value associated with the key value is
7 when hash function is applied. The cell is already filled at index 7.
When linear probing is applied, the nearest empty cell to the index 7 is
8; therefore, the value 7 will be added at the index 8.
The next key value is 12. The index value associated with the key value
is 7 when hash function is applied. The cell is already filled at index 7.
When linear probing is applied, the nearest empty cell to the index 7 is
2; therefore, the value 12 will be added at the index 2.
Double Hashing
In double hashing, two hash functions are used. Suppose h1(k) is one of
the hash functions used to calculate the locations whereas h2(k) is
another hash function. It can be defined as "insert ki at first free place
from (u+v*i)%m where i=(0 to m-1)". In this case, u is the location
computed using the hash function and v is equal to (h2(k)%m).
Consider the same example that we use in quadratic probing.
A = 3, 2, 9, 6, 11, 13, 7, 12 where m = 10, and
h1(k) = 2k+3
h2(k) = 3k+1
key
Location (u)
v
probe
3
((2*3)+3)%10 = 9
-
1
2
((2*2)+3)%10 = 7
-
1
9
((2*9)+3)%10 = 1
-
1
6
((2*6)+3)%10 = 5
-
1
11
((2*11)+3)%10 = 5
(3(11)+1)%10 =4
3
13
((2*13)+3)%10 = 9
(3(13)+1)%10 = 0
7
((2*7)+3)%10 = 7
(3(7)+1)%10 = 2
12
((2*12)+3)%10 = 7
(3(12)+1)%10 = 7
4
As we know that no collision would occur while inserting the keys (3, 2,
9, 6), so we will not apply double hashing on these key values.
On inserting the key 11 in a hash table, collision will occur because the
calculated index value of 11 is 5 which is already occupied by some
another value. Therefore, we will apply the double hashing technique
on key 11. When the key value is 11, the value of v is 4.
Now, substituting the values of u and v in (u+v*i)%m
When i=0
Index = (5+4*0)%10 =5
When i=1
Index = (5+4*1)%10 = 9
When i=2
Index = (5+4*2)%10 = 3
Since the location 3 is empty in a hash table; therefore, the key 11 is added at
the index 3.
The next element is 13. The calculated index value of 13 is 9 which is already
occupied by some another key value. So, we will use double hashing technique
to find the free location. The value of v is 0.
Now, substituting the values of u and v in (u+v*i)%m
When i=0
Index = (9+0*0)%10 = 9
We will get 9 value in all the iterations from 0 to m-1 as the value of v is zero.
Therefore, we cannot insert 13 into a hash table.
The next element is 7. The calculated index value of 7 is 7 which is already occupied by
some another key value. So, we will use double hashing technique to find the free
location. The value of v is 2.
Now, substituting the values of u and v in (u+v*i)%m
When i=0
Index = (7 + 2*0)%10 = 7
When i=1
Index = (7+2*1)%10 = 9
When i=2
Index = (7+2*2)%10 = 1
When i=3
Index = (7+2*3)%10 = 3
When i=4
Index = (7+2*4)%10 = 5
When i=5
Index = (7+2*5)%10 = 7
When i=6
Index = (7+2*6)%10 = 9
When i=7
Index = (7+2*7)%10 = 1
When i=8
Index = (7+2*8)%10 = 3
When i=9
Index = (7+2*9)%10 = 5
Since we checked all the cases of i (from 0 to 9), but we do not find suitable
place to insert 7. Therefore, key 7 cannot be inserted in a hash table.
The next element is 12. The calculated index value of 12 is 7 which is
already occupied by some another key value. So, we will use double
hashing technique to find the free location. The value of v is 7.
Now, substituting the values of u and v in (u+v*i)%m
When i=0
Index = (7+7*0)%10 = 7
When i=1
Index = (7+7*1)%10 = 4
Since the location 4 is empty; therefore, the key 12 is inserted at the index
4.
The final hash table would be: