KCS301 Data Structure, UNIT 3: Searching, Sorting and Hashing


10/17/2022

UNIT-3 Searching, Sorting and Hashing

Table of Contents
Searching:
Concept of Searching
Sequential Search
Index Sequential Search
Binary Search
Sorting:
Concept of Sorting
Bubble Sort
Selection Sort
Insertion Sort
Quick Sort
Merge Sort
Heap Sort
Radix Sort
Hashing:
Concept of Hashing
Collision Resolution Techniques used in Hashing
3 Dr. Sunil Kumar, CSE Dept., MIET Meerut


Concept of Searching
Searching in data structures refers to the process
of finding the location LOC of an element in a list.
This is an important part of many data structure
algorithms, since an operation can be performed
on an element only after it has been found.


Why do we need Searching?

Searching is one of the core operations of computer
science. Today's computers store huge amounts of
information, and to retrieve that information
efficiently we need very efficient searching algorithms.


Types of Searching
Many different searching techniques exist; the most
commonly used are sequential (linear) search,
indexed sequential search and binary search.


Linear Search
Linear search or sequential search is a
method for finding a particular value in a
list that consists of checking every one of
its elements, one at a time and in sequence,
until the desired one is found.


How Linear Search works


Linear search in an array is usually programmed
by stepping up an index variable until it reaches
the last index.
This normally requires two comparisons for each
list item:
one to check whether the index has reached the end of
the array, and
another to check whether the item has the desired
value.


Linear Search Algorithm

Repeat For J = 1 to N:
    If (ITEM == A[J]) Then
        Print: ITEM found at location J
        Return
    [End of If]
[End of For Loop]
If (J > N) Then
    Print: ITEM doesn't exist
[End of If]
Exit


#include <stdio.h>

int main()
{
    int i, n, item, a[20];   /* indices 1..n are used, so n must be <= 19 */

    printf("\nEnter no of elements: ");
    scanf("%d", &n);

    printf("\nEnter %d elements: ", n);
    for (i = 1; i <= n; i++)
        scanf("%d", &a[i]);

    printf("\nEnter the element to be searched: ");
    scanf("%d", &item);

    for (i = 1; i <= n; i++)
    {
        if (a[i] == item)
        {
            printf("\n%d is present at position %d", item, i);
            break;
        }
    }
    if (i > n)
        printf("\nElement is not present.");
    return 0;
}


Complexity of Linear Search

For linear search on a list of n elements, the worst case
occurs when the search must visit every element once.
This happens when the value being searched for is either
the last element in the list or is not in the list at all.
On average, assuming the value searched for is in the list
and each list element is equally likely to be that value,
the search visits about n/2 elements. The best case occurs
when the desired element is in the first position, requiring
only one comparison.


Indexed Sequential Search

An index file can be used to overcome the problems
associated with sequential files and to speed up the
key search.
In this searching method, an index file is created first,
containing groups (divisions) of the required records.
Once the right index entry is found, the partial search
takes less time because the record is located within a
specified group.

Note: When the user makes a request for a specific record,
the index is used to find the group in which that record
is stored.

Indexed Sequential Search…

Characteristics of Indexed Sequential Search:
A sorted index is kept in addition to the array.
Each element in the index points to a block of elements in the array or
to another expanded index.
The index is searched first, and it guides the search in the array.

Note: Indexed sequential search may apply indexing multiple times,
creating an index of an index.
13 Dr. Sunil Kumar, CSE Dept., MIET Meerut

Indexed Sequential Search

Advantages:
More efficient than a plain sequential search.
The time required is less, since only the index and one
block of the array need to be scanned.
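The index-then-block scheme above can be sketched in C. This is an illustrative sketch, not from the slides: the function name `index_seq_search`, the gap `g` between sampled index entries, and the array-plus-length interface are all assumptions.

```c
/* Indexed sequential search sketch: the array must be sorted ascending.
 * An implicit index samples every g-th element; we scan the index to
 * find the block that could contain the key, then scan only that block.
 * Returns the position of key, or -1 if absent. */
int index_seq_search(const int a[], int n, int g, int key)
{
    int lo = 0, hi = n;                  /* boundaries of the target block */
    for (int i = 0; i < n; i += g) {     /* scan the (implicit) index */
        if (a[i] > key) { hi = i; break; }
        lo = i;
    }
    for (int j = lo; j < hi; j++)        /* sequential scan of one block */
        if (a[j] == key)
            return j;
    return -1;
}
```

Only the index entries and a single block are examined, which is where the time saving over plain sequential search comes from.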


Binary Search
A binary search or half-interval search algorithm
finds the position of a specified input value (the
search "key") within an array sorted by key value.
For binary search, the array should be arranged in
ascending or descending order.


How Binary Search Works


Searching a sorted collection is a common task.
A dictionary is a sorted list of word definitions.
Given a word, one can find its definition. A
telephone book is a sorted list of people's
names, addresses, and telephone numbers.
Knowing someone's name allows one to quickly
find their telephone number and address.


How Binary Search Works

• If the array is sorted, then we can apply the binary
search technique.

index:  0   1   2   3   4   5   6   7   8
value:  5  12  17  23  38  44  77  84  90

• The basic idea is straightforward. First examine the
value in the middle position. If key X is less than this
value, then search the middle of the left half next. If
X is greater than this value, then search the middle
of the right half next. Continue in this manner.
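The halving procedure described above can be written as a short C sketch; the function name `binary_search` and the array-plus-length interface are illustrative, not from the slides.

```c
/* Iterative binary search on an ascending array.
 * Returns the index of key, or -1 if key is absent. */
int binary_search(const int a[], int n, int key)
{
    int low = 0, high = n - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;  /* same as (low+high)/2, overflow-safe */
        if (a[mid] == key)
            return mid;
        else if (a[mid] < key)
            low = mid + 1;                 /* search the right half next */
        else
            high = mid - 1;                /* search the left half next */
    }
    return -1;                             /* low > high: unsuccessful search */
}
```

On the slide's array, searching for 44 follows exactly the low/high/mid trace shown on the next slides.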

Sequence of Successful Search

search( 44 ), with mid = ⌊(low + high) / 2⌋:

step   low   high   mid
#1      0      8     4

index:  0   1   2   3   4   5   6   7   8
value:  5  12  17  23  38  44  77  84  90

A[4] = 38 and 38 < 44, so low = mid + 1 = 5.


Sequence of Successful Search

search( 44 ), continued:

step   low   high   mid
#1      0      8     4
#2      5      8     6

A[6] = 77 and 44 < 77, so high = mid − 1 = 5.

Sequence of Successful Search

search( 44 ), continued:

step   low   high   mid
#1      0      8     4
#2      5      8     6
#3      5      5     5

A[5] = 44 and 44 == 44: successful search!


Sequence of Unsuccessful Search

search( 45 ):

step   low   high   mid
#1      0      8     4
#2      5      8     6
#3      5      5     5
#4      6      5     –

Now low > high: there are no more elements to search,
so the search is unsuccessful.


Concept of Sorting
Sorting is the arrangement/storage of data in sorted
order, which can be ascending or descending.
The term sorting comes into the picture together with
the term searching.
There are so many things in our real life that we need
to search: a particular record in a database, roll
numbers in a merit list, a particular telephone number,
a particular page in a book, etc.


Internal and External Sorting

Any sort algorithm that uses main memory
exclusively during the sorting is called an
internal sort.
Internal sorting is faster than external
sorting.


Internal Sorting
Bubble Sort
Selection Sort
Insertion Sort
Quick Sort
Heap Sort
Radix Sort
Bucket Sort
Shell Sort


External Sorting
Any sort algorithm that uses external memory,
such as tape or disk, during the sorting is
called an external sort.

Merge sort is an example of external sorting.


Stable and Unstable Sorting

If a sorting algorithm, after sorting the contents, does not
change the relative order of equal elements, it is called a
stable sort.

If a sorting algorithm, after sorting the contents, changes
the relative order of equal elements, it is called an
unstable sort.


Bubble Sort
Bubble sort is a simple sorting technique.
It is a comparison-based algorithm in which each pair
of adjacent elements is compared, and the elements are
swapped if they are out of order.
This sort is not suitable for large data sets, as its
average and worst-case complexity are O(n²),
where n is the number of items.


Bubble Sort...
1) Starting with the first element (index = 0), compare
the current element with the next element of the
array.
2) If the current element is greater than the next
element of the array, swap them.
3) If the current element is less than the next element,
move to the next element. Repeat Step 1.


Bubble Sort
Let us take the array of numbers "5 1 4 2 8", and sort the array from lowest number to
greatest number using bubble sort. In each step, elements written in bold are being
compared. Three passes will be required.
First Pass
( 5 1 4 2 8 ) → ( 1 5 4 2 8 ), Here, algorithm compares the first two elements, and swaps since 5 > 1.
( 1 5 4 2 8 ) → ( 1 4 5 2 8 ), Swap since 5 > 4
( 1 4 5 2 8 ) → ( 1 4 2 5 8 ), Swap since 5 > 2
( 1 4 2 5 8 ) → ( 1 4 2 5 8 ), Now, since these elements are already in order (8 > 5), algorithm does not swap
them.

Second Pass
( 1 4 2 5 8 ) → ( 1 4 2 5 8 )
( 1 4 2 5 8 ) → ( 1 2 4 5 8 ), Swap since 4 > 2
( 1 2 4 5 8 ) → ( 1 2 4 5 8 )
( 1 2 4 5 8 ) → ( 1 2 4 5 8 )
Now the array is already sorted, but the algorithm does not know whether it is completed.
The algorithm needs one whole pass without any swap to know it is sorted.

Third Pass
( 1 2 4 5 8 ) → ( 1 2 4 5 8 )
( 1 2 4 5 8 ) → ( 1 2 4 5 8 )
( 1 2 4 5 8 ) → ( 1 2 4 5 8 )
( 1 2 4 5 8 ) → ( 1 2 4 5 8 )

Algorithm
Bubble Sort (A, N)
Here A is an array with N elements. This algorithm
sorts the elements in the array A.
Step 1: Repeat Steps 2 and 3 for K = 1 to N-1.
Step 2:    Set PTR = 1.
Step 3:    Repeat while PTR <= N-K:
           a) If A[PTR] > A[PTR+1], then swap A[PTR] and A[PTR+1].
           b) Set PTR = PTR+1.
Step 4: EXIT
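The algorithm above translates into C as a short sketch (0-based indices instead of the slide's 1-based PTR; the `swapped` flag implements the "one whole pass without any swap" stopping rule from the worked example).

```c
/* Bubble sort with an early-exit flag: a full pass with no
 * swap means the array is already sorted. */
void bubble_sort(int a[], int n)
{
    for (int k = 0; k < n - 1; k++) {
        int swapped = 0;
        for (int j = 0; j < n - 1 - k; j++) {  /* last k items are in place */
            if (a[j] > a[j + 1]) {
                int t = a[j]; a[j] = a[j + 1]; a[j + 1] = t;
                swapped = 1;
            }
        }
        if (!swapped)
            break;
    }
}
```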


Complexity of Bubble Sort Algorithm

In bubble sort, n-1 comparisons are done in the 1st
pass, n-2 in the 2nd pass, n-3 in the 3rd pass, and so
on. So the total number of comparisons is:

f(n) = (n-1) + (n-2) + … + 2 + 1 = n(n-1)/2 = O(n²)


Selection Sort
Selection sort is conceptually the simplest
sorting algorithm.
This algorithm first finds the smallest element in
the array and exchanges it with the element in
the first position, then finds the second smallest
element and exchanges it with the element in the
second position, and continues in this way until
the entire array is sorted.



Algorithm
Selection Sort (A, N)
This algorithm sorts the array A with N elements.
Step 1: Repeat Steps 2 and 3 for K = 1 to N-1:
Step 2:    Call MIN(A, K, N, LOC).
Step 3:    Swap A[K] with A[LOC]:
           Set TEMP = A[K], A[K] = A[LOC], A[LOC] = TEMP.
Step 4: EXIT

41 Dr. Sunil Kumar, CSE Dept., MIET Meerut

Algorithm...
Procedure MIN (A, K, N, LOC)
This procedure finds the location LOC of the smallest element
among A[K], A[K+1], ..., A[N], where A is an array.
Step 1: Set MIN = A[K] and LOC = K.
Step 2: Repeat for J = K+1, K+2, ..., N:
        If MIN > A[J], then set MIN = A[J] and LOC = J.
Step 3: Return.
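The two routines above fold naturally into one C function; this sketch uses 0-based indices and inlines the MIN procedure as the inner loop.

```c
/* Selection sort: on pass k, find the location of the smallest
 * element in a[k..n-1] (the MIN procedure) and swap it into
 * position k. */
void selection_sort(int a[], int n)
{
    for (int k = 0; k < n - 1; k++) {
        int loc = k;                     /* location of current minimum */
        for (int j = k + 1; j < n; j++)
            if (a[j] < a[loc])
                loc = j;
        int t = a[k]; a[k] = a[loc]; a[loc] = t;
    }
}
```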


Complexity of Selection Sort Algorithm

• The number of comparisons in the selection sort
algorithm is independent of the original order of the
elements: there are n-1 comparisons during pass 1 to
find the smallest element, n-2 comparisons during
pass 2 to find the second smallest element, and so on.
Accordingly,

f(n) = (n-1) + (n-2) + … + 2 + 1 = n(n-1)/2 = O(n²)


Quick Sort
Quick sort, as the name suggests, sorts any list very
quickly.
Quick sort is not a stable sort, but it is very fast
and requires very little additional space.
It is based on the rule of Divide and Conquer
(it is also called partition-exchange sort).
This algorithm divides the list into three main
parts:
Elements less than the pivot
The pivot element
Elements greater than the pivot element

Quicksort I: Basic idea

Pick some number p from the array.
Move all numbers less than p to the beginning of the array.
Move all numbers greater than (or equal to) p to the end of the array.
Quicksort the numbers less than p.
Quicksort the numbers greater than or equal to p.

[ numbers less than p | p | numbers greater than or equal to p ]


Partitioning (Quicksort II)

A key step in the quicksort algorithm is partitioning
the array.
We choose some (any) number p in the array to use as
a pivot.
We partition the array into three parts:

[ numbers less than p | p | numbers greater than or equal to p ]



Algorithm
QUICKSORT(A, p, r)
1. if p < r
2.   then q ← PARTITION(A, p, r)
3.        QUICKSORT(A, p, q−1)
4.        QUICKSORT(A, q+1, r)


Partitioning the array

PARTITION(A, p, r)
1. x ← A[r]
2. i ← p − 1
3. for j = p to r − 1
4.   if A[j] ≤ x
5.     then i ← i + 1
6.          exchange A[i] ↔ A[j]
7. exchange A[i+1] ↔ A[r]
8. return i+1
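The QUICKSORT and PARTITION pseudocode above can be sketched in C as follows (0-based indices; the last element is the pivot, as in the pseudocode).

```c
/* Partition a[p..r] around the pivot a[r]; returns the pivot's
 * final position. Matches the PARTITION pseudocode. */
static int partition(int a[], int p, int r)
{
    int x = a[r], i = p - 1;
    for (int j = p; j < r; j++) {
        if (a[j] <= x) {
            i++;
            int t = a[i]; a[i] = a[j]; a[j] = t;
        }
    }
    int t = a[i + 1]; a[i + 1] = a[r]; a[r] = t;
    return i + 1;
}

/* Recursively sort the two sublists on either side of the pivot. */
void quick_sort(int a[], int p, int r)
{
    if (p < r) {
        int q = partition(a, p, r);
        quick_sort(a, p, q - 1);
        quick_sort(a, q + 1, r);
    }
}
```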


Complexity of Quick Sort Algorithm

• The worst case occurs when the list is already sorted. Then
the first element requires n comparisons to recognize that it
remains in the first position.
• Furthermore, the first sublist is empty, while the second
sublist has n-1 elements. Accordingly, the second element
requires n-1 comparisons to recognize that it remains in the
second position, and so on.

f(n) = (n-1) + (n-2) + … + 2 + 1 = n(n-1)/2 = O(n²)


Insertion Sort
• It is a simple sorting algorithm that builds the final
sorted array (or list) one item at a time.
• This algorithm is less efficient on large lists than more
advanced algorithms such as quick sort, heap sort, or
merge sort.
• However, insertion sort provides several advantages:
• Simple implementation.
• Efficient for small data sets.
• Stable; i.e., it does not change the relative order of
elements with equal keys.
• In-place; i.e., it only requires a constant amount O(1) of
additional memory space.


Insertion Sort
1. We start by making the second element of the given array,
i.e. the element at index 1, the key. The key element here is
the new card that we need to add to our existing sorted set
of cards (the playing-cards analogy).
2. We compare the key element with the element(s) before
it, in this case the element at index 0:
If the key element is less than the first element, we insert
the key element before the first element.
If the key element is greater than the first element, we insert it
after the first element.
3. Then we make the third element of the array the key,
compare it with the elements to its left, and insert it at
the right position.
4. We go on repeating this until the array is sorted.
46 Dr. Sunil Kumar, CSE Dept., MIET Meerut


Algorithm
Algorithm: Insertion Sort (A, N)
Step 1: Repeat Steps 2 to 5 for K = 1 to N-1
Step 2:    SET TEMP = A[K]
Step 3:    SET J = K - 1
Step 4:    Repeat while J >= 0 and TEMP < A[J]
           (a) SET A[J + 1] = A[J]
           (b) SET J = J - 1
Step 5:    SET A[J + 1] = TEMP
Step 6: EXIT
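The steps above map directly to C; this sketch keeps the same TEMP/J structure, shifting larger elements one slot right until the key's position is found.

```c
/* Insertion sort: a[0..k-1] is already sorted; insert a[k]
 * (the key, TEMP) into its correct place within it. */
void insertion_sort(int a[], int n)
{
    for (int k = 1; k < n; k++) {
        int temp = a[k];
        int j = k - 1;
        while (j >= 0 && a[j] > temp) {  /* shift larger elements right */
            a[j + 1] = a[j];
            j--;
        }
        a[j + 1] = temp;
    }
}
```

Using strict `>` (rather than `>=`) keeps equal keys in their original relative order, which is what makes the sort stable.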


Complexity of Insertion Sort

• The number f(n) of comparisons in the insertion sort
algorithm can easily be computed. The worst case occurs
when the array A is in reverse order and the inner loop
must use the maximum number K-1 of comparisons. Hence

f(n) = (n-1) + (n-2) + … + 2 + 1 = n(n-1)/2 = O(n²)

• Furthermore, one can show that, on average, there will
be approximately (K-1)/2 comparisons in the inner loop.
Accordingly, for the average case, f(n) = O(n²).
• Thus the insertion sort algorithm is a very slow algorithm
when n is very large.

Merge Sort
Merge sort is also based on the rule of Divide and
Conquer, but it does not partition the list around
a pivot.
In merge sort, the unsorted list is divided into
N sub-lists, each having one element, because
a list of one element is considered sorted.
It then repeatedly merges these sub-lists to
produce new sorted sub-lists, until at last one
sorted list is produced.


Merge Sort…
• DIVIDE: Divide the unsorted list into two sub-lists
of about half the size.
• CONQUER: Sort each of the two sub-lists
recursively. If they are small enough, just solve
them in a straightforward manner.
• COMBINE: Merge the two sorted sub-lists back
into one sorted list.

Merge Sort...
Merge sort is quite fast, and has a time
complexity of O(n log n).
It is also a stable sort, which means that
equal elements remain in the same relative
order in the sorted list.


Algorithm
ALGO: Merge-Sort(A, p, r)
1. if p < r                       // check for base case
2.   then q ← ⌊(p + r)/2⌋         // divide step
3.        Merge-Sort(A, p, q)     // conquer step
4.        Merge-Sort(A, q + 1, r) // conquer step
5.        Merge(A, p, q, r)       // combine step


Algorithm...
MERGE (A, p, q, r)
1.  n1 ← q − p + 1
2.  n2 ← r − q
3.  create arrays L[1 .. n1 + 1] and R[1 .. n2 + 1]
4.  for i ← 1 to n1
5.    do L[i] ← A[p + i − 1]
6.  for j ← 1 to n2
7.    do R[j] ← A[q + j]
8.  L[n1 + 1] ← ∞
9.  R[n2 + 1] ← ∞
10. i ← 1
11. j ← 1
12. for k ← p to r
13.   do if L[i] ≤ R[j]
14.     then A[k] ← L[i]
15.          i ← i + 1
16.     else A[k] ← R[j]
17.          j ← j + 1
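The pseudocode above can be sketched in C. One deliberate deviation: instead of the ∞ sentinels, this sketch uses index bounds checks, so it works for any int values; the fixed 64-element temporaries are an assumption for small demo arrays.

```c
#include <string.h>

/* Merge the sorted runs a[p..q] and a[q+1..r] back into a[p..r].
 * Bounds checks on i and j replace the infinity sentinels. */
static void merge(int a[], int p, int q, int r)
{
    int n1 = q - p + 1, n2 = r - q;
    int L[64], R[64];                       /* assumes n1, n2 <= 64 */
    memcpy(L, a + p, n1 * sizeof(int));
    memcpy(R, a + q + 1, n2 * sizeof(int));
    int i = 0, j = 0;
    for (int k = p; k <= r; k++) {
        if (j >= n2 || (i < n1 && L[i] <= R[j]))
            a[k] = L[i++];                  /* left run exhausted-safe */
        else
            a[k] = R[j++];
    }
}

void merge_sort(int a[], int p, int r)
{
    if (p < r) {
        int q = (p + r) / 2;
        merge_sort(a, p, q);      /* conquer left half  */
        merge_sort(a, q + 1, r);  /* conquer right half */
        merge(a, p, q, r);        /* combine            */
    }
}
```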


MERGE SORT EXAMPLE:

Merging the two sorted halves A[p..q] = 1 5 7 8 and A[q+1..r] = 2 4 6 9:

L:  1  5  7  8  ∞        R:  2  4  6  9  ∞
    i=1                      j=1

At each step k the smaller of L[i] and R[j] is copied into A[k], and the
corresponding index is advanced. Step by step, A[p..r] fills as:

K:  1
K:  1  2
K:  1  2  4
K:  1  2  4  5
K:  1  2  4  5  6
K:  1  2  4  5  6  7
K:  1  2  4  5  6  7  8
K:  1  2  4  5  6  7  8  9

The merged result is the sorted run 1 2 4 5 6 7 8 9.

Complexity of Merge Sort Algorithm

Let f(n) denote the number of comparisons needed to sort an
n-element array A using the merge sort algorithm. The algorithm
requires at most log n passes. Each pass merges a total of n
elements, and each pass requires at most n comparisons.
Thus, for both the worst and the average case,

f(n) ≤ n log n

Thus the time complexity of merge sort is O(n log n) in all
3 cases (worst, average and best), as merge sort always
divides the array into two halves and takes linear time to
merge the two halves.

Heap Sort
Heap Sort is one of the best sorting methods being
in-place and with no quadratic worst-case
scenarios.
Heap sort algorithm is divided into two basic
parts:
Creating a Heap of the unsorted list.
Then a sorted array is created by repeatedly
removing the largest/smallest element from the
heap, and inserting it into the array. The heap
is reconstructed after each removal.


Types of Heap

Max Heap
Min Heap


1- Max Heap
Max-heap definition:
• A complete binary tree in which the value in
each internal node is greater than or equal to
the values in the children of that node.

Max-heap property:
The key of a node is ≥ the keys of its children.


2- Min Heap
Min-heap definition:
• A complete binary tree in which the value in
each internal node is less than or equal to the
values in the children of that node.

Min-heap property:
The key of a node is ≤ the keys of its children.



Algorithm
Convert an array A[1 … n] into a max-heap (n = length[A]).
The elements in the subarray A[(⌊n/2⌋+1) .. n] are leaves.
Apply MAX-HEAPIFY on elements between 1 and ⌊n/2⌋.

Algo: BUILD-MAX-HEAP(A)
1. n = length[A]
2. for i ← ⌊n/2⌋ down to 1
3.   do MAX-HEAPIFY(A, i, n)

Example array (shown on the slide as a binary tree with root 4):

A: 4 1 3 2 16 9 10 14 8 7


Maintaining the Heap Property

Assumptions:
The left and right subtrees of i are max-heaps.
A[i] may be smaller than its children.

Algo: MAX-HEAPIFY(A, i, n)
1.  l ← LEFT(i)
2.  r ← RIGHT(i)
3.  if l ≤ n and A[l] > A[i]
4.    then largest ← l
5.    else largest ← i
6.  if r ≤ n and A[r] > A[largest]
7.    then largest ← r
8.  if largest ≠ i
9.    then exchange A[i] ↔ A[largest]
10.        MAX-HEAPIFY(A, largest, n)


HEAP-EXTRACT-MAX
Algo: HEAP-EXTRACT-MAX(A, n)
1. if n < 1
2.   then error “heap underflow”
3. max ← A[1]
4. A[1] ← A[n]
5. MAX-HEAPIFY(A, 1, n-1)
6. return max
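Putting the routines above together gives heap sort. This C sketch uses 0-based indices, so LEFT(i) = 2i+1 and RIGHT(i) = 2i+2 instead of the 1-based 2i and 2i+1.

```c
/* Sift a[i] down until the max-heap property holds in a[0..n-1]. */
static void max_heapify(int a[], int i, int n)
{
    int l = 2 * i + 1, r = 2 * i + 2, largest = i;
    if (l < n && a[l] > a[largest]) largest = l;
    if (r < n && a[r] > a[largest]) largest = r;
    if (largest != i) {
        int t = a[i]; a[i] = a[largest]; a[largest] = t;
        max_heapify(a, largest, n);
    }
}

void heap_sort(int a[], int n)
{
    for (int i = n / 2 - 1; i >= 0; i--)   /* BUILD-MAX-HEAP */
        max_heapify(a, i, n);
    for (int i = n - 1; i > 0; i--) {      /* repeatedly extract the max: */
        int t = a[0]; a[0] = a[i]; a[i] = t; /* move root to the end */
        max_heapify(a, 0, i);              /* reheap the remaining i items */
    }
}
```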


Complexity of Heap Sort Algorithm

The algorithm has two phases, and we analyze the complexity
of each phase separately.
Phase 1. Since H is a complete tree, its depth is bounded by
log₂ m, where m is the number of elements in H. Accordingly,
the total number g(n) of comparisons to insert the n elements
of A into H is bounded as g(n) ≤ n log₂ n.
Phase 2. Reheaping uses 4 comparisons to move the node L one
step down the tree H. Since the depth cannot exceed log₂ m, it
uses at most 4 log₂ m comparisons to find the appropriate place
of L in the tree, so h(n) ≤ 4n log₂ n.
Thus each phase requires time proportional to n log₂ n, and the
running time to sort the n-element array A is O(n log₂ n).


Radix Sort
A multiple-pass distribution sort algorithm that
distributes each item to a bucket according to part of
the item's key, beginning with the least significant
part of the key.
After each pass, items are collected from the buckets
in order, then redistributed according to the next
most significant part of the key, and so on.


Radix Sort
The idea is to consider the key one character at a
time and to divide the entries, not into two sub
lists, but into as many sub-lists as there are
possibilities for the given character from the key.
If our keys, for example, are words or other
alphabetic strings, then we divide the list into 26
sub-lists at each stage.
That is, we set up a table of 26 lists and distribute
the entries into the lists according to one of the
characters in the key.


Example
Sort the numbers 348, 143, 361, 423, 538, 128, 321, 543, 366.

Pass 1 (distribute on the units digit):
bucket 1: 361, 321
bucket 3: 143, 423, 543
bucket 6: 366
bucket 8: 348, 538, 128
Collected: 361, 321, 143, 423, 543, 366, 348, 538, 128

Pass 2 (distribute on the tens digit):
bucket 2: 321, 423, 128
bucket 3: 538
bucket 4: 143, 543, 348
bucket 6: 361, 366
Collected: 321, 423, 128, 538, 143, 543, 348, 361, 366

Pass 3 (distribute on the hundreds digit):
bucket 1: 128, 143
bucket 3: 321, 348, 361, 366
bucket 4: 423
bucket 5: 538, 543
Collected (sorted): 128, 143, 321, 348, 361, 366, 423, 538, 543

Example:
Sort the numbers 551, 12, 346, 311 using Radix sort.


Algorithm
Step 1 - Define 10 queues, each representing a bucket for
each digit from 0 to 9.
Step 2 - Consider the least significant digit of each number
in the list which is to be sorted.
Step 3 - Insert each number into its respective queue
based on that digit.
Step 4 - Group all the numbers from queue 0 to queue 9 in
the order they were inserted into their respective queues.
Step 5 - Repeat from Step 3 based on the next least
significant digit.
Step 6 - Repeat from Step 2 until all the numbers are
grouped based on the most significant digit.
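The six steps above can be sketched in C for non-negative integers; the 64-entry per-bucket capacity is an assumption for small demo lists (the queues are modeled as arrays).

```c
/* LSD radix sort (radix 10) following the 10-queue algorithm:
 * distribute on each digit, then collect buckets 0..9 in order. */
void radix_sort(int a[], int n)
{
    int max = a[0];
    for (int i = 1; i < n; i++)          /* largest value bounds the passes */
        if (a[i] > max) max = a[i];

    for (int exp = 1; max / exp > 0; exp *= 10) {
        int bucket[10][64];              /* one queue per digit; demo size */
        int count[10] = {0};
        for (int i = 0; i < n; i++) {    /* Step 3: distribute */
            int d = (a[i] / exp) % 10;
            bucket[d][count[d]++] = a[i];
        }
        int k = 0;                       /* Step 4: collect in order */
        for (int d = 0; d < 10; d++)
            for (int i = 0; i < count[d]; i++)
                a[k++] = bucket[d][i];
    }
}
```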


Complexity of Radix Sort

A list A of n elements A1, A2, ..., An is given. Let d denote
the radix (e.g. d = 10 for decimal digits, d = 26 for letters,
d = 2 for bits), and suppose each item Ai is represented by
means of s digits:

Ai = di1 di2 … dis

Radix sort requires s passes, the number of digits in each
item. Pass K compares each digit with each of the d digits.
Hence C(n) ≤ d·s·n.


Table of Contents
• Concept of Hashing
• Collision Resolution Techniques used in Hashing:
  – Open Hashing (Closed Addressing): Separate Chaining
  – Closed Hashing (Open Addressing): Linear Probing, Quadratic Probing, Double Hashing

Introduction to Hashing
• Suppose that we want to store 10,000 student records (each with a 5-digit
ID) in a given container.
  A linked list implementation would take O(n) access time.
  A height-balanced tree would take O(log n) access time.
  Using an array of size 100,000 would take O(1) access time but would
  lead to a lot of space wastage.
• Is there some way that we could get O(1) access time without wasting a lot
of space?
• The answer is Hashing.


Introduction to Hashing...

Hashing is a technique used for performing insertions, deletions and finds in
constant average time O(1). The technique employed is to compute the location
of the desired record so that it can be retrieved in a single access or comparison.
This data structure, however, is not efficient for operations that require any
ordering information among the elements, such as findMin, findMax and
printing the entire table in sorted order.

Applications:
• Database systems
• Symbol tables for compilers
• Data dictionaries
• Browser caches

Hash Table

The ideal hash table structure is an array of some fixed size,
containing the items. A stored item needs to have a data member,
called the key, that is used in computing the index value for the item.
• The key could be an integer, a string, etc.,
  e.g. a name or ID that is part of a large employee structure.
The size of the array is TableSize. The items stored in the hash table
are indexed by values from 0 to TableSize – 1.
Each key is mapped into some number in the range 0 to TableSize – 1.
The mapping is called a hash function.

Hash Functions

• A hash function h transforms a key from a set K
into an index in a table of size n:
  h: K -> {0, 1, ..., n-2, n-1}
• A key can be a number, a string, a record, etc.
• The size of the set of keys, |K|, is assumed to be relatively very large.
• It is possible for different keys to hash to the same array location. This
situation is called a collision, and the colliding keys are called synonyms.
• A common hash function is:
  h(x) = x mod SIZE
• Example: if key = 27 and SIZE = 10, then
  hash address = 27 % 10 = 7.

• A good hash function should:
· Minimize collisions.
· Be easy and quick to compute.
· Distribute key values evenly in the hash table.
· Use all the information provided in the key.

Collision Resolution
• If, when an element is inserted, it hashes to the same value as an already
inserted element, then we have a collision and need to resolve it.
That is, for two keys k1 and k2:
  H(k1) = H(k2) = β
• There are several methods for dealing with this:
  – Open Hashing (Closed Addressing): Separate Chaining
  – Closed Hashing (Open Addressing):
    • Linear Probing
    • Quadratic Probing
    • Double Hashing

Separate Chaining

• The idea is to keep a list of all elements that hash to the same value.
  – The array elements are pointers to the first nodes of the lists.
  – A new item is inserted at the front of its list.
• Advantages:
  – Better space utilization for large items.
  – Simple collision handling: searching a linked list.
  – Overflow: we can store more items than the hash table size.
  – Deletion is quick and easy: deletion from a linked list.

Example
Keys: 0, 1, 4, 9, 16, 25, 36, 49, 64, 81, with hash(key) = key % 10:

0: 0
1: 81 -> 1
2:
3:
4: 64 -> 4
5: 25
6: 36 -> 16
7:
8:
9: 49 -> 9

Exercise: Represent the keys {89, 18, 49, 58, 69, 78} in a hash table
using separate chaining, with hash(key) = key % 10.
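A separate-chaining table can be sketched in C as follows. This is an illustrative sketch: the names `insert`, `search` and the global table are assumptions, and error handling for `malloc` is omitted for brevity.

```c
#include <stdlib.h>

#define TABLE_SIZE 10

/* Each slot heads a singly linked list of synonyms
 * (keys that hash to the same index). */
struct node { int key; struct node *next; };
static struct node *table[TABLE_SIZE];   /* zero-initialized: empty chains */

static int hash(int key) { return key % TABLE_SIZE; }

void insert(int key)
{
    int h = hash(key);
    struct node *p = malloc(sizeof *p);
    p->key = key;
    p->next = table[h];    /* new item goes to the front of the list */
    table[h] = p;
}

int search(int key)        /* returns 1 if present, 0 otherwise */
{
    for (struct node *p = table[hash(key)]; p != NULL; p = p->next)
        if (p->key == key)
            return 1;
    return 0;
}
```

Inserting 1 and then 81 reproduces the chain `81 -> 1` in slot 1 of the example above.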


Example: Load the keys {23, 13, 21, 14, 7, 8, and 15} in this order, in a hash table
of size 7 using separate chaining with the hash function: h(key)=key% 7.


Collision Resolution with Open Addressing

• Separate chaining has the disadvantage of using linked lists.
  – It requires the implementation of a second data structure.
• In an open addressing hashing system, all the data go inside
the table.
  – Thus, a bigger table is needed.
  – If a collision occurs, alternative cells are tried until an empty cell
  is found.

Open Addressing

• In open addressing, there are three common collision
resolution strategies:
  – Linear Probing
  – Quadratic Probing
  – Double Hashing

Linear Probing

– Searches the hash table sequentially, starting from the
original location specified by the hash function.
– Possible problem: primary clustering.

If a collision occurs at j = h(k), we try next at A[(j+1) mod m], then
A[(j+2) mod m], and so on; when an empty position is found, the item is inserted.
Linear probing is easy to implement, but leads to clustering (long runs of occupied slots).
Clusters can lead to poor performance, both for inserting and for finding keys.

Linear Probing Example

• h(k, i) = (h(k) + i) mod m (i is the probe number; initially i = 0),
with h(k) = k mod 13 and m = 13.
• Insert keys: 18, 41, 22, 44, 59, 32, 31, 73 (in that order).
How many collisions occur in this case?

slot:   0   1   2   3   4   5   6   7   8   9  10  11  12
key:           41          18  44  59  32  22  31  73


Another Example: Linear Probing

Example: Insert keys {89, 18, 49, 58, 69, 78} with the hash function h(x) = x mod 10, using linear probing. Use table size 10.
Probe sequence: h(x, i) = (h(x) + i) mod m, for i = 0 to m−1.

Solution:
when x = 89: h(89) = 89 % 10 = 9; insert key 89 in the hash table at location 9.
when x = 18: h(18) = 18 % 10 = 8; insert key 18 in the hash table at location 8.
when x = 49: h(49) = 49 % 10 = 9 (collision); insert key 49 in the next possible vacant location after 9, which is 0.
when x = 58: h(58) = 58 % 10 = 8 (collision); insert key 58 in the next possible vacant location after 8, which is 1 (since 9 and 0 already contain values).
when x = 69: h(69) = 69 % 10 = 9 (collision); insert key 69 in the next possible vacant location after 9, which is 2 (since 0 and 1 already contain values).
when x = 78: h(78) = 78 % 10 = 8 (collision); search for the next vacant slot, which is 3 (since 0, 1, 2 contain values); insert 78 at location 3.

Fig. Hash table with keys, using linear probing:
0: 49, 1: 58, 2: 69, 3: 78, 8: 18, 9: 89 (slots 4–7 empty)

Quadratic Probing

• In general, searches the hash table beginning with the original location that the hash function specifies and continues at increments of 1², 2², 3², and so on.
• Possible problem: secondary clustering.

Quadratic probing is a collision resolution method that eliminates the primary clustering problem that takes place in linear probing.
Compute: hash value = h(x) = x % table size.
When a collision occurs, quadratic probing works as follows:
(hash value + 1²) % table size;
if a collision occurs again, rehash: (hash value + 2²) % table size;
if a collision occurs again, rehash: (hash value + 3²) % table size.
In general, on the ith collision:
hi(x) = (hash value + i²) % size


Example: Insert keys {89, 18, 49, 58, 69, 78} in order in a hash table of size 10, using quadratic probing. Hash function: h(x) = x mod 10.

Solution:
when x = 89: h(89) = 89 % 10 = 9; insert key 89 in the hash table at location 9.
when x = 18: h(18) = 18 % 10 = 8; insert key 18 in the hash table at location 8.
when x = 49: h(49) = 49 % 10 = 9 (collision); use h1(49) = (9 + 1²) % 10 = 0; insert key 49 at location 0.
when x = 58: h(58) = 58 % 10 = 8 (collision); h1(58) = (8 + 1²) % 10 = 9, again a collision; h2(58) = (8 + 2²) % 10 = 2; insert key 58 at location 2.
when x = 69: h(69) = 69 % 10 = 9 (collision); h1(69) = (9 + 1²) % 10 = 0, again a collision; h2(69) = (9 + 2²) % 10 = 3; insert key 69 at location 3.
when x = 78: h(78) = 78 % 10 = 8 (collision); h1(78) = (8 + 1²) % 10 = 9, again a collision; h2(78) = (8 + 2²) % 10 = 2, again a collision; h3(78) = (8 + 3²) % 10 = 7; insert key 78 at location 7.

Fig. Hash table with keys, using quadratic probing:
0: 49, 2: 58, 3: 69, 7: 78, 8: 18, 9: 89 (slots 1, 4, 5, 6 empty)
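The probe sequence h(x), h(x)+1², h(x)+2², … can be sketched in Python (an illustrative sketch, not code from the slides). Note one caveat the slides do not stress: with a non-prime table size such as 10, quadratic probing is not guaranteed to reach every slot, so the sketch gives up after m probes.

```python
def quadratic_probe_insert(table, key):
    """Insert key by probing (h + i*i) % m for i = 0, 1, 2, ...; return the slot used."""
    m = len(table)
    h = key % m
    for i in range(m):
        j = (h + i * i) % m          # i = 0 tries the home slot itself
        if table[j] is None:
            table[j] = key
            return j
    # Quadratic probing can cycle without visiting every slot, so this can happen
    # even when the table is not full.
    raise RuntimeError("no empty slot reachable")

table = [None] * 10
for k in [89, 18, 49, 58, 69, 78]:
    quadratic_probe_insert(table, k)
print(table)   # [49, None, 58, 69, None, None, None, 78, 18, 89]
```

The printed table matches the worked example: 49 at slot 0, 58 at 2, 69 at 3, 78 at 7, 18 at 8, and 89 at 9.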


Double Hashing

• Uses two hash functions.
• Searches the hash table starting from the location that one hash function determines and considers every nth location, where n is determined from a second hash function.

Double Hashing Example

• h1(K) = K mod m
• h2(K) = K mod (m − 1)
• The ith probe is h(k, i) = (h1(k) + h2(k) · i) mod m
• Insert keys: 18 41 22 44 59 32 31 73 (in that order), with m = 13.
• How many collisions occur in this case? 2

Resulting table (slots 0–12):
0: 44, 2: 41, 5: 18, 6: 32, 7: 59, 8: 73, 9: 22, 12: 31 (slots 1, 3, 4, 10, 11 empty)

44 % 13 = 5 (collision); next try: (5 + (44 % 12)) % 13 = 13 % 13 = 0
31 % 13 = 5 (collision); next try: (5 + (31 % 12)) % 13 = 12 % 13 = 12
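A Python sketch of this probe sequence (illustrative only, not code from the slides). One caution: the slide's h2(K) = K mod (m − 1) can be 0 when K is a multiple of m − 1, which would make every probe hit the same slot; that does not happen for this key set, but practical implementations use h2(K) = 1 + K mod (m − 1) to avoid it.

```python
def double_hash_insert(table, key):
    """Insert key by probing (h1 + i*h2) % m for i = 0, 1, 2, ...; return
    the number of occupied slots probed along the way."""
    m = len(table)
    h1 = key % m
    h2 = key % (m - 1)   # step size, as defined on the slide
    collisions = 0
    for i in range(m):
        j = (h1 + i * h2) % m
        if table[j] is None:
            table[j] = key
            return collisions
        collisions += 1
    raise RuntimeError("no empty slot found")

table = [None] * 13
total = sum(double_hash_insert(table, k) for k in [18, 41, 22, 44, 59, 32, 31, 73])
print(total)   # 2
```

Only 44 and 31 collide, and because their step sizes differ (44 % 12 = 8 versus 31 % 12 = 7) they jump to different slots instead of piling into one cluster as linear probing did with the same keys.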