Online Classes: Data Structures

GATE ONLINE CLASSES
ON
DATA STRUCTURES
Presented by
Chandra Sekhar Sanaboina
Assistant Professor
Department of CSE
University College of Engineering Kakinada
Jawaharlal Nehru Technological University Kakinada
GATE ONLINE CLASSES

Day – 10 Lecture Notes
on
DATA STRUCTURES
GATE ONLINE CLASSES

GRAPH REPRESENTATIONS
GATE ONLINE CLASSES

Graph Representations
 Adjacency Matrix
 Adjacency Lists
GATE ONLINE CLASSES

ADJACENCY MATRIX
GATE ONLINE CLASSES

Adjacency Matrix
 Let G=(V,E) be a graph with n vertices.
 The adjacency matrix of G is a two-dimensional n by n array, say adj_mat
 If the edge (vi, vj) is in E(G), adj_mat[i][j]=1
 If there is no such edge in E(G), adj_mat[i][j]=0
 The adjacency matrix for an undirected graph is symmetric
 The adjacency matrix for a digraph need not be symmetric
GATE ONLINE CLASSES

Examples on Adjacency Matrix
0 4
0
0 2 1 5
1 2 1 3 6
3 7
G2
G 2 0 1 1 0 0 0 0 0
1 1 0
0 1  0 0 1 0 0 0 
1 1 0 1 0 1 0 0 1 0 0 0 0
1 0 1 1    
0 1 1 0 0 0 0

0
  1 0 1  0 0 0 0 0 1 0 0
 
1 1 0 1 0 0 0 0 0 0 0 1 0 1 0
  0

0 0 0 0 1 0 1

1 1 1 0 Symmetric 
0 0 0 0 0 0 1 0

GATE ONLINE CLASSES
G4
Merits of Adjacency Matrix
 From the adjacency matrix, to determine the connection of vertices is

n 1
easy
 adj _ mat[i][ j ]
j 0
 For undirected graph, the degree of a vertex is sum of columns
 For a digraph, the row sum is the out-degree, while the column sum
is the in-degree n 1
n 1 outd ( vi )   A[i , j ]
ind (vi )   A[ j , i ] j 0
j 0
GATE ONLINE CLASSES

FEATURES OF ADJACENCY MATRIX
 Storage complexity: O(|V|2)
 Undirected graph: symmetric along main diagonal

 AT transpose of A
 Undirected: A=AT
 Directed Graph:
 In-degree of X: Sum along column X O(|V|)
 Out-degree of X: Sum along row X O(|V|)
 Very simple, good for small graphs

GATE ONLINE CLASSES
DEMERITS OF ADJACENCY MATRIX
 Many graphs in practical problems are sparse

 Not many edges --- not all pairs x,y have edge xy
 Matrix representation demands too much memory
 We want to reduce memory footprint

 Use sparse matrix techniques
GATE ONLINE CLASSES

ADJACENCY LISTS
GATE ONLINE CLASSES

Adjacency List
0 0 4
2 1 5
3 6
1 2 3
7
0 1 2 3
1 0 1 2
0 2
1 0 3
2 0 1 3
2 0 3
3 0 2 0
G1 3 1 2
0 1 4 5
1 0 2 1 5 4 6
2 6 5 7
G3 7 6 G4
2
GATE ONLINE CLASSES
An undirected graph with n vertices and e edges ==> n head nodes and 2e list nodes
INTERESTING OPERATIONS
degreeof a vertex in an undirected graph

–# of nodes in adjacency list
out-degreeof a vertex in a directed graph

–# of nodes in its adjacency list
in-degreeof a vertex in a directed graph

–traverse the whole data structure
GATE ONLINE CLASSES

FEATURES OF ADJACENCY LIST FEATURES
 Storage Complexity:
 O(|V| + |E|)
 In undirected graph: O(|V|+2*|E|) = O(|V|+|E|)
 Degree of node X:
 Out degree: Length of Adj[X] O(|V|) calculation
 In degree: Check all Adj[] lists O(|V|+|E|)
GATE ONLINE CLASSES

GRAPH TRAVERSALS
GATE ONLINE CLASSES

Graph Traversal / Grah Searching Techniques
 A traversal (search):
 An algorithm for systematically exploring a graph
 Visiting (all) vertices
 Until finding a goal vertex or until no more vertices
 Two types of Graph Traversal Techniques

– Depth First Search (DFS)
preorder tree traversal
– Breadth First Search (BFS)
level order tree traversal
GATE ONLINE CLASSES

BREADTH FIRST SEARCH
GATE ONLINE CLASSES

BREADTH-FIRST SEARCH
 One of the simplest algorithms
 Also one of the most important

 It forms the basis for MANY graph algorithms
GATE ONLINE CLASSES

BFS: LEVEL-BY-LEVEL TRAVERSAL
 Given a starting vertex s
 Visit all vertices at increasing distance from s

 Visit all vertices at distance k from s
 Then visit all vertices at distance k+1 from s
 Then ….
GATE ONLINE CLASSES

APPLICATIONS OF BREADTH FIRST TRAVERSAL
 Shortest Path and Minimum Spanning Tree for unweighted graph
 Peer to Peer Networks
 Crawlers in Search Engines
 Social Networking Websites
 GPS Navigation systems
 Broadcasting in Network
 In Garbage Collection
 Cycle detection in undirected graph
 To test if a graph is Bipartite
 Path Finding
 Finding all nodes within one connected component
 Prim’s Minimum Spanning Tree
 Dijkstra’s Single Source Shortest Path
GATE ONLINE CLASSES

BFS IN A BINARY TREE
2 8
1 3 6 10
5 2 8 1 3 6 10 7 9
7 9
GATE ONLINE CLASSES
ALGORITHM FOR BFS USING QUEUES
1. unmark all vertices in G

2. q  new queue
3. mark s
4. enqueue(q, s)
5. while (not empty(q))
6. curr  dequeue(q)
7. visit curr // e.g., print its data
8. for each edge <curr, V>
9. if V is unmarked
10. mark V
11. enqueue(q, V)
GATE ONLINE CLASSES
BFS Using Queues
A
B E
G C D
Queue: A F
Start with A. Put in the queue (marked red)
GATE ONLINE CLASSES

BFS Using Queues
A
B E
G C D
Queue: A B E
F
When we go to A, we put B and E in the queue
GATE ONLINE CLASSES

BFS Using Queues
A
B E
G C D
Queue: A B E C G
F
When we go to B, we put G and C in the queue
GATE ONLINE CLASSES

BFS Using Queues
A
B E
G C D
Queue: A B E C G D F
F
When we go to E, we put D and F in the queue
GATE ONLINE CLASSES

BFS Using Queues
A
B E
G C D
Queue: A B E C G D F
F
GATE ONLINE CLASSES

BFS Using Queues
A
B E
G C D
Queue: A B E C G D F F
GATE ONLINE CLASSES

BFS Using Queues
A
B E
G C D
GATE ONLINE CLASSES

BFS Using Queues
A
B E
G C D
GATE ONLINE CLASSES

FEATURES OF BFS
 Complexity: O(|V| + |E|)

 All vertices put on queue exactly once
 For each vertex on queue, we expand its edges
 In other words, we traverse all edges once
 BFS finds shortest path from s to each vertex

 Shortest in terms of number of edges
GATE ONLINE CLASSES

DEPTH FIRST SEARCH
GATE ONLINE CLASSES

DEPTH-FIRST SEARCH
 Again, a simple and powerful algorithm

 Given a starting vertex s
 Pick an adjacent vertex, visit it.
 Then visit one of its adjacent vertices
 …..
 Until impossible, then backtrack, visit another
GATE ONLINE CLASSES

APPLICATIONS OF DEPTH FIRST SEARCH
 Minimum spanning tree for undirected graphs

 Detecting cycle in a graph
 Path Finding
 Topological Sorting
 To test if a graph is bipartite
 Finding Strongly Connected Components of a graph
 Solving puzzles with only one solution
GATE ONLINE CLASSES

DFS Using Stacks
A
B E
G C D
Stack: A  Start with A.

Output: Nil  Mark A and Push A on to stack F
Next Step:
 Pop A and output
 Push the Adjacent Vertices of A (B, E) into Stack
GATE ONLINE CLASSES

DFS Using Stacks
A
B E
G C D
Stack: B E
Output: A Next Step: F
 Pick E (Top)
 Pop E and place it in output by pushing adjacent vertices of E (D, F) on
to the stack
GATE ONLINE CLASSES

DFS Using Stacks
A
B E
G C D
Stack: B D F
Output: AE Next Step: F
 Pick F (Top)
 Pop F and place it in output by pushing the unvisited adjacent vertices
of F (NIL) on to the stack
GATE ONLINE CLASSES

DFS Using Stacks
A
B E
G C D
Stack: B D
Output: AEF Next Step: F
 Pick D (Top)
 Pop D and place it in output by pushing the unvisited adjacent vertices
of D (NIL) on to the stack
GATE ONLINE CLASSES

DFS Using Stacks
A
B E
G C D
Stack: B
Output: AEFD Next Step: F
 Pick B (Top)
 Pop B and place it in output by pushing the unvisited adjacent vertices
of B (C, G) on to the stack
GATE ONLINE CLASSES

DFS Using Stacks
A
B E
G C D
Stack: CG
Output: AEFDB Next Step: F
 Pick G (Top)
 Pop G and place it in output by pushing the unvisited adjacent vertices
of G (NIL) on to the stack
GATE ONLINE CLASSES

DFS Using Stacks
A
B E
G C D
Stack: C
Output: AEFDBG Next Step:
F
 Pick C (Top)
 Pop C and place it in output by pushing the unvisited adjacent vertices
of C (NIL) on to the stack
GATE ONLINE CLASSES

DFS Using Stacks
A
B E
G C D
Stack: EMPTY
Output: AEFDBGC
F
Next Step:
 Stack Empty -> No elements to process -> Final Outpu
GATE ONLINE CLASSES

INTERESTING FEATURES OF DFS
 Complexity: O(|V| + |E|
GATE ONLINE CLASSES

DFS and BFS of a graph from given Linked List
Depth First Search: v0, v1, v3, v7, v4, v5, v2, v6
Breadth First Search: v0, v1, v2, v3, v4, v5, v6, v7
GATE ONLINE CLASSES

PART – 9
HASHING
 Agenda:
 Introduction to Hashing
 Steps Involved in Hashing
 Components of Hashing
 Hash Functions
 Requirements for Good Hash Functions
 Types of Hash Functions
 Need for Good Hash Function
 Hash Tables
 Collisions
 Collision Resolution Techniques
 Linear Probing
 Quadratic Probing
 Separate Chaining
GATE ONLINE CLASSES

Hashing
GATE ONLINE CLASSES

INTRODUCTION TO HASHING
 Hashing
 Hashing is a technique used for storing and retrieving information as quickly as
possible.
 Used to perform Optimal Searches
 Hashing is a technique that is used to uniquely identify a specific object from a

group of similar objects
 E.g: Student Registration Number in Universities, Books in Libraries etc.,
 An object is assigned a key to it to make searching easy
 To store the key/value pair, you can use a simple array like a data structure where
keys (integers) can be used directly as an index to store values
 Keys are large and cannot be used directly as an index you should use hashing
GATE ONLINE CLASSES
Steps Involved in Hasing
GATE ONLINE CLASSES

STEPS INVOLVED IN HASHING
 Hashing is implemented in two steps:
 Step 1:
 An element is converted into an integer by using a hash function. This integer value
can be used to calculate the index which is used to store the original element
 hash = hashfunc(key)
 Step 2:
 The element is stored in the hash table where it can be quickly retrieved using
hashed key
 index = hash % array_size
GATE ONLINE CLASSES

Components of Hashing
GATE ONLINE CLASSES

COMPONENTS OF HASHING
 Hashing has four key components

 Hash Functions
 Hash Table
 Collisions
 Collision Resolution Techniques
GATE ONLINE CLASSES

Hash Functions
GATE ONLINE CLASSES

HASH FUNCTION
 Hash Function
 The hash function is used to transform the key into the index
 A hash function should map each possible key to a unique slot index (Which
is impossible in real practice)
 The values returned by a hash function are called hash values, hash codes,
hash sums, or simply hashes
GATE ONLINE CLASSES

Requirements for Good Hash
Function
GATE ONLINE CLASSES

REQUIREMENTS OF GOOD HASH FUNCTION
 Basic Requirements of Good Hash Function:
 Easy to compute:
 It should be easy to compute and must not become an algorithm in itself
 Uniform distribution:
 It should provide a uniform distribution across the hash table and should not result in
clustering
 Less collisions:
 Collisions occur when pairs of elements are mapped to the same hash value. These
should be avoided
GATE ONLINE CLASSES

Types of Hash Functions
GATE ONLINE CLASSES

TYPES OF HASH FUNCTIONS
 Division –
 easiest method to create a hash function
 the first order of business for a hash function is to compute an integer value
 if we expect the hash function to produce a valid index for our chosen table
size, that integer will probably be out of range and that is easily remedied by
modding the integer by the table size
 h(k) = k mod n
 it is better if the table size is a prime, or at least has no small prime factors as
that makes sure the keys are distributed with more uniformity
 k=1501 n=10 h(1501) = 1501 mod 10 = 1
 Disadvantage:
 Consecutive keys map to consecutive hash values – which leads to poor performance
GATE ONLINE CLASSES

TYPES OF HASH FUNCTIONS CONTD…
 Folding –
 begins by dividing the item into equal-size pieces (the last piece
may not be of equal size). These pieces are then added together to
give the resulting hash value
 portions of the key are often recombined, or folded together

 shift folding: 123-45-6789 -> 123 + 456 + 789
 boundary folding: 123-45-6789 -> 123 + 654 + 789
 can be efficiently performed using bitwise operations
GATE ONLINE CLASSES

 Mid-Square Function –
 square the key, then use the middle part as the result
 e.g., 3121 -> 9740641 -> 406 (with a table size of 1000)
 a string would first be transformed into a number using ASCII values
GATE ONLINE CLASSES

 Extraction –
 use only part of the key to compute the result
 The ISBN starting digits are the same for a publisher, so they should be exclude
if the hash table is for only one publisher.
GATE ONLINE CLASSES

 Radix Transformation –
 change the base-of-representation of the numeric key, mod by table size
 Example:
 Key = 345, change to base 9 = 423 % TSize
GATE ONLINE CLASSES

Need for Good Hash Function
GATE ONLINE CLASSES

NEED FOR GOOD HASH FUNCTION
 Assume that you have to store strings in the hash table by using the hashing
technique {“abcdef”, “bcdefa”, “cdefab” , “defabc” }
 Hash Function1: h1 = The index for a specific string will be equal to the sum of
the ASCII values of the characters modulo 599
 The hash function will compute the same index for all the strings
GATE ONLINE CLASSES

 Hash Function2: h2 = The index for a specific string will be equal to sum of ASCII
values of characters multiplied by their respective order in the string after which it
is modulo with 2069 (prime number)
GATE ONLINE CLASSES

 String Hash function Index

abcdef (971 + 982 + 993 + 1004 + 1015 + 1026)%2069 38
bcdefa (981 + 992 + 1003 + 1014 + 1025 + 976)%2069 23
cdefab (991 + 1002 + 1013 + 1024 + 975 + 986)%2069 14
defabc (1001 + 1012 + 1023 + 974 + 985 + 996)%2069 11
GATE ONLINE CLASSES

Hash Tables
GATE ONLINE CLASSES

HASH TABLES
 Hash table
 A hash table is a data structure that is used to store keys/value pairs
 It uses a hash function to compute an index into an array in which an

element will be inserted or searched
GATE ONLINE CLASSES

HASH TABLES
 Provides virtually direct access to objects based on a key (a unique String or

Integer)
 key could be your SID, your telephone number, social security number,
account number, …
 Must have unique keys
 Each key is associated with–mapped to–a value
GATE ONLINE CLASSES

HASH TABLES
 Load Factor of the Hash Table:
 It is the denoted by the symbol λ
 λ = (number of items in the table) / tablesize
 Example: Assume that the tablesize is 10 and it consists of 6 items,

then the load factor of the table is λ = 6/10 = 0.6
GATE ONLINE CLASSES

Collisons
GATE ONLINE CLASSES

COLLISIONS
 A good hash method
 executes quickly
 distributes keys equitably
 Has less collisions
GATE ONLINE CLASSES

COLLISIONS CONTD
 But you still have to handle collisions when two keys have the same hash value
 the hash method is not guaranteed to return a unique integer for each key
 example: simple hash method with "baab" and "abba“
 There are several ways to handle collisions

 Linear Probing
 Separate Chaining
GATE ONLINE CLASSES

Collision Handling Mechanisms
Or
Collision Resolution Techniques
GATE ONLINE CLASSES

COLLISION RESOLUTION
 It is the systematic method for placing the collided item in the

hash table
 Two methods of Collision Resolution

 Open Addressing
 Linear Probing
 Chaining
GATE ONLINE CLASSES

Linear Probing
GATE ONLINE CLASSES

LINEAR PROBING
 Linear Probing:
 search sequentially for an unoccupied position
 uses a wraparound (circular) array
GATE ONLINE CLASSES

A HASH TABLE AFTER THREE INSERTIONS
0
insert objects with ... Keys
these three keys:
80
"abba" 81 "abba"
"abcd"
"abce" 82
83
84
85 "abcd"
86 "abce"
...
308 GATE ONLINE CLASSES
COLLISION OCCURS WHILE INSERTING “baab"
0
can't insert "baab" ...
where it hashes to "baab"
same slot as "abba" 80 Try [81]
81 "abba" Put in [82]
82 "baab"
Linear probe 83
forward by 1,
inserting it
84
at the next 85 "abcd"
available slot
86 "abce"
...
308 GATE ONLINE CLASSES
WRAP AROUND WHEN COLLISION OCCURS AT END
0 "IKLT"
Insert "KLMP" and
"IKLT" ...
both of which havee a 80
hash value of 308
81 "abba"
82 "baab"
83
84
85 "abcd"
86 "abce"
...
308 "KLMP"
GATE ONLINE CLASSES
FIND OBJECT WITH KEY “baab"
0 "IKLT"
...
"baab" still hashes to 80
81, but since [81] is
occupied, linear 81 "abba"
probe to [82] "baab"
82
At this point, you
could return a
83
reference or remove 84
baab
85 "abcd"
86 "abce"
...
308 "KLMP"
GATE ONLINE CLASSES
LINEAR PROBING HAS CLUSTERING PROBLEM
 Used slots tend to cluster with linear probing
Black areas represent slots in use; white areas are empty slots
GATE ONLINE CLASSES

Quadratic Probing
GATE ONLINE CLASSES

QUADRATIC PROBING
 Quadratic probing eliminates the primary clustering problem
 Assume hVal is the value of the hash function
 Instead of linear probing which searches for an open slot in a linear fashion
like this
hVal + 1, hVal + 2, hVal + 3, hVal + 4, ...
 add index values in increments of powers of 2

hVal + 1*1, hVal + 2*2, hVal + 3*3, hVal + 4*4, ...
GATE ONLINE CLASSES

DOES IT WORK?
 Quadratic probing works well if

 Table size is prime
 studies show the prime numbered table size removes some of the non-
randomness of hash functions
GATE ONLINE CLASSES

Separate Chaining
GATE ONLINE CLASSES

SEPARATE CHAINING
 Separate Chaining is an alternative to probing
 How?
 Maintain an array of lists
 Hash to the same place always and insert at the beginning (or end) of the linked list.
GATE ONLINE CLASSES

ARRAY OF LINKEDLISTS DATA STRUCTURE
 Each array element is a List
0
1 “AB” 9 “BA” 9
GATE ONLINE CLASSES

SEPARATE CHAINING
GATE ONLINE CLASSES

HASHING SUMMARY
 Hashing involves transforming a key to produce an integer in a fixed range

(0..TABLE_SIZE-1)
 The function that transforms the key into an array index is known as the hash function
 When two data values produce the same hash value, you get a collision
 Collision resolution may be done by searching for the next open slot at or after the
position given by the hash function, wrapping around to the front of the table when
you run off the end (known as linear probing)
GATE ONLINE CLASSES

HASHING SUMMARY
 Another common collision resolution technique is to store the table as an array of

linked lists and to keep at each array index the list of values that yield that hash
value known as separate chaining
 Most often the data stored in a hash table includes both a key field and a data field
(e.g., social security number and student information).
 The key field determines where to store the value
 A lookup on that key will then return the value associated with that key (if it is
mapped in the table)
GATE ONLINE CLASSES

GATE ONLINE CLASSES
End of
Day – 10 Lecture Notes
on
DATA STRUCTURES
GATE ONLINE CLASSES

Online Classes: Data Structures

Uploaded by

Copyright:

Available Formats

Online Classes: Data Structures

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Online Classes: Data Structures

Uploaded by

Copyright:

Available Formats

GATE ONLINE CLASSES

GATE ONLINE CLASSES

GATE ONLINE CLASSES

GATE ONLINE CLASSES

GATE ONLINE CLASSES

GATE ONLINE CLASSES

 Let G=(V,E) be a graph with n vertices.

 The adjacency matrix of G is a two-dimensional n by n array, say adj_mat

 If the edge (vi, vj) is in E(G), adj_mat[i][j]=1

 If there is no such edge in E(G), adj_mat[i][j]=0

 The adjacency matrix for an undirected graph is symmetric

 The adjacency matrix for a digraph need not be symmetric

GATE ONLINE CLASSES

 From the adjacency matrix, to determine the connection of vertices is

 For undirected graph, the degree of a vertex is sum of columns

GATE ONLINE CLASSES

 Storage complexity: O(|V|2)

 Undirected graph: symmetric along main diagonal

 Very simple, good for small graphs

 Many graphs in practical problems are sparse

 Matrix representation demands too much memory

 We want to reduce memory footprint

GATE ONLINE CLASSES

GATE ONLINE CLASSES

degreeof a vertex in an undirected graph

out-degreeof a vertex in a directed graph

in-degreeof a vertex in a directed graph

GATE ONLINE CLASSES

GATE ONLINE CLASSES

GATE ONLINE CLASSES

 Two types of Graph Traversal Techniques

GATE ONLINE CLASSES

GATE ONLINE CLASSES

 One of the simplest algorithms

 Also one of the most important

GATE ONLINE CLASSES

 Given a starting vertex s

 Visit all vertices at increasing distance from s

GATE ONLINE CLASSES

GATE ONLINE CLASSES

1. unmark all vertices in G

Start with A. Put in the queue (marked red)

GATE ONLINE CLASSES

When we go to A, we put B and E in the queue

GATE ONLINE CLASSES

When we go to B, we put G and C in the queue

GATE ONLINE CLASSES

When we go to E, we put D and F in the queue

GATE ONLINE CLASSES

GATE ONLINE CLASSES

GATE ONLINE CLASSES

GATE ONLINE CLASSES

GATE ONLINE CLASSES

 Complexity: O(|V| + |E|)

 BFS finds shortest path from s to each vertex

GATE ONLINE CLASSES

GATE ONLINE CLASSES

 Again, a simple and powerful algorithm

GATE ONLINE CLASSES