Online Classes: Data Structures
Online Classes: Data Structures
Online Classes: Data Structures
ON
DATA STRUCTURES
Presented by
Chandra Sekhar Sanaboina
Assistant Professor
Department of CSE
University College of Engineering Kakinada
Jawaharlal Nehru Technological University Kakinada
Adjacency Matrix
Adjacency Lists
0 4
0
0 2 1 5
1 2 1 3 6
3 7
G2
G 2 0 1 1 0 0 0 0 0
1 1 0
0 1 0 0 1 0 0 0
1 1 0 1 0 1 0 0 1 0 0 0 0
1 0 1 1
0 1 1 0 0 0 0
0
1 0 1 0 0 0 0 0 1 0 0
1 1 0 1 0 0 0 0 0 0 0 1 0 1 0
0
0 0 0 0 1 0 1
1 1 1 0 Symmetric
0 0 0 0 0 0 1 0
GATE ONLINE CLASSES
G4
Merits of Adjacency Matrix
For a digraph, the row sum is the out-degree, while the column sum
is the in-degree n 1
n 1 outd ( vi ) A[i , j ]
ind (vi ) A[ j , i ] j 0
j 0
Directed Graph:
In-degree of X: Sum along column X O(|V|)
Out-degree of X: Sum along row X O(|V|)
An undirected graph with n vertices and e edges ==> n head nodes and 2e list nodes
INTERESTING OPERATIONS
Storage Complexity:
O(|V| + |E|)
In undirected graph: O(|V|+2*|E|) = O(|V|+|E|)
Degree of node X:
Out degree: Length of Adj[X] O(|V|) calculation
In degree: Check all Adj[] lists O(|V|+|E|)
A traversal (search):
An algorithm for systematically exploring a graph
Visiting (all) vertices
Until finding a goal vertex or until no more vertices
2 8
1 3 6 10
5 2 8 1 3 6 10 7 9
7 9
GATE ONLINE CLASSES
ALGORITHM FOR BFS USING QUEUES
G C D
Queue: A F
G C D
Queue: A B E
F
G C D
Queue: A B E C G
F
G C D
Queue: A B E C G D F
F
G C D
Queue: A B E C G D F
F
G C D
Queue: A B E C G D F F
G C D
Queue: A B E C G D F F
G C D
Queue: A B E C G D F F
G C D
G C D
Stack: B E
Output: A Next Step: F
Pick E (Top)
Pop E and place it in output by pushing adjacent vertices of E (D, F) on
to the stack
G C D
Stack: B D F
Output: AE Next Step: F
Pick F (Top)
Pop F and place it in output by pushing the unvisited adjacent vertices
of F (NIL) on to the stack
G C D
Stack: B D
Output: AEF Next Step: F
Pick D (Top)
Pop D and place it in output by pushing the unvisited adjacent vertices
of D (NIL) on to the stack
G C D
Stack: B
Output: AEFD Next Step: F
Pick B (Top)
Pop B and place it in output by pushing the unvisited adjacent vertices
of B (C, G) on to the stack
G C D
Stack: CG
Output: AEFDB Next Step: F
Pick G (Top)
Pop G and place it in output by pushing the unvisited adjacent vertices
of G (NIL) on to the stack
G C D
Stack: C
Output: AEFDBG Next Step:
F
Pick C (Top)
Pop C and place it in output by pushing the unvisited adjacent vertices
of C (NIL) on to the stack
G C D
Stack: EMPTY
Output: AEFDBGC
F
Next Step:
Stack Empty -> No elements to process -> Final Outpu
Breadth First Search: v0, v1, v2, v3, v4, v5, v6, v7
Hash Tables
Collisions
Collision Resolution Techniques
Linear Probing
Quadratic Probing
Separate Chaining
Hashing
Hashing is a technique used for storing and retrieving information as quickly as
possible.
To store the key/value pair, you can use a simple array like a data structure where
keys (integers) can be used directly as an index to store values
Keys are large and cannot be used directly as an index you should use hashing
GATE ONLINE CLASSES
Steps Involved in Hasing
Step 1:
An element is converted into an integer by using a hash function. This integer value
can be used to calculate the index which is used to store the original element
hash = hashfunc(key)
Step 2:
The element is stored in the hash table where it can be quickly retrieved using
hashed key
index = hash % array_size
Hash Function
The hash function is used to transform the key into the index
A hash function should map each possible key to a unique slot index (Which
is impossible in real practice)
The values returned by a hash function are called hash values, hash codes,
hash sums, or simply hashes
Easy to compute:
It should be easy to compute and must not become an algorithm in itself
Uniform distribution:
It should provide a uniform distribution across the hash table and should not result in
clustering
Less collisions:
Collisions occur when pairs of elements are mapped to the same hash value. These
should be avoided
Division –
easiest method to create a hash function
the first order of business for a hash function is to compute an integer value
if we expect the hash function to produce a valid index for our chosen table
size, that integer will probably be out of range and that is easily remedied by
modding the integer by the table size
h(k) = k mod n
it is better if the table size is a prime, or at least has no small prime factors as
that makes sure the keys are distributed with more uniformity
k=1501 n=10 h(1501) = 1501 mod 10 = 1
Disadvantage:
Consecutive keys map to consecutive hash values – which leads to poor performance
Folding –
begins by dividing the item into equal-size pieces (the last piece
may not be of equal size). These pieces are then added together to
give the resulting hash value
Mid-Square Function –
square the key, then use the middle part as the result
e.g., 3121 -> 9740641 -> 406 (with a table size of 1000)
Extraction –
The ISBN starting digits are the same for a publisher, so they should be exclude
if the hash table is for only one publisher.
Radix Transformation –
Example:
Key = 345, change to base 9 = 423 % TSize
Assume that you have to store strings in the hash table by using the hashing
technique {“abcdef”, “bcdefa”, “cdefab” , “defabc” }
Hash Function1: h1 = The index for a specific string will be equal to the sum of
the ASCII values of the characters modulo 599
The hash function will compute the same index for all the strings
Hash Function2: h2 = The index for a specific string will be equal to sum of ASCII
values of characters multiplied by their respective order in the string after which it
is modulo with 2069 (prime number)
Hash table
key could be your SID, your telephone number, social security number,
account number, …
executes quickly
But you still have to handle collisions when two keys have the same hash value
the hash method is not guaranteed to return a unique integer for each key
example: simple hash method with "baab" and "abba“
Linear Probing:
0
insert objects with ... Keys
these three keys:
80
"abba" 81 "abba"
"abcd"
"abce" 82
83
84
85 "abcd"
86 "abce"
...
308 GATE ONLINE CLASSES
COLLISION OCCURS WHILE INSERTING “baab"
0
can't insert "baab" ...
where it hashes to "baab"
same slot as "abba" 80 Try [81]
81 "abba" Put in [82]
82 "baab"
Linear probe 83
forward by 1,
inserting it
84
at the next 85 "abcd"
available slot
86 "abce"
...
308 GATE ONLINE CLASSES
WRAP AROUND WHEN COLLISION OCCURS AT END
0 "IKLT"
Insert "KLMP" and
"IKLT" ...
both of which havee a 80
hash value of 308
81 "abba"
82 "baab"
83
84
85 "abcd"
86 "abce"
...
308 "KLMP"
GATE ONLINE CLASSES
FIND OBJECT WITH KEY “baab"
0 "IKLT"
...
"baab" still hashes to 80
81, but since [81] is
occupied, linear 81 "abba"
probe to [82] "baab"
82
At this point, you
could return a
83
reference or remove 84
baab
85 "abcd"
86 "abce"
...
308 "KLMP"
GATE ONLINE CLASSES
LINEAR PROBING HAS CLUSTERING PROBLEM
Black areas represent slots in use; white areas are empty slots
Instead of linear probing which searches for an open slot in a linear fashion
like this
hVal + 1, hVal + 2, hVal + 3, hVal + 4, ...
How?
Maintain an array of lists
Hash to the same place always and insert at the beginning (or end) of the linked list.
0
1 “AB” 9 “BA” 9
The function that transforms the key into an array index is known as the hash function
When two data values produce the same hash value, you get a collision
Collision resolution may be done by searching for the next open slot at or after the
position given by the hash function, wrapping around to the front of the table when
you run off the end (known as linear probing)
Most often the data stored in a hash table includes both a key field and a data field
(e.g., social security number and student information).
A lookup on that key will then return the value associated with that key (if it is
mapped in the table)