FULL UNIT 5 CSE 205

Download as pdf or txt
Download as pdf or txt
You are on page 1of 71

➢Introduction

➢Insertion
➢Deletion
Balance Binary Search Tree
 Worst case height of binary search tree: N
 Insertion, deletion can be O(N) in the worst case
 We want a tree with small height
 Height of a binary tree with N node is at least (log N)
 Goal: keep the height of a binary search tree O(log N)
 Balanced binary search trees
 Examples: AVL tree, red-black tree
Balanced Tree?
 Suggestion 1: the left and right subtrees of root have the
same height
 Suggestion 2: every node must have left and right
subtrees of the same height
 Our choice: for each node, the height of the left and
right subtrees can differ at most 1,-1,0.
AVL Tree
 An AVL (Adelson-Velskii and Landis 1962) tree is a
binary search tree in which
 for every node in the tree, the height of the left and
right subtrees differ by at most 1.

AVL tree AVL property


violated here
AVL Tree with
Minimum Number of Nodes

N1 = 1 N2 = 2 N3 =4 N4 = N2+N3+1=7
Height of AVL Tree
 Denote Nh the minimum number of nodes in an AVL tree
of height h

 N1=1, N2 =2 (base)
Nh= Nh-1 + Nh-2 +1 (recursive relation)

 many operations (i.e. searching) on an AVL tree will take


O(log N) time
Insertion in AVL Tree
 Basically follows insertion strategy of binary search
tree
 But may cause violation of AVL tree property
 Restore the destroyed balance condition if needed

6 8

6
Insert 6
Original AVL tree Restore AVL property
Property violated
Some Observations
 After an insertion, only nodes that are on the path from
the insertion point to the root might have their balance
altered
 Because only those nodes have their subtrees altered
 Rebalance the tree at the deepest such node guarantees
that the entire tree satisfies the AVL property

6 8

6
Node 5,8,7 might Rebalance node 7
have balance altered guarantees the whole tree be AVL
Different Rebalancing Rotations
The rebalancing rotations are classified as LL, LR, RR
and RL as illustrated below, based on the position of the
inserted node with reference to α.
LL rotation: Inserted node is in the left subtree of the left
subtree of node α
RR rotation: Inserted node is in the right subtree of the
right subtree of node α
LR rotation: Inserted node is in the right subtree of the
left subtree of node α
RL rotation: Inserted node is in the left subtree of the
right subtree of node α
Insertion Analysis
 Insert the new key as a new leaf just as in ordinary
binary search tree: O(logN)
 Then trace the path from the new leaf towards the
root, for each node x encountered: O(logN)
 Check height difference: O(1)
 If satisfies AVL property, proceed to next node: O(1)
 If not, perform a rotation: O(1)
 The insertion stops when
 A single rotation is performed logN
 Or, we’ve checked all nodes in the path
 Time complexity for insertion O(logN)
Deletion from AVL Tree
 Basically follows deletion strategy of binary search tree
 But may cause violation of AVL tree property
 Restore the destroyed balance condition if needed

20 20

10 35 15 35

5 15 25 40 10 18 25 40

18 30 38 45 30 38 45

50 50

Delete 5, Node 10 is unbalanced Single Rotation


Cont’d
20 35

15 35 20 40

10 18 25 40 15 25 38 45

30 38 45 10 18 30 50

Continue to check parents 50


Single Rotation
Oops!! Node 20 is unbalanced!!

For deletion, after rotation, we need to continue tracing


upward to see if AVL-tree property is violated at other node.
Different rotations are classified in R0, R1, R-1, L0, L1 And L-1
Different rotations
• An element can be deleted from AVL tree which
may change the BF of a node such that it results in
unbalanced tree.
• Some rotations will be applied on AVL tree to
balance it.
• R rotation is applied if the deleted node is in the
right subtree of node A (A is the node with
balance factor other than 0, 1 and -1).
• L rotation is applied if the deleted node is in the
left subtree of node A.
Different rotations cont…
• Suppose we have deleted node X from the tree.

• A is the closest ancestor node on the path from X


to the root node, with a balance factor -2 or +2.

• B is the desendent of node A on the opposite


subtree of deleted node i.e. if the deleted node is
on left side the B is the desendent on the right
subtree of A or the root of right subtree of A.
R Rotation
• R Rotation is applied when the deleted node is in the right
subtree of node A.
• There are three different types of rotations based on the
balanced factor of node B.
• R0 Rotation: When the balance Factor of node B is 0.
• Apply LL Rotation on node A.
• R1 Rotation: When the balance Factor of node B is +1.
• Apply LL Rotation on node A.
• R-1 Rotation: When the balance Factor of node B is -1.
• Apply LR Rotation(RR rotation on B and LL rotation on
node A).
L Rotation
• L Rotation is applied when the deleted node is in the
left subtree of node A.
• There are three different types of rotations based on the
balanced factor of node B.
• L0 Rotation: When the balance Factor of node B is 0.
• Apply RR Rotation on node A.
• L-1 Rotation: When the balance Factor of node B is -1.
• Apply RR Rotation on node A.
• L1 Rotation: When the balance Factor of node B is +1.
• Apply RL Rotation(LL rotation on B and RR
rotation on node A).
Encoding and Compression of Data
 Fax Machines
 ASCII
 Variations on ASCII
 min number of bits needed
 cost of savings
 patterns
 modifications
Purpose of Huffman Coding
 Proposed by Dr. David A. Huffman in 1952
 “A Method for the Construction of Minimum
Redundancy Codes”
 Applicable to many forms of data transmission
 Our example: text files
The Basic Algorithm
 Huffman coding is a form of statistical coding
 Not all characters occur with the same frequency!
 Yet all characters are allocated the same amount of space
 1 char = 1 byte, be it e or x
 Any savings in tailoring codes to frequency of character?
 Code word lengths are no longer fixed like ASCII.
 Code word lengths vary and will be shorter for the more
frequently used characters.
The (Real) Basic Algorithm
1. Scan text to be compressed and tally occurrence of
all characters.
2. Sort or prioritize characters based on number of
occurrences in text.
3. Build Huffman code tree based on prioritized list.
4. Perform a traversal of tree to determine all code
words.
5. Scan text again and create new file using the
Huffman codes.
Building a Tree
Scan the original text
 Consider the following short text:

Eerie eyes seen near lake.

 Count up the occurrences of all characters in the text


Building a Tree
Scan the original text
Eerie eyes seen near lake.
 What characters are present?

E e r i space
y s n a r l k .
Building a Tree
Scan the original text
Eerie eyes seen near lake.
 What is the frequency of each character in the text?

Char Freq. Char Freq. Char Freq.


E 1 y 1 k 1
e 8 s 2 . 1
r 2 n 2
i 1 a 2
space 4 l 1
Building a Tree
Prioritize characters

 Create binary tree nodes with character and


frequency of each character
 Place nodes in a priority queue
 The lower the occurrence, the higher the priority in
the queue

•CS 102
Building a Tree
Prioritize characters
 Uses binary tree nodes
public class HuffNode
{
public char myChar;
public int myFrequency;
public HuffNode myLeft, myRight;
}
 priorityQueue myQueue;
Building a Tree
 The queue after inserting all nodes

E i y l k . r s n a sp e
1 1 1 1 1 1 2 2 2 2 4 8

 Null Pointers are not shown

•CS 102
Building a Tree
 While priority queue contains two or more nodes
 Create new node
 Dequeue node and make it left subtree
 Dequeue next node and make it right subtree
 Frequency of new node equals sum of frequency of
left and right children
 Enqueue new node back into queue
Building a Tree
E i y l k . r s n a sp e
1 1 1 1 1 1 2 2 2 2 4 8
Building a Tree
y l k . r s n a sp e
1 1 1 1 2 2 2 2 4 8

E i
1 1
Building a Tree
y l k . r s n a sp e
2
1 1 1 1 2 2 2 2 4 8
E i
1 1
Building a Tree
k . r s n a sp e
2
1 1 2 2 2 2 4 8
E i
1 1

y l
1 1
Building a Tree
2
k . r s n a 2 sp e
1 1 2 2 2 2 4 8
y l
E i 1 1
1 1
Building a Tree
r s n a 2 2 sp e
2 2 2 2 4 8
y l
E i 1 1
1 1

k .
1 1
Building a Tree
r s n a 2 2 sp e
2
2 2 2 2 4 8
E i y l k .
1 1 1 1 1 1
Building a Tree
n a 2 sp e
2 2
2 2 4 8
E i y l k .
1 1 1 1 1 1

r s
2 2
Building a Tree
n a 2 sp e
2 2 4
2 2 4 8

E i y l k . r s
1 1 1 1 1 1 2 2
Building a Tree
2 4 e
2 2 sp
8
4
y l k . r s
E i 1 1 1 1 2 2
1 1

n a
2 2
Building a Tree
2 4 4 e
2 2 sp
8
4
y l k . r s n a
E i 1 1 1 1 2 2 2 2
1 1
Building a Tree
4 4 e
2 sp
8
4
k . r s n a
1 1 2 2 2 2

2 2

E i y l
1 1 1 1
Building a Tree
4 4 4
2 sp e
4 2 2 8
k . r s n a
1 1 2 2 2 2
E i y l
1 1 1 1
Building a Tree
4 4 4
e
2 2 8
r s n a
2 2 2 2
E i y l
1 1 1 1

2 sp
4
k .
1 1
Building a Tree
4 4 4 6 e
2 sp 8
r s n a 2 2
4
2 2 2 2 k .
E i y l 1 1
1 1 1 1

What is happening to the characters


with a low number of occurrences?
Building a Tree
4 6 e
2 2 2 8
sp
4
E i y l k .
1 1 1 1 1 1
8

4 4

r s n a
2 2 2 2
Building a Tree
4 6 e 8
2 2 2 8
sp
4 4 4
E i y l k .
1 1 1 1 1 1
r s n a
2 2 2 2
Building a Tree
8
e
8
4 4
10
r s n a
2 2 2 2 4
6
2 2 2 sp
4
E i y l k .
1 1 1 1 1 1
Building a Tree
8 10
e
8 4
4 4
6
2 2
r s n a 2 sp
2 2 2 2 4
E i y l k .
1 1 1 1 1 1
Building a Tree
10
16
4
6
2 2 e 8
2 sp 8
4
E i y l k . 4 4
1 1 1 1 1 1

r s n a
2 2 2 2
Building a Tree
10 16

4
6
e 8
2 2 8
2 sp
4 4 4
E i y l k .
1 1 1 1 1 1
r s n a
2 2 2 2
Building a Tree
26

16
10

4 e 8
6 8
2 2 2 sp 4 4
4
E i y l k .
1 1 1 1 1 1 r s n a
2 2 2 2
Building a Tree •After
enqueueing
26 this node
there is only
16
10 one node left
4 e 8
in priority
6 8 queue.
2 2 2 sp 4 4
4
E i y l k .
1 1 1 1 1 1 r s n a
2 2 2 2
Building a Tree
Dequeue the single node
left in the queue.
26

16
This tree contains the 10
new code words for each
4 e 8
character. 6 8
2 2 2 sp 4 4
4
Frequency of root node E i y l k .
1 1 1 1 1 1 r s n a
should equal number of 2 2 2 2
characters in text.
Eerie eyes seen near lake. 26 characters
Encoding the File
Traverse Tree for Codes
 Perform a traversal of the
tree to obtain new code
words 26

 Going left is a 0 going right 16


10
is a 1
 code word is only 4 e 8
6 8
completed when a leaf 2 2 2 sp 4 4
node is reached 4
E i y l k .
1 1 1 1 1 1 r s n a
2 2 2 2
Encoding the File
Traverse Tree for Codes
Char Code
E 0000
i 0001 26
y 0010
l 0011 10
16
k 0100
. 0101 4 e 8
space 011 6 8
e 10 2 2 2 sp 4 4
r 1100 4
s 1101 E i y l k .
1 1 1 1 1 1 r s n a
n 1110 2 2 2 2
a 1111
Encoding the File
 Rescan text and encode file
using new code words Char Code
E 0000
Eerie eyes seen near lake.
i 0001
y 0010
0000101100000110011 l 0011
1000101011011010011 k 0100
1110101111110001100 . 0101
space 011
1111110100100101 e 10
r 1100
• Why is there no need s 1101
for a separator n 1110
character? a 1111
Encoding the File
Results
 Have we made things any 0000101100000110011
better?
1000101011011010011
 73 bits to encode the text 1110101111110001100
 ASCII would take 8 * 26 = 1111110100100101
208 bits

If modified code used 4 bits per


character are needed. Total bits
4 * 26 = 104. Savings not as great.
Decoding the File
 How does receiver know what the codes are?
 Tree constructed for each text file.
 Considers frequency for each file
 Big hit on compression, especially for smaller files
 Tree predetermined
 based on statistical analysis of text files or file types
 Data transmission is bit based versus byte based
Decoding the File
 Once receiver has tree it
scans incoming bit stream 26

 0  go left 16
10
 1  go right
4 e 8
6 8
0000101100000110011 2 2 2 sp 4 4
1000101011011010011 4
E i y l k .
1110101111110001100 1 1 1 1 1 1 r s n a
2 2 2 2
1111110100100101
Summary
 Huffman coding is a technique used to compress
files for transmission
 Uses statistical coding
 more frequently used symbols have shorter code words
 Works well for text and fax transmissions
 An application that uses several data structures
Thank You
➢insertion,
➢deletion,
➢sorting and
➢complexity analysis
Heap
❑ A max tree is a tree in which the key value in
each node is no smaller than the key values in
its children. A max heap is a complete binary
tree that is also a max tree.
❑ A min tree is a tree in which the key value in
each node is no larger than the key values in
its children. A min heap is a complete binary
tree that is also a min tree.
❑ Operations on heaps
❑ creation of an empty heap
❑ insertion of a new element into the heap;
❑ deletion of the largest element from the heap
2
Figure : Sample max heaps

[1] [1] [1]


14 9 30

[2] [3] [2] [3] [2]


12 7 6 3 25
[4] [5] [6] [4]
10 8 6 5

Property:
The root of max heap (min heap) contains
the largest (smallest).
Figure : Sample min heaps

[1] [1] [1]


2 10 11

[2] [3] [2] [3] [2]


7 4 20 83 21
[4] [5] [6] [4]
10 8 6 50
Example of Insertion to Max Heap

20 20 21

15 2 15 5 15 20

14 10 14 10 2 14 10 2

initial location of new node insert 5 into heap insert 21 into heap

5
Insertion into a Max Heap
INSHEAP(TREE, N, ITEM)
1. Set N=N+1 and PTR=N.
2. Repeat Steps 3 to 6 while PTR>1.
3. Set PAR= └PTR/2┘
4. If ITEM<= TREE[PAR], then:
Set TREE[PTR]=ITEM, and Return.
[End of If structure]
5. Set TREE[PTR]=TREE[PAR]
6. Set PTR=PAR.
[End of step 2]
7. Set TREE[1]=ITEM
8. Return.
Example of Deletion from Max Heap

remove
20 10 15

15 2 15 2 14 2

14 10 14 10

7
Deletion from a Max Heap
DELHEAP(TREE, N, ITEM)
1. Set ITEM= TREE[1].
2. Set LAST= TREE[N] and N= N-1.
3. Set PTR=1, LEFT=2 and RIGHT=3.
4. Repeat Steps 5 to 7 while RIGHT<=N:
5. If LAST>=TREE[LEFT] and LAST>= TREE[RIGHT], then:
Set TREE[PTR]=LAST and Return.
[End of If structure.]
6. If TREE[RIGHT]<= TREE[LEFT], then:
Set TREE[PTR]= TREE[LEFT] and PTR=LEFT.
Else:
Set TREE[PTR]= TREE[RIGHT] and PTR=RIGHT.
[End of If structure.]
7. Set LEFT=2*PTR and RIGHT=LEFT+1.
[End of Step 4 loop.]
8. If LEFT=N and if LAST < TREE[LEFT] set TREE[PTR]= TREE[LEFT]
Set PTR=LEFT.
9. Set TREE[PTR]= LAST.
10. Return.
HeapSort
HEAPSORT(A, N)
1. Repeat for J=1 to N-1:
Call INSHEAP(A, J, A[J+1]).
[End of loop.]
2. Repeat while N>1:
a) Call DELHEAP(A, N, ITEM).
b) A[N+1]=ITEM.
[End of loop.]
3. Exit.
Thank You

You might also like