B+ Trees


8

B+ Trees
In the previous chapter, B-Tree was used to organize index sets. The B-Tree structure makes insertion and deletion as easy as searching, but B-Tree has certain shortcomings. In most cases there will be a need to access data both sequentially and randomly. The random access to data stored in files can be accomplished through B-Tree. But it performs poorly when data need to be accessed sequentially. B-Tree cannot access a group of records together.

8.1 Sequential Access to Data


One way to access all records in a file is to retrieve one record after the other. Since this is a time-consuming process, a block structure is designed for maintaining the index list in sequential order.

8.1.1 Block Structure of Index List


Consider accessing all records of the index list given in Figure 5.7. Since sequential access performs poorly, blocks of index can be maintained. The block set of index list appears as shown in Figure 8.1.

IMS04IS073 | IMS04IS088 | IMS04IS091 | IMS05IS047

IMS06IS001 | IMS06IS013 | IMS06IS038 | IMS07IS001

IMS07IS010 | IMS07IS045
Figure 8.1 Blocks for index list
The design in Figure 8.1 assumes that a block can contain four keys. In the blocks, the keys are arranged in sorted order and the blocks are chained. The chaining of blocks enables the sequential access of records. It is observed that while storing keys in blocks, the address is also stored along with the keys. But that is not shown in Figure 8.1. The procedure of maintaining the block sets (also called as sequence set) is similar to maintaining the nodes of B-Tree. These blocks also undergo split, merge and redistribution like the nodes of B-Tree.
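
The chained blocks described above can be sketched in C++ as follows. This is only an in-memory illustration: the names `Block` and `readSequentially` are hypothetical, the record addresses stored with each key are left out (as in Figure 8.1), and the capacity of four keys is the assumption made for that figure.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical in-memory stand-in for one block of the sequence set.
struct Block {
    std::vector<std::string> keys;  // kept in sorted order, at most four per block
    Block* next = nullptr;          // chain link to the block with the next larger keys
};

// Sequential access: follow the chain from the first block and collect every key.
std::vector<std::string> readSequentially(const Block* first) {
    std::vector<std::string> out;
    for (const Block* b = first; b != nullptr; b = b->next) {
        assert(b->keys.size() <= 4);  // block capacity assumed in Figure 8.1
        out.insert(out.end(), b->keys.begin(), b->keys.end());
    }
    return out;
}
```

Calling `readSequentially` on the first block of Figure 8.1 would return the ten keys in sorted order without touching any index.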
210 File Structures

Example:
Consider inserting key IMS04IS050 into the sequence set. The first block is the candidate for insertion. But it is already full. Therefore, it is split and the keys are redistributed between the new and old blocks. Now that there is space in the first block, the key is inserted.
IMS04IS050 | IMS04IS073 | IMS04IS088

IMS04IS091 | IMS05IS047

IMS06IS001 | IMS06IS013 | IMS06IS038 | IMS07IS001

IMS07IS010 | IMS07IS045
Figure 8.2 Inserting into sequence set
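
The insertion in this example can be sketched as below. It is a minimal illustration under stated assumptions, not the book's implementation: `Block`, `SequenceSet`, `kCapacity` and `insertKey` are hypothetical names, a `std::list` stands in for the chain of blocks, and a full block is split in the middle.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <string>
#include <vector>

using Block = std::vector<std::string>;  // sorted keys of one block
using SequenceSet = std::list<Block>;    // std::list stands in for the block chain

constexpr std::size_t kCapacity = 4;     // four keys per block, as in Figure 8.1

void insertKey(SequenceSet& set, const std::string& key) {
    assert(!set.empty());
    // Candidate block: the last block whose first key is <= key, else the first block.
    auto target = set.begin();
    for (auto it = set.begin(); it != set.end(); ++it) {
        if (!it->empty() && it->front() <= key) target = it;
    }
    if (target->size() == kCapacity) {
        // Split: move the upper half into a new block chained right after the old one.
        auto mid = target->begin() + static_cast<std::ptrdiff_t>(kCapacity / 2);
        Block upper(mid, target->end());
        target->erase(mid, target->end());
        auto newer = set.insert(std::next(target), upper);
        if (key >= newer->front()) target = newer;
    }
    // Keep the keys of the chosen block in sorted order.
    target->insert(std::upper_bound(target->begin(), target->end(), key), key);
}
```

Starting from the three blocks of Figure 8.1, `insertKey(set, "IMS04IS050")` splits the first block and leaves the four blocks shown in Figure 8.2.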
The procedure for deleting an element from the sequence set is the same as the deletion procedure followed in B-Tree, and after deleting a node, the merge and redistribution functions are also applied.

Adding index to sequence set


We have discussed the method to maintain the sequence set, and now the question is how to access records randomly. The sequence set performs very badly if the data is randomly accessed from the file. Maintaining an index is the best way to enable random access of data. So, building an index for the sequence set is proposed. Ideally, an index acts as a separator between two lists of elements. In a sense, an index is like a signboard and it directs the search query to its exact route.

If the index is understood as a separator, then many possibilities emerge. One interesting inference is that the index element need not be a candidate of either of the two lists over which it operates. Consider the following two blocks from Figure 8.2.

B1: IMS04IS050 | IMS04IS073 | IMS04IS088

Possible separators: IMS04IS091, IMS04IS100

B2: IMS04IS091 | IMS05IS047
Figure 8.3 Separators as index
Between the two blocks B1 and B2, IMS04IS091 can be the ideal index element, but there is a possibility of having a number like IMS04IS100 also be a separator. Consider the sequence sets given in Figure 8.2. Between the second and third blocks, even IMS06 can be a separator, in addition to IMS06IS001, which is normally considered as a separator. The separators can again be blocked into an index set.

8.2 Simple Prefix B+ Trees

The previous section explored the possibility of having a variable size separator between two consecutive sequence sets. If the index is chosen in such a way that the separators are the minimal possible prefixes, then the index set for a group of sequence sets, organized as a tree structure, is called a simple prefix B+ Tree. Figure 8.4 illustrates such a tree.
Index set: An | Ashw | Bi

Sequence set: Adarsh ... Amit | Ar... Asha | Ashwin ... Babu | Bindu ...

Figure 8.4 Simple Prefix B+ Trees
The tree in Figure 8.4 is interesting in several aspects. First, there are only three index elements for four sequence set blocks. Second, the sequence set blocks are linked to each other.
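
One possible way to compute a minimal separator between two neighbouring blocks is sketched below. The function name `shortestSeparator` is hypothetical, and the rule it uses (take the shortest prefix of the first key of the right block that is still greater than the last key of the left block) is an assumption about how a minimal prefix is chosen.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Shortest prefix of `right` that is strictly greater than `left`.
// `left` is the last key of one block, `right` the first key of the next block.
std::string shortestSeparator(const std::string& left, const std::string& right) {
    assert(left < right);  // the blocks must already be in sorted order
    std::size_t i = 0;
    // Skip the characters the two keys have in common.
    while (i < left.size() && left[i] == right[i]) ++i;
    // One more character of `right` is enough to separate the keys.
    return right.substr(0, i + 1);
}
```

For instance, `shortestSeparator("Asha", "Ashwin")` gives `Ashw`, `shortestSeparator("Babu", "Bindu")` gives `Bi`, and `shortestSeparator("IMS05IS047", "IMS06IS001")` gives `IMS06`, the separator mentioned in the previous section.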

8.3 B+ Trees
Trees where the minimal separator is not considered as the element for index are called B+ Trees. Normally a B+ Tree is also characterized by its order. But since a B+ Tree consists of a sequence set and an index set, it is identified by two orders. Let us assume that the order of the sequence set is n and the order of the index set is m. The properties of B+ Tree are as follows.

Properties of sequence set


(i) Each node in the sequence set has a minimum of n/2 elements and a maximum of n elements.

Properties of index set:

(i) Each node in an index set has a minimum of m/2 elements and a maximum of m elements.
(ii) If a node has i elements, then the node has (i+1) child nodes.
(iii) The root node shall always have a minimum of one element (2 child nodes).
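
The properties above can be written as two small checks. This is a sketch only: the function names are hypothetical, and treating n/2 and m/2 as integer division is an assumption, since the text does not say how odd orders are rounded.

```cpp
#include <cassert>
#include <cstddef>

// Sequence set node of order n: between n/2 and n elements.
bool validSequenceNode(std::size_t elements, std::size_t n) {
    return elements >= n / 2 && elements <= n;
}

// Index set node of order m: between m/2 and m elements, one more child than
// elements, and a root that may hold as little as one element (two children).
bool validIndexNode(std::size_t elements, std::size_t children,
                    std::size_t m, bool isRoot) {
    assert(m >= 2);
    std::size_t minimum = isRoot ? 1 : m / 2;
    return elements >= minimum && elements <= m && children == elements + 1;
}
```

With m = 4, a root holding one separator and two children passes the check, while a non-root node with the same contents does not.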

8.3.1 Creation of B+ Tree

If the index set is already available, a B+ Tree can be easily constructed by creating the sequence set. But in most cases, a B+ Tree has to be built by reading from the index list one item at a time and then developing the tree. The process involved to create a B+ Tree is explained below.

(i) Create a sequence set node. Push elements into it till it is full (Figure 8.5a).

Figure 8.5 (a) Creation of B+ Tree

(ii) Next, if e is inserted, the sequence set node has to be split. Once it splits, we have to create an index node. The index node is created as follows:
a) Promote the lowest element from each node of the sequence set to its parent node in the index set.
b) In the index set node, make the first element invisible.
c) This invisible element is considered for branching, split and merge. But it is not considered as part of that node (Figure 8.5b).

Figure 8.5 (b) Creation of B+ Tree


The invisible element in the index set node is identified by a dotted circle written around it. Now the property of B+ Tree is satisfied (one element in the index set node has two child nodes in the sequence set). The formal algorithm to insert a key into a B+ Tree is given below:
Let,
Ne - the number of elements in a node
Ns - the number of separators in an index node
m - the order of B+ Tree
Insert (key) - function that inserts the key in an appropriate position
Split (node) - function that splits a node into two nodes
Lowest (node) - function that returns the lowest element of a node
Promote (Lowest (node)) - function that promotes the lowest element of a node to its corresponding index node

Step 1: Start from Root
    If (key < separator)
        go left
    Else if (key == separator)
        go right
    Else if (key > separator)
        go right
Step 2: If found node
    If (Ne < m)
        Insert (key)
        Goto Step 4
    Else if (Ne == m)
        Split (Node)
        Insert (key)
Step 3: If (Ns < m)
        Promote (Lowest (Node))
    Else if (Ns == m)
        NewNode = CreateNode (Node)
        Promote (Lowest (NewNode))
    Repeat Step 3 for all higher level indexes till the Root is reached.
Step 4: Exit
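
The branching rule of the first step (go left when the key is smaller than the separator, go right when it is equal or larger) can be sketched for a node with several separators as follows. The name `childIndex` is hypothetical, and the use of `std::upper_bound` is one way to express the rule, not necessarily the book's.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns which child to follow: 0 is the leftmost child, separators.size()
// the rightmost. A key equal to a separator goes to the right of it.
std::size_t childIndex(const std::vector<std::string>& separators,
                       const std::string& key) {
    assert(std::is_sorted(separators.begin(), separators.end()));
    // First separator that is strictly greater than the key.
    auto it = std::upper_bound(separators.begin(), separators.end(), key);
    return static_cast<std::size_t>(it - separators.begin());
}
```

With separators E and R, the key A selects child 0, the keys E and M select child 1, and the keys R and Z select child 2.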

(i) Splitting
It is a process where a parent node is split into two child nodes (Figure 8.6a) by evenly distributing the elements of the parent among its (two) children (Figure 8.6b).

Figure 8.6 (a) Splitting a Node

Now, to insert IP; node (2) is split into two nodes.

Figure 8.6 (b) Splitting a Node
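
A split that distributes the elements evenly and reports the element to be promoted might look like the sketch below. `SplitResult` and `splitNode` are hypothetical names, and giving the lower half to the old node is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct SplitResult {
    std::vector<std::string> right;  // new node holding the upper half
    std::string promoted;            // lowest element of the new node
};

// Splits a sorted node in place: the lower half stays, the upper half moves out.
SplitResult splitNode(std::vector<std::string>& node) {
    assert(node.size() >= 2);
    auto mid = node.begin() + static_cast<std::ptrdiff_t>(node.size() / 2);
    SplitResult result;
    result.right.assign(mid, node.end());
    node.erase(mid, node.end());
    // The lowest element of the new node is the one promoted to the index node.
    result.promoted = result.right.front();
    return result;
}
```

Splitting a node holding A, B, C, D leaves A, B in the old node, puts C, D in the new node, and promotes C.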
8.3.2 Deletion from B+ Tree

Algorithm to delete a key from a B+ Tree is as follows:
Let,
N_i be the number of elements in a node
N_min be the minimum number of elements (to be present) in a node
N_i+1 indicate the number of elements in the sibling node
Node_i represent the elements of a node
Node_i+1 represent the elements of the sibling
Delete (key) - a function that deletes the key from the node
Lowest (Node) - a function that returns the lowest element of the node
Delete-small (key) - a function that deletes the lowest element of a node and makes the required changes in the index nodes
Merge (Node_i, Node_i+1) - a function that merges the elements of a node with the elements of the sibling, making the required changes in the index nodes
Redistribute (Node_i, Node_i+1) - a function that redistributes the elements of a node and its sibling, making the required changes in the index nodes

Step 1: Start from Root
    If (key < separator)
        go left
    Else if (key == separator)
        go right
    Else if (key > separator)
        go right
Step 2: If found Node_i
    If (N_i > N_min and key != Lowest (Node_i))
        Delete (key)
        Goto Step 3
    Else if (N_i > N_min and key == Lowest (Node_i))
        Delete-small (key)
        Goto Step 3
    Else if (N_i == N_min and N_i+1 == N_min)
        Delete (key)
        Merge (Node_i, Node_i+1)
        Goto Step 3
    Else if (N_i == N_min and N_i+1 > N_min)
        Delete (key)
        Redistribute (Node_i, Node_i+1)
        Goto Step 3
Step 3: EXIT
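
The four cases of the second step can be condensed into one decision function. This sketch covers only the choice of action, not the actual changes to nodes and index; `DeleteAction` and `chooseAction` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

enum class DeleteAction {
    Delete,                 // plain deletion, index untouched
    DeleteSmall,            // lowest key removed, index must be updated
    DeleteThenMerge,        // node and sibling are both at the minimum
    DeleteThenRedistribute  // sibling can spare elements
};

// n: elements in the node, nSibling: elements in its sibling,
// nMin: minimum occupancy, keyIsLowest: whether the key is Lowest(Node).
DeleteAction chooseAction(std::size_t n, std::size_t nSibling,
                          std::size_t nMin, bool keyIsLowest) {
    assert(n >= nMin && nSibling >= nMin);  // both nodes start out valid
    if (n > nMin) {
        return keyIsLowest ? DeleteAction::DeleteSmall : DeleteAction::Delete;
    }
    return nSibling == nMin ? DeleteAction::DeleteThenMerge
                            : DeleteAction::DeleteThenRedistribute;
}
```

For a node at the minimum whose sibling holds one element more than the minimum, the function selects redistribution rather than merging.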

(i) Merging
It is a process where the elements of a node are merged with its sibling, provided the sibling has at least one element less than the order of the sequence set.
For example:

Figure 8.7 (a) Merging of Nodes
In the B+ Tree shown in Figure 8.7a, if the element F is deleted, node (2) is merged with node (3), making the required changes in the root as shown in Figure 8.7b.

Figure 8.7 (b) Merging of Nodes

(ii) Redistribution

It is a process of rearranging the elements of a node with its sibling. For example, consider the B+ Tree as shown in Figure 8.8a. If the element F is deleted, the elements of nodes (2) and (3) are redistributed (Figure 8.8b).

Figure 8.8 (a) Redistribution of Keys

Figure 8.8 (b) Redistribution of Keys
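
Merging and redistribution on two sibling sequence set nodes can be sketched as below. The names `Node`, `mergeNodes` and `redistribute` are hypothetical, the keys used in the usage note are illustrative rather than taken from Figures 8.7 and 8.8, and the update of the index set is reduced to returning the new lowest key of the right sibling.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using Node = std::vector<std::string>;  // sorted keys of one sequence set node

// Merge: every element of the right sibling moves into the left node.
// Assumes all keys in `left` are smaller than all keys in `right`.
void mergeNodes(Node& left, Node& right) {
    left.insert(left.end(), right.begin(), right.end());
    right.clear();
}

// Redistribute: pool the keys of both nodes and share them evenly.
// Returns the new lowest key of `right`, which the index must now use.
std::string redistribute(Node& left, Node& right) {
    Node all(left);
    all.insert(all.end(), right.begin(), right.end());
    assert(all.size() >= 2);
    auto mid = all.begin() + static_cast<std::ptrdiff_t>(all.size() / 2);
    left.assign(all.begin(), mid);
    right.assign(mid, all.end());
    return right.front();
}
```

If a node is left with only E after a deletion and its sibling holds G, H, I, then `redistribute` leaves E, G in the first node and H, I in the second, and reports H as the key to place in the index.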

The complete example of creating a B+ Tree is illustrated in Figure 8.9.

Input: Q, W, E, R, T, Y, U, I, O, P, A, S, D, F, G, H, J, K, L, Z, X, C, V, B, N, M
[Figure 8.9 panels: (a) the first node at Level 1, followed by one panel for each of the insertions T, Y, U, I, O, P, A, S, D, F, G, K, L, Z, X, C, V, B, N and M, and finally the B+ Tree without invisible elements.]

Figure 8.9 Building B+ Tree

8.4 C++ Implementation of B+ Tree

8.4.1 Problem Statement

The Student records are stored in a data file. The fields in the record are of variable length. They are separated using delimiters. The records are packed and managed using the IO Buffer class hierarchy. The size of each record in the data file is fixed. The B+ Tree is constructed using USN as key. The Sequence Set and Index Set are maintained separately.
