Foundation of Computing Systems
28.08.09 IT 60101: Lecture #14 1
Foundation of Computing Systems
Lecture 14
B Trees
Indexing Mechanism
• m-way Search
• B-Tree Indexing
• Trie indexing
[Figure: example of a multi-level index tree, with the root node labelled "Root"]
m-way Search Tree
• Definition
1. An m-way search tree T is a tree in which all nodes are of degree ≤ m
2. Each node in the tree has the structure
$P_0, K_1, P_1, K_2, P_2, \ldots, K_n, P_n$
where $1 \le n < m$; $K_i$ ($1 \le i \le n$) are the key values in the node; $P_i$ ($0 \le i \le n$) are pointers to the sub-trees of T; and $K_i < K_{i+1}$ for $1 \le i < n$.
3. All the key values in the sub-tree pointed to by $P_i$ are less than the key value $K_{i+1}$, $0 \le i < n$.
4. All the key values in the sub-tree pointed to by $P_n$ are greater than $K_n$.
5. All the sub-trees pointed to by $P_i$ ($0 \le i \le n$) are also m-way search trees.
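The definition above can be sketched in Python as follows. The names `MWayNode` and `search` are illustrative and not from the lecture; `None` is assumed to stand for an empty sub-tree. The sample tree uses the keys of the 3-way example (A: 20, 40; B: 10, 15; C: 25, 30; D: 45, 50; E: 35), with E placed under the last pointer of C, which is where the ordering rules put it.

```python
from bisect import bisect_left


class MWayNode:
    """One node of an m-way search tree: keys K1..Kn and pointers P0..Pn."""

    def __init__(self, keys, children=None):
        self.keys = list(keys)  # K1 < K2 < ... < Kn
        # P0..Pn; None stands for an empty sub-tree (assumption of this sketch)
        self.children = list(children) if children else [None] * (len(self.keys) + 1)


def search(node, key):
    """Return True if key occurs in the tree rooted at node."""
    while node is not None:
        i = bisect_left(node.keys, key)  # number of keys in this node smaller than key
        if i < len(node.keys) and node.keys[i] == key:
            return True
        node = node.children[i]  # follow P_i, the sub-tree between K_i and K_{i+1}
    return False


# The 3-way example tree
E = MWayNode([35])
B = MWayNode([10, 15])
C = MWayNode([25, 30], [None, None, E])
D = MWayNode([45, 50])
A = MWayNode([20, 40], [B, C, D])

print(search(A, 35), search(A, 33))  # True False
```

Each step of the loop either finds the key in the current node or moves one level down, so the cost of a search is bounded by the height of the tree.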
m-way Search Tree: Example
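[Figure: a 3-way search tree. Root A holds keys 20 and 40 and has pointers P0, P1, P2 to nodes B (10, 15), C (25, 30) and D (45, 50); a further node E holds 35. Each node is laid out as [P0] [K1 P1] [K2 P2].]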
B-Tree Indexing
• A B-tree T of order m is an m-way search tree that is either empty or satisfies the following properties:
– The root node has at least 2 children
– All nodes other than the root node have at least $\lceil m/2 \rceil$ children
– All failure nodes are at the same level.
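As a rough illustration, the three properties can be checked mechanically. In this sketch a node is assumed to be a `(keys, children)` pair and a child of `None` is assumed to represent a failure node; `is_btree` and `leaf` are hypothetical helper names. The check covers the degree bounds, the failure-node level and the key order inside each node, not the full search-tree ordering between nodes.

```python
import math


def is_btree(root, m):
    """root is None (empty tree) or a (keys, children) pair; a None child is a failure node."""
    if root is None:
        return True
    failure_levels = set()

    def visit(node, level, is_root):
        keys, children = node
        # n keys go with n + 1 pointers, and keys inside a node are strictly ascending
        if len(children) != len(keys) + 1 or list(keys) != sorted(set(keys)):
            return False
        low = 2 if is_root else math.ceil(m / 2)
        if not low <= len(children) <= m:
            return False
        for child in children:
            if child is None:
                failure_levels.add(level + 1)
            elif not visit(child, level + 1, False):
                return False
        return True

    return visit(root, 1, True) and len(failure_levels) == 1


def leaf(keys):
    """A node whose pointers all lead to failure nodes."""
    return (keys, [None] * (len(keys) + 1))


# A tree of order 3: root 30, internal nodes 20 and 40, four leaves
tree = ([30], [([20], [leaf([10, 15]), leaf([25])]),
               ([40], [leaf([35]), leaf([45, 50])])])
print(is_btree(tree, 3))  # True
```

The same tree fails the check for order 5, because the nodes holding 20 and 40 have only 2 children while $\lceil 5/2 \rceil = 3$ are required.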
Example: B-Tree of Order 3
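[Figure: a B-tree of order 3. Root (30); internal nodes (20) and (40); leaves (10, 15), (25), (35) and (45, 50). Every pointer below the leaves leads to a failure node F, and all failure nodes are at the same level.]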
Operations on B-Trees
• Searching
• Insertion
• Deletion
Insertion: 10, 20, 30, 40, 50, 60, 70, 80, 90
• Insertion of 10
• Initially the B-tree is empty. Get a node (note that it is the root node) and insert the key 10 into it.
[Figure: root node N0 holding the single key 10]
Insertion: 10, 20, 30, 40, 50, 60, 70, 80, 90
• Insertion of 20
• A node in a B-tree of order m can have at most (m – 1) key values. So, in this case, the root node can hold the key value 20 after 10.
[Figure: node N0 before the insertion (10) and after it (10, 20)]
Insertion: 10, 20, 30, 40, 50, 60, 70, 80, 90
• Insertion of 30
• A key value is to be inserted into a node which already has the maximum number of key values (that is, m – 1 for a B-tree of order m).
• Insert the value, say X, into the list of values in the node in ascending order
• Split the list of values into three parts: P1, P2 and P3
• P1 contains the first $\lceil m/2 \rceil - 1$ key values
• P2 contains the $\lceil m/2 \rceil$-th key value
• P3 contains the $(\lceil m/2 \rceil + 1)$-th, ..., m-th key values
• With this splitting, the $\lceil m/2 \rceil$-th value (P2) is inserted into the parent node of the current node
• If the parent node is nil, then create a new node.
• Note: In place of the current node, two nodes are allotted, containing the key values in P1 and P3 respectively.
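The splitting rule can be written down directly. This is a small sketch on a plain sorted list; the function name `split_overflow` is illustrative, and the handling of child pointers is left out.

```python
import math


def split_overflow(keys, m):
    """keys: the m sorted key values of an over-full node. Return (P1, P2, P3)."""
    assert len(keys) == m and keys == sorted(keys)
    mid = math.ceil(m / 2)  # 1-based position of the key that moves up
    p1 = keys[:mid - 1]     # the first ceil(m/2) - 1 keys
    p2 = keys[mid - 1]      # the ceil(m/2)-th key, inserted into the parent
    p3 = keys[mid:]         # keys ceil(m/2) + 1 .. m
    return p1, p2, p3


print(split_overflow([10, 20, 30], 3))       # ([10], 20, [30])
print(split_overflow([1, 2, 3, 4, 5], 5))    # ([1, 2], 3, [4, 5])
```

For order 3 with keys 10, 20, 30 the middle key 20 moves up and the two new nodes hold 10 and 30.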
Insertion: 10, 20, 30, 40, 50, 60, 70, 80, 90
• Insertion of 30
• A key value is to be inserted into a node which already has the maximum number of key values (that is, m – 1 for a B-tree of order m).
[Figure: node N0 (10, 20) overflows when 30 arrives; it is split into N2 (10) and N3 (30), and 20 moves up into a new root N1]
Insertion: 10, 20, 30, 40, 50, 60, 70, 80, 90
• Insertion of 40
• Search for the node where 40 should be placed.
[Figure: before, root N1 (20) with children N2 (10) and N3 (30); after, 40 is placed in N3, giving N3 (30, 40)]
Insertion: 10, 20, 30, 40, 50, 60, 70, 80, 90
• Insertion of 50
• 50 should go to node N3, but it is already full. So, N3 is split, followed by rearrangement.
[Figure: N3 (30, 40) overflows with 50; it is split into N4 (30) and N5 (50), and 40 moves up into N1, giving N1 (20, 40) with children N2 (10), N4 (30) and N5 (50)]
Insertion: 10, 20, 30, 40, 50, 60, 70, 80, 90
• Insertion of 60
• Search for the node for 60 (it is N5) and insert it there.
[Figure: N5 becomes (50, 60); N1 (20, 40), N2 (10) and N4 (30) are unchanged]
Insertion: 10, 20, 30, 40, 50, 60, 70, 80, 90
• Insertion of 70
• The node for insertion of key value 70 is N5, which is already full
– So, N5 has to be split into N6 and N7
– This in turn requires inserting 60 into N1
» This requires another split of N1, giving N8, N9 and N10
[Figure: N5 (50, 60) overflows with 70 and is split into N6 (50) and N7 (70); 60 moves up into N1 (20, 40), which overflows and is split into N9 (20) and N10 (60), with 40 moving up into the new root N8. Resulting tree: N8 (40); N9 (20) and N10 (60); leaves N2 (10), N4 (30), N6 (50) and N7 (70)]
Insertion: 10, 20, 30, 40, 50, 60, 70, 80, 90
• Insertion of 80
– Search for the node for 80 (it is N7) and insert it there.
[Figure: N7 becomes (70, 80); the rest of the tree, N8 (40), N9 (20), N10 (60), N2 (10), N4 (30) and N6 (50), is unchanged]
Insertion: 10, 20, 30, 40, 50, 60, 70, 80, 90
• Insertion of 90
– The node N7 is the right place where 90 has to be accommodated, but it is full.
[Figure: 90 arrives at the full node N7 (70, 80); N8 (40), N9 (20), N10 (60), N2 (10), N4 (30) and N6 (50) are as before]
Insertion: 10, 20, 30, 40, 50, 60, 70, 80, 90
• Insertion of 90
– The node N7 is the right place where 90 has to be accommodated, but it is full.
– So, splitting of N7 (into N11 and N12) is necessary.
• This requires the insertion of 80 into N10, the parent of N7
[Figure: final tree. N8 (40); N9 (20) and N10 (60, 80); leaves N2 (10), N4 (30), N6 (50), N11 (70) and N12 (90)]
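The whole insertion walkthrough can be replayed with a short program. This is a minimal sketch, assuming order m = 3 (which is what the walkthrough implies, since a node is full with two keys); `Node`, `insert` and `levels` are illustrative names, leaves are represented by an empty child list, and duplicate keys are simply ignored.

```python
import math
from bisect import bisect_left


class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []  # empty list for a leaf (assumption of this sketch)


def _insert(node, key, m):
    """Insert key below node; return (promoted_key, new_right_node) on a split, else None."""
    i = bisect_left(node.keys, key)
    if i < len(node.keys) and node.keys[i] == key:
        return None  # duplicate keys are ignored (assumption)
    if node.children:  # internal node: insert into the proper sub-tree first
        split = _insert(node.children[i], key, m)
        if split is None:
            return None
        key, right = split  # a child was split: absorb its promoted key
        node.keys.insert(i, key)
        node.children.insert(i + 1, right)
    else:
        node.keys.insert(i, key)
    if len(node.keys) < m:  # at most m - 1 keys, so no overflow
        return None
    mid = math.ceil(m / 2)  # 1-based position of the key that moves up
    up = node.keys[mid - 1]
    right = Node(node.keys[mid:], node.children[mid:])
    node.keys, node.children = node.keys[:mid - 1], node.children[:mid]
    return up, right


def insert(root, key, m):
    """Insert key and return the (possibly new) root."""
    if root is None:
        return Node([key])
    split = _insert(root, key, m)
    if split is None:
        return root
    up, right = split
    return Node([up], [root, right])  # the old root was split: the tree grows by one level


def levels(root):
    """Keys of every node, level by level, for inspection."""
    out, level = [], [root]
    while level:
        out.append([n.keys for n in level])
        level = [c for n in level for c in n.children]
    return out


root = None
for k in [10, 20, 30, 40, 50, 60, 70, 80, 90]:
    root = insert(root, k, 3)
print(levels(root))
# [[[40]], [[20], [60, 80]], [[10], [30], [50], [70], [90]]]
```

The printed levels correspond to the final tree of the walkthrough: root 40, internal nodes (20) and (60, 80), and five single-key leaves.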
Deletion in B-Trees
• Case 1: Deletion of a key value from a leaf node
• Case 1.a: Removal of a key value leaves the number of keys $\ge \lceil m/2 \rceil - 1$.
• Case 1.b: Removal of a key value leaves the number of keys $< \lceil m/2 \rceil - 1$.
• Case 2: Deletion of a key value from a non-leaf node.
Deletion in B-Trees: Case 1.a
• Case 1: Deletion of a key value from a leaf node
• Case 1.a: Removal of a key value leaves the number of keys $\ge \lceil m/2 \rceil - 1$.
– Removal of a key value from the leaf node does not violate the requirement of the minimum number of key values in that node
[Figure (a): Deletion of l, t and y. Root (o); internal nodes (c g k) and (r v); leaves (a b), (d e f), (h i), (l m n), (p q), (s t u), (w x y z)]
[Figure (b): After deletion of l, t and y. Root and internal nodes unchanged; leaves (a b), (d e f), (h i), (m n), (p q), (s u), (w x z)]
Deletion in B-Trees: Case 1.b
• Case 1: Deletion of a key value from a leaf node
• Case 1.b: Removal of a key value leaves the number of keys $< \lceil m/2 \rceil - 1$.
• Three situations are possible in this case:
1. The nearest right sibling contains more than $\lceil m/2 \rceil - 1$ key values.
2. The nearest left sibling contains more than $\lceil m/2 \rceil - 1$ key values.
3. Neither the nearest left sibling nor the nearest right sibling contains more than $\lceil m/2 \rceil - 1$ key values.
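The choice between these cases depends only on key counts, so it can be sketched as a small decision function. The name `leaf_deletion_case` and the returned labels are illustrative; siblings are passed as key lists, or `None` when the node has no sibling on that side, and the right sibling is tried first simply because it is listed first above.

```python
import math


def leaf_deletion_case(keys_after, left_sibling, right_sibling, m):
    """Classify a leaf deletion, given the leaf's keys after the removal."""
    min_keys = math.ceil(m / 2) - 1
    if len(keys_after) >= min_keys:
        return "1.a"
    if right_sibling is not None and len(right_sibling) > min_keys:
        return "1.b: borrow from right sibling"
    if left_sibling is not None and len(left_sibling) > min_keys:
        return "1.b: borrow from left sibling"
    return "1.b: combine with a sibling"


# Order 5, so every non-root node needs at least 2 keys
print(leaf_deletion_case(["s", "u"], ["p", "q"], ["w", "x", "z"], 5))  # 1.a
print(leaf_deletion_case(["i"], ["d", "e", "f"], ["m", "n"], 5))       # 1.b: borrow from left sibling
print(leaf_deletion_case(["d"], ["a", "b"], ["g", "i"], 5))            # 1.b: combine with a sibling
```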
Deletion in B-Trees: Case 1.b
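• Case 1: Deletion of a key value from a leaf node
• Case 1.b: Removal of a key value leaves the number of keys $< \lceil m/2 \rceil - 1$.
– The nearest right (or left) sibling contains more than $\lceil m/2 \rceil - 1$ key values.
[Figure (a): Deletion of h and s. Removing h leaves its leaf short; the left sibling (d e f) gives up f, which moves up into the parent while g moves down ("move right"). Removing s leaves its leaf short; the right sibling (w x z) gives up w, which moves up into the parent while v moves down ("move left")]
[Figure (b): After deletion of h and s. Root (o); internal nodes (c f k) and (r w); leaves (a b), (d e), (g i), (m n), (p q), (u v), (x z)]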
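Borrowing from a sibling always goes through the parent: the separating key moves down and a key of the sibling moves up. The sketch below works on plain key lists for leaf nodes only; `borrow_from_left`, `borrow_from_right` and the `sep` index (position of the separating key in the parent) are names chosen for illustration.

```python
def borrow_from_left(parent_keys, sep, left, node):
    """Move the separator down into node and the largest key of left up into the parent."""
    node.insert(0, parent_keys[sep])
    parent_keys[sep] = left.pop()


def borrow_from_right(parent_keys, sep, node, right):
    """Move the separator down into node and the smallest key of right up into the parent."""
    node.append(parent_keys[sep])
    parent_keys[sep] = right.pop(0)


# Deletion of h: the leaf is left with (i); the left sibling (d e f) can spare a key
parent, left, node = ["c", "g", "k"], ["d", "e", "f"], ["i"]
borrow_from_left(parent, 1, left, node)
print(parent, left, node)  # ['c', 'f', 'k'] ['d', 'e'] ['g', 'i']

# Deletion of s: the leaf is left with (u); the right sibling (w x z) can spare a key
parent2, node2, right2 = ["r", "v"], ["u"], ["w", "x", "z"]
borrow_from_right(parent2, 1, node2, right2)
print(parent2, node2, right2)  # ['r', 'w'] ['u', 'v'] ['x', 'z']
```

For internal nodes the outermost child pointer of the sibling would have to move across as well, which this sketch does not cover.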
Deletion in B-Trees: Case 1.b
• Case 1: Deletion of a key value from a leaf node
• Case 1.b: Removal of a key value leaves the number of keys $< \lceil m/2 \rceil - 1$.
– Neither the nearest left sibling nor the nearest right sibling contains more than $\lceil m/2 \rceil - 1$ key values.
[Figure (a): Deletion of e: combine with right sibling. The leaf left holding d is combined with the separating key f from the parent and with the right sibling (g i)]
[Figure (b): After deletion of e. Root (o); internal nodes (c k) and (r w); leaves (a b), (d f g i), (m n), (p q), (u v), (x z)]
Deletion in B-Trees: Case 1.b
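• Case 1: Deletion of a key value from a leaf node
• Case 1.b: Removal of a key value leaves the number of keys $< \lceil m/2 \rceil - 1$.
– Neither the nearest left sibling nor the nearest right sibling contains more than $\lceil m/2 \rceil - 1$ key values.
[Figure: Deletion of z. The leaf left holding x is combined with the separating key w and the left sibling (u v), giving (u v w x) and leaving the internal node (r) short; (r) is then combined with the root key o and its sibling (c k)]
[Figure (c): After deletion of z. Root (c k o r); leaves (a b), (d f g i), (m n), (p q), (u v w x)]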
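Combining two siblings pulls the separating key down from the parent, so the parent loses one key and may itself become too small. A minimal sketch on key lists, with `combine` and `sep` as illustrative names and child pointers left out:

```python
def combine(parent_keys, sep, left, right):
    """Return left + separator + right as one node; the separator is removed from the parent."""
    return left + [parent_keys.pop(sep)] + right


# Deletion of e: (d) is combined with separator f and the right sibling (g i)
parent = ["c", "f", "k"]
print(combine(parent, 1, ["d"], ["g", "i"]), parent)  # ['d', 'f', 'g', 'i'] ['c', 'k']

# Deletion of z: (u v) + w + (x); the parent shrinks to (r), which is too small ...
parent = ["r", "w"]
leaf = combine(parent, 1, ["u", "v"], ["x"])
# ... so (c k) + o + (r) are combined one level up, and the old root becomes empty
root = ["o"]
new_root = combine(root, 0, ["c", "k"], parent)
print(leaf, new_root, root)  # ['u', 'v', 'w', 'x'] ['c', 'k', 'o', 'r'] []
```

When the old root ends up with no keys, the combined node takes its place and the height of the tree drops by one.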
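Deletion in B-Trees: Case 2
• Case 2: Deletion of a key value from a non-leaf node
[Figure (a): Deletion of r. Root (o); internal nodes (c g k) and (r v); leaves (a b), (d e f), (h i), (l m n), (p q), (s t u), (w x y z)]
[Figure (b): After deletion of r and then deletion of g. With r deleted, s has moved up from the leaf (s t u): the internal nodes are (c g k) and (s v) and that leaf is now (t u). The figure marks a "combine" step for the deletion of g]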
Deletion in B-Trees: Case 2
• Case 2: Deletion of a key value from a non-leaf node
[Figure (c): After deletion of g and then deletion of o. With g deleted: root (o); internal nodes (c h l) and (s v); leaves (a b), (d e f), (i k), (m n), (p q), (t u), (w x y z)]
[Figure (d): After deletion of o. Root (l); internal nodes (c h) and (p v); leaves (a b), (d e f), (i k), (m n), (q s t u), (w x y z)]
Some Properties of B-Trees
• A B-tree is always a height-balanced tree
• The degree of a B-tree of order m is m; that is, the maximum number of branches that can emanate from a node is m
• In a B-tree of order m and height h, the maximum number of nodes possible is
$\sum_{i=0}^{h-1} m^i = \frac{m^h - 1}{m - 1}$
• The maximum number of key values that a node in a B-tree of order m can have is m – 1.
• The maximum number of key values possible in a B-tree of order m and height h is
$m^h - 1$
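The two maxima can be checked numerically by counting level by level. A small sketch, with `max_nodes` and `max_keys` as illustrative names and the height h taken as the number of levels of nodes:

```python
def max_nodes(m, h):
    """Level i (counting from 0) holds at most m**i nodes."""
    return sum(m ** i for i in range(h))


def max_keys(m, h):
    """Every node holds at most m - 1 keys."""
    return (m - 1) * max_nodes(m, h)


print(max_nodes(3, 3), max_keys(3, 3))  # 13 26
```

For m = 3 and h = 3 this agrees with the closed forms: $(3^3 - 1)/(3 - 1) = 13$ nodes and $3^3 - 1 = 26$ key values.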
Some Properties of B-Trees
• The root node has at least 2 children
• All nodes other than the root node have at least $\lceil m/2 \rceil$ children.
• The minimum number of key values in the root node is 1 (if the B-tree is not empty).
• The minimum number of key values in any node other than the root node is $\lceil m/2 \rceil - 1$.
• The minimum number of key values in a B-tree of order m and height h is
$2\lceil m/2 \rceil^{h-1} - 1$
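The minimum follows from filling every level as sparsely as allowed: one key in the root, two nodes on the second level, and $\lceil m/2 \rceil$ children per node below that. A short sketch that counts this directly (`min_keys` is an illustrative name; h is the number of levels of nodes):

```python
import math


def min_keys(m, h):
    """Fewest key values in a non-empty B-tree of order m with h levels of nodes."""
    d = math.ceil(m / 2)   # minimum number of children of a non-root node
    total = 1              # the root holds at least one key
    nodes = 2              # and therefore has at least two children
    for _ in range(h - 1):
        total += nodes * (d - 1)  # each non-root node holds at least d - 1 keys
        nodes *= d
    return total


print(min_keys(3, 3), min_keys(5, 3))  # 7 17
```

Both values agree with the closed form: $2 \cdot 2^2 - 1 = 7$ for m = 3 and $2 \cdot 3^2 - 1 = 17$ for m = 5.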
TRIE Structure
• A trie is an m-way search tree
• Definition
• A "trie" is a tree of order m that is either empty or consists of an ordered sequence of exactly m tries, each of order m.
TRIE Structure: Example
[Figure: an example trie of order 3 over the symbols a, b, c, with nodes numbered 0 to 23. Each branch is labelled with the part of the key spelled out on the way down from the root (for example a, aa, ac, ca, cb, cc)]
Structure of a node in TRIE
• Physical representation
[Figure: a trie node drawn as a row of link fields, one per symbol: a, b, c]
• Trie indexing is suitable for maintaining variable-sized key values.
• The actual key value is never stored; key values are implied through links.
• If the letters of the English alphabet are used, a trie of order 26 can maintain the whole English dictionary. (This is specially termed a lexicographic trie.)
• It allows multi-way branching based on a part of the key value, not the entire key value. The branching on the i-th level is determined by the i-th component of the key value.
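A node of this kind can be sketched as an array of m links, one per symbol. The class names, the `is_key` end-of-key flag and the sample keys are assumptions made for this illustration; the lecture itself defers the operations to the textbook.

```python
class TrieNode:
    def __init__(self, order):
        self.links = [None] * order  # one link per symbol of the alphabet
        self.is_key = False          # marks that a stored key ends here (assumption)


class Trie:
    def __init__(self, alphabet="abc"):
        self.index = {ch: i for i, ch in enumerate(alphabet)}
        self.root = TrieNode(len(alphabet))

    def insert(self, key):
        node = self.root
        for ch in key:  # the i-th symbol selects the branch on the i-th level
            i = self.index[ch]
            if node.links[i] is None:
                node.links[i] = TrieNode(len(self.index))
            node = node.links[i]
        node.is_key = True

    def search(self, key):
        node = self.root
        for ch in key:
            i = self.index.get(ch)
            if i is None:  # symbol outside the alphabet
                return False
            node = node.links[i]
            if node is None:
                return False
        return node.is_key


t = Trie("abc")
for word in ["ab", "abc", "ca"]:
    t.insert(word)
print(t.search("ab"), t.search("a"), t.search("cb"))  # True False False
```

No key is stored in any node: "ab" is found only because the links for a and then b exist and the node reached is marked as the end of a key.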
TRIE Structure: Operations
• Operations
• Searching
• Insertion
• Deletion
For operations on the trie structure, see the book Classic Data Structures, Chapter 7 (PHI, 2nd Edn., 17th Reprint).