Data Structures and Algorithms(12)
Instructor: Ming Zhang
Textbook Authors: Ming Zhang, Tengjiao Wang and Haiyan Zhao
Higher Education Press, 2008.6 (the "Eleventh Five-Year" national planning textbook)
https://courses.edx.org/courses/PekingX/04830050x/2T2014/
Chapter 12 Advanced data structure
• 12.1 Multidimensional Array
• 12.2 Generalized Lists
• 12.3 Storage management
• 12.4 Trie
• 12.5 Improved binary search tree
12.4 Trie
• Ideal situation (binary search tree): the average time of insertion, deletion, and search is O(log N)
• Input: 9, 4, 2, 6, 7, 15, 12, 21
• Output: 2, 4, 6, 7, 9, 12, 15, 21
[Figure: binary search tree built from the input sequence, rooted at 9]
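The input and output listed above can be reproduced with a minimal sketch, assuming a plain (unbalanced) binary search tree. The names `Node`, `insert` and `inorder` are illustrative and do not come from the textbook.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Insert key into the BST rooted at root and return the root."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root


def inorder(root):
    """Return the keys in ascending order (left subtree, node, right subtree)."""
    if root is None:
        return []
    return inorder(root.left) + [root.key] + inorder(root.right)


root = None
for k in [9, 4, 2, 6, 7, 15, 12, 21]:
    root = insert(root, k)

print(inorder(root))  # [2, 4, 6, 7, 9, 12, 15, 21]
```

The shape of the tree depends on the insertion order, so the O(log N) average only holds when the tree stays reasonably balanced. A trie avoids this dependence because its shape is determined by the keys themselves.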
Structure of Trie
• A trie divides the key space: it branches on parts of the key rather than comparing whole keys
• The name “trie” comes from “retrieval”
• Applications
• Information retrieval
• Large-scale English dictionaries
• 26-branch trie
• Binary trie
• Letters (or numbers) are represented in binary code
• The code consists only of 0 and 1
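Tree of English words: 26-branch Trie
Store the words ‘and’, ‘ant’, ‘bad’, ‘bee’.
[Figure: 26-branch trie. The root branches on ‘a’ and ‘b’. The path a → n branches on ‘d’ and ‘t’ to the leaves ‘and’ and ‘ant’. Under ‘b’, the path a → d leads to ‘bad’ and the path e → e leads to ‘bee’.]
The subtree ‘an’ contains the set {and, ant}; every word in the set has the prefix ‘an’.
A subtree contains the words with the same prefix.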
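The prefix property can be sketched as follows, assuming lowercase words over ‘a’..‘z’ and one child slot per letter. The names `TrieNode`, `insert` and `words_with_prefix` are illustrative, not taken from the textbook.

```python
class TrieNode:
    def __init__(self):
        self.children = [None] * 26   # one slot per letter 'a'..'z'
        self.word = None              # set at the node where a stored word ends


def insert(root, word):
    """Walk down one letter at a time, creating nodes as needed."""
    node = root
    for ch in word:
        i = ord(ch) - ord('a')
        if node.children[i] is None:
            node.children[i] = TrieNode()
        node = node.children[i]
    node.word = word


def words_with_prefix(root, prefix):
    """Return all stored words in the subtree reached by following prefix."""
    node = root
    for ch in prefix:
        node = node.children[ord(ch) - ord('a')]
        if node is None:
            return []
    found, stack = [], [node]
    while stack:
        cur = stack.pop()
        if cur.word is not None:
            found.append(cur.word)
        stack.extend(c for c in cur.children if c is not None)
    return sorted(found)


root = TrieNode()
for w in ['and', 'ant', 'bad', 'bee']:
    insert(root, w)

print(words_with_prefix(root, 'an'))  # ['and', 'ant']
```

The fixed array of 26 slots keeps each step a constant-time index operation, at the cost of many empty slots in sparse tries.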
Store the words ‘an’, ‘and’, ‘ant’, ‘bad’, ‘bee’
[Figure: trie for the five words; each stored word is marked with *. Note that ‘an’ is a prefix of ‘and’ and ‘ant’.]
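When one stored word is a prefix of another, the trie needs some way to record that a word ends at a given point. A minimal sketch, assuming nested dictionaries and a reserved `'*'` key as the end marker (this representation is an assumption for illustration; the slide only shows the * marks):

```python
END = '*'   # reserved key marking "a stored word ends here"; never a letter


def insert(trie, word):
    node = trie
    for ch in word:
        node = node.setdefault(ch, {})
    node[END] = True


def contains(trie, word):
    """True only if word was inserted, not merely a prefix of a stored word."""
    node = trie
    for ch in word:
        if ch not in node:
            return False
        node = node[ch]
    return END in node


trie = {}
for w in ['an', 'and', 'ant', 'bad', 'bee']:
    insert(trie, w)

print(contains(trie, 'an'))   # True: 'an' carries the end marker
print(contains(trie, 'be'))   # False: 'be' is only a prefix of 'bee'
```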
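Compact the single paths close to the leaves
Store the words ‘an’, ‘and’, ‘ant’, ‘bad’, ‘bee’
[Figure: compacted trie. Under ‘b’, the branches ‘a’ and ‘e’ lead directly to the leaves ‘bad’ and ‘bee’; under a → n, the branches lead to ‘and’ and ‘ant’, and ‘an’ is marked with *.]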
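One way to sketch this compaction is to stop branching as soon as a single word remains and store the whole word at the leaf. The code below assumes nested dictionaries, with `'*'` as the branch for a word that ends at the current depth; `compact_trie`, `build` and `contains` are hypothetical names.

```python
def build(words, depth):
    """Compacted trie for distinct words that share their first `depth` letters."""
    if len(words) == 1:
        return words[0]            # leaf holds the whole word: no single-child chain
    groups = {}
    for w in words:
        key = w[depth] if depth < len(w) else '*'   # '*': the word ends here
        groups.setdefault(key, []).append(w)
    return {key: build(group, depth + 1) for key, group in groups.items()}


def compact_trie(words):
    return build(sorted(set(words)), 0)   # duplicates removed first


def contains(trie, word):
    node, depth = trie, 0
    while isinstance(node, dict):
        key = word[depth] if depth < len(word) else '*'
        if key not in node:
            return False
        node = node[key]
        depth += 1
    return node == word            # the leaf must match the full word


trie = compact_trie(['an', 'and', 'ant', 'bad', 'bee'])
print(trie)
# {'a': {'n': {'*': 'an', 'd': 'and', 't': 'ant'}}, 'b': {'a': 'bad', 'e': 'bee'}}
```

Because the letters after the last branching point are never inspected on the way down, the final comparison against the word stored in the leaf is what rejects a query such as ‘bat’.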
Binary Trie
Elements are 2, 5, 9, 17, 41, 45, 63
[Figure: binary trie over the key range 0 to 63; each level splits the current range in half]
0 (<32)
  0 (<16)
    0 (<8)
      0 (<4): 2
      1 (≥4): 5
    1 (≥8): 9
  1 (≥16): 17
1 (≥32)
  0 (<48)
    1 (≥40)
      0 (<44): 41
      1 (≥44): 45
  1 (≥48): 63
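A minimal sketch of a binary trie for the same elements, assuming 6-bit keys (0 to 63) examined from the most significant bit. Unlike the figure, this version always descends through all six bits rather than stopping once a key is unique; `BITS`, `insert`, `search` and `in_order` are illustrative names.

```python
BITS = 6  # keys are 6-bit numbers, 0..63


def new_node():
    return [None, None]          # index 0: bit is 0, index 1: bit is 1


def insert(root, key):
    node = root
    for i in range(BITS - 1, 0, -1):      # bits 5..1 select internal nodes
        bit = (key >> i) & 1
        if node[bit] is None:
            node[bit] = new_node()
        node = node[bit]
    node[key & 1] = key                   # the last bit selects the leaf slot


def search(root, key):
    node = root
    for i in range(BITS - 1, 0, -1):
        node = node[(key >> i) & 1]
        if node is None:
            return False
    return node[key & 1] == key


def in_order(node):
    """Visiting the 0-branch before the 1-branch yields the keys in ascending order."""
    if node is None:
        return []
    if isinstance(node, int):
        return [node]
    return in_order(node[0]) + in_order(node[1])


root = new_node()
for k in [41, 2, 63, 17, 5, 45, 9]:
    insert(root, k)

print(in_order(root))  # [2, 5, 9, 17, 41, 45, 63]
```

The resulting structure does not depend on the insertion order, in contrast to a binary search tree.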
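PATRICIA Structure
Code: 2: 000010, 5: 000101, 9: 001001, 17: 010001, 41: 101001, 45: 101101, 63: 111111
[Figure: binary trie in which each node is labelled with the key prefix it covers; internal nodes carry the index (0 to 3) of the bit compared there. “Compression” removes the single-child step on the path 10xxxx → 101xxx.]
0xxxxx
  00xxxx
    000xxx
      0000xx: 2
      0001xx: 5
    001xxx: 9
  01xxxx: 17
1xxxxx
  10xxxx
    101xxx
      1010xx: 41
      1011xx: 45
  11xxxx: 63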
Characteristics of PATRICIA Tree
• The compressed PATRICIA tree is a full binary tree
• Every internal node represents a 1-bit comparison
• Every internal node always has two children
• The number of comparisons never exceeds the length of the key
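These properties can be checked on the seven example keys with a minimal sketch. It assumes distinct, equal-length bit strings and builds the compressed tree in one pass from the full key set (not by incremental insertion); an internal node is a tuple `(bit_index, zero_child, one_child)` and a leaf is the key string. All names are illustrative.

```python
def build(keys, bit=0):
    """Build a compressed binary trie from distinct, equal-length bit strings."""
    if len(keys) == 1:
        return keys[0]                       # leaf
    while True:
        zeros = [k for k in keys if k[bit] == '0']
        ones = [k for k in keys if k[bit] == '1']
        if zeros and ones:                   # the keys really differ at this bit
            return (bit, build(zeros, bit + 1), build(ones, bit + 1))
        bit += 1                             # all keys agree here: skip the bit


def search(tree, key):
    """Return (found, number of 1-bit comparisons made)."""
    node, comparisons = tree, 0
    while isinstance(node, tuple):
        bit, zero_child, one_child = node
        node = one_child if key[bit] == '1' else zero_child
        comparisons += 1
    return node == key, comparisons          # skipped bits are checked at the leaf


def count(node):
    """Return (internal nodes, leaves)."""
    if not isinstance(node, tuple):
        return 0, 1
    i0, l0 = count(node[1])
    i1, l1 = count(node[2])
    return 1 + i0 + i1, l0 + l1


keys = [format(n, '06b') for n in [2, 5, 9, 17, 41, 45, 63]]
tree = build(keys)

print(count(tree))             # (6, 7): a full binary tree with 7 leaves
print(search(tree, '101101'))  # (True, 3): 45 is found after testing bits 0, 1, 3
```

Because skipped bits are never inspected on the way down, the search ends with one full comparison against the key stored in the leaf; the number of 1-bit comparisons is bounded by the key length.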
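Suffix Trees
T = ababc
Suffixes in ascending order: ababc, abc, babc, bc, c
[Figure: the suffix trie of T, with one character per edge, next to the suffix tree of T, in which single-child paths are merged into edges labelled with substrings. Nodes kept in the suffix tree are explicit states; trie nodes that fall inside a merged edge are implicit states.]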
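The relation between the suffix trie and the suffix tree can be illustrated with a minimal sketch, assuming a naive quadratic-time construction with nested dictionaries. Here a state counts as explicit if it is the root, a leaf, or a branching node; `suffix_trie`, `is_substring` and `count_states` are hypothetical names.

```python
def suffix_trie(text):
    """Insert every suffix of text into a trie of nested dicts (naive O(n^2))."""
    root = {}
    for i in range(len(text)):
        node = root
        for ch in text[i:]:
            node = node.setdefault(ch, {})
    return root


def is_substring(trie, pattern):
    """Every substring of the text is a prefix of some suffix."""
    node = trie
    for ch in pattern:
        if ch not in node:
            return False
        node = node[ch]
    return True


def count_states(node, is_root=True):
    """Return (explicit, implicit): nodes with exactly one child are implicit."""
    explicit = 1 if (is_root or len(node) != 1) else 0
    implicit = 1 - explicit
    for child in node.values():
        e, i = count_states(child, False)
        explicit += e
        implicit += i
    return explicit, implicit


T = "ababc"
trie = suffix_trie(T)

print(sorted(T[i:] for i in range(len(T))))  # ['ababc', 'abc', 'babc', 'bc', 'c']
print(is_substring(trie, "bab"))             # True
print(count_states(trie))                    # (8, 5)
```

Of the 13 trie nodes, only the 8 explicit ones would remain as nodes of the suffix tree; the 5 implicit ones are absorbed into edge labels.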
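Suffix Array
Position: 0 1 2 3 4 5 6 7 8 9
Text:     M A L A Y A L A M $

SA   Suffix        LCP
5    ALAM$         3
1    ALAYALAM$     1
7    AM$           1
3    AYALAM$       0
6    LAM$          2
2    LAYALAM$      0
0    MALAYALAM$    1
8    M$            0
4    YALAM$        0
9    $             -

Suffix array: 5 1 7 3 6 2 0 8 4 9
Longest common prefix (LCP) array: 3 1 1 0 2 0 1 0 0 -
• Suffix 5 and suffix 1 share “ALA”
• Suffix 1 and suffix 7 share “A”
• LCP values are always taken between suffixes that are adjacent in the sorted order
• In this example ‘$’ sorts after every letter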
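The two arrays for MALAYALAM$ can be reproduced with a minimal sketch, assuming a naive construction that sorts the suffixes directly (real implementations use faster algorithms). To match the slide, `'$'` is made to sort after the letters by mapping it to `'~'` for comparison only; `suffix_array` and `lcp_array` are illustrative names.

```python
def suffix_array(text):
    """Start positions of the suffixes in ascending order (naive sort)."""
    # '$' is compared as '~' so that it sorts after the uppercase letters
    return sorted(range(len(text)), key=lambda i: text[i:].replace('$', '~'))


def common_prefix_length(a, b):
    n = 0
    while n < min(len(a), len(b)) and a[n] == b[n]:
        n += 1
    return n


def lcp_array(text, sa):
    """lcp[k] = length of the common prefix of the k-th and (k+1)-th sorted suffixes."""
    return [common_prefix_length(text[sa[k]:], text[sa[k + 1]:])
            for k in range(len(sa) - 1)]


text = "MALAYALAM$"
sa = suffix_array(text)
lcp = lcp_array(text, sa)

print(sa)   # [5, 1, 7, 3, 6, 2, 0, 8, 4, 9]
print(lcp)  # [3, 1, 1, 0, 2, 0, 1, 0, 0]
```

The last suffix has no successor, which is why the slide shows “-” in the final LCP position and this sketch returns one value fewer than the suffix array.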
[Figure: the same SA and lcp columns for MALAYALAM$ as on the previous slide, with a tree drawn over them. Internal nodes are labelled with depths D = 3, D = 1, D = 2 and D = 0; for example, suffixes 5 and 1 meet at D = 3, and the node labelled “LA” has D = 2.]
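The depths D correspond to lengths of common prefixes, and for suffixes that are not adjacent in the sorted order that length can be read off the lcp array as the minimum over the range between them. A minimal sketch, with the arrays copied from the slide and a brute-force check; `lcp_of` and `direct` are hypothetical helper names.

```python
text = "MALAYALAM$"
sa = [5, 1, 7, 3, 6, 2, 0, 8, 4, 9]    # suffix array from the slide
lcp = [3, 1, 1, 0, 2, 0, 1, 0, 0]      # adjacent LCP values from the slide

rank = {start: r for r, start in enumerate(sa)}   # position of each suffix in SA


def lcp_of(i, j):
    """Common prefix length of the suffixes starting at i and j (i != j)."""
    lo, hi = sorted((rank[i], rank[j]))
    return min(lcp[lo:hi])             # minimum of the adjacent values in between


def direct(i, j):
    """Brute-force comparison, used only to check lcp_of."""
    a, b = text[i:], text[j:]
    n = 0
    while n < min(len(a), len(b)) and a[n] == b[n]:
        n += 1
    return n


print(lcp_of(5, 1))  # 3: "ALA"
print(lcp_of(5, 7))  # 1: "A"
print(lcp_of(6, 2))  # 2: "LA"
```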
Discussions
• Can a trie handle Chinese characters? What about the PATRICIA trie structure?
• Read related literature on suffix arrays and suffix trees, and think about their applications.
Data Structures and Algorithms
Thanks
National Elaborate Course (only available for IPs in China): http://www.jpk.pku.edu.cn/pkujpk/course/sjjg/
Ming Zhang, Tengjiao Wang and Haiyan Zhao
Higher Education Press, 2008.6 (awarded as the "Eleventh Five-Year" national planning textbook)