Data Structures and Algorithms 12...Ming Zhang “Data Structures and Algorithms" Chapter 12...

Data Structures and Algorithms (12)

Instructor: Ming Zhang

Textbook Authors: Ming Zhang, Tengjiao Wang and Haiyan Zhao

Higher Education Press, 2008.6 (the "Eleventh Five-Year" national planning textbook)

https://courses.edx.org/courses/PekingX/04830050x/2T2014/



Chapter 12: Advanced Data Structures

• 12.1 Multidimensional Array
• 12.2 Generalized Lists
• 12.3 Storage Management
• 12.4 Trie
• 12.5 Improved Binary Search Tree


12.4 Trie

• Ideal situation for a binary search tree: the average time of insertion, deletion, and search is O(log N)
• Input: 9, 4, 2, 6, 7, 15, 12, 21
• Output: 2, 4, 6, 7, 9, 12, 15, 21 (an in-order traversal visits the keys in ascending order)

[Figure: binary search tree with 9 at the root, holding the keys 9, 4, 15, 2, 6, 12, 7, 21.]
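The input and output above can be reproduced with a small sketch. This is not code from the course: the names `Node`, `insert`, `build`, and `in_order` are chosen here for illustration, and duplicate keys are assumed to be ignored.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root, key):
    """Insert key into the binary search tree rooted at root; return the root."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root  # equal keys are ignored (assumption)


def build(keys):
    """Insert the keys one by one, in the given order."""
    root = None
    for key in keys:
        root = insert(root, key)
    return root


def in_order(root):
    """Left subtree, then the node, then the right subtree."""
    if root is None:
        return []
    return in_order(root.left) + [root.key] + in_order(root.right)


tree = build([9, 4, 2, 6, 7, 15, 12, 21])
print(in_order(tree))  # [2, 4, 6, 7, 9, 12, 15, 21]
```

The O(log N) average holds only while the tree stays reasonably balanced; inserting already sorted input degenerates this simple tree into a chain.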


Structure of Trie

• A trie decomposes the key space: it branches on the parts of a key (characters or bits), not on comparisons between whole keys
• The name "trie" comes from "retrieval"
• Applications
  • Information retrieval
  • Large-scale English dictionaries
• 26-branch trie
• Binary trie
  • Letters (or numbers) are represented in binary code
  • The code consists only of 0 and 1


Tree of English words: 26-branch Trie

Store the words and, ant, bad, bee.

The subtree under 'an' contains the set {and, ant}: every word in the set has the prefix 'an'.

[Figure: 26-branch trie with the paths a-n-d (and), a-n-t (ant), b-a-d (bad), and b-e-e (bee); the words are stored at the leaves.]

A subtree contains the words with the same prefix
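A minimal sketch of a 26-branch trie follows. It assumes keys made only of the lowercase letters a to z; the class and method names (`TrieNode`, `Trie`, `words_with_prefix`) are illustrative and not taken from the textbook.

```python
class TrieNode:
    def __init__(self):
        self.children = [None] * 26  # one slot per letter a..z
        self.is_word = False         # True if a stored word ends here


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for ch in word:
            i = ord(ch) - ord("a")
            if node.children[i] is None:
                node.children[i] = TrieNode()
            node = node.children[i]
        node.is_word = True

    def _find(self, prefix):
        """Return the node reached by following prefix, or None."""
        node = self.root
        for ch in prefix:
            node = node.children[ord(ch) - ord("a")]
            if node is None:
                return None
        return node

    def contains(self, word):
        node = self._find(word)
        return node is not None and node.is_word

    def words_with_prefix(self, prefix):
        """All stored words in the subtree below prefix, in alphabetical order."""
        found = []

        def walk(node, path):
            if node.is_word:
                found.append(path)
            for i, child in enumerate(node.children):
                if child is not None:
                    walk(child, path + chr(ord("a") + i))

        start = self._find(prefix)
        if start is not None:
            walk(start, prefix)
        return found


trie = Trie()
for w in ["and", "ant", "bad", "bee"]:
    trie.insert(w)
print(trie.words_with_prefix("an"))  # ['and', 'ant']
```

Each lookup follows one child pointer per character, so its cost depends on the length of the key rather than on the number of stored words.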


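Store the words an, and, ant, bad, bee.

[Figure: trie for these five words. Because "an" is a prefix of "and" and "ant", an end-of-word branch marked * is added below the node for 'an', and the word "an" is stored under it.]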
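The end marker can be sketched with nested dictionaries. This is an illustrative sketch, not the textbook's code; it assumes that the character `*` never occurs inside a word, and the names `END`, `insert`, and `contains` are made up here.

```python
END = "*"  # assumed end-of-word marker; must not occur inside any word


def insert(trie, word):
    """Store word in a nested-dict trie, followed by the end marker."""
    node = trie
    for ch in word + END:
        node = node.setdefault(ch, {})


def contains(trie, word):
    """A word is present only if its path ends with the end marker."""
    node = trie
    for ch in word + END:
        if ch not in node:
            return False
        node = node[ch]
    return True


trie = {}
for w in ["an", "and", "ant", "bad", "bee"]:
    insert(trie, w)

print(contains(trie, "an"))  # True
print(contains(trie, "a"))   # False: 'a' is only a prefix
```

With the marker, every stored word ends at a leaf, so a word that is a prefix of another word needs no special case.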


Compact the Single Paths close to the leaf

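Store the words an, and, ant, bad, bee.

[Figure: the same trie with the single-child paths near the leaves collapsed: "bad" hangs directly below b-a and "bee" directly below b-e, while "an" is still stored under the * branch of the node for 'an'.]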
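One way to get this compaction is to stop branching as soon as a subtree holds a single word and store that word in a leaf. The sketch below does so with nested dictionaries; it is an assumption about how the idea could be coded, not the textbook's implementation, and it again assumes `*` does not occur in any word.

```python
END = "*"  # assumed end-of-word marker; must not occur inside any word


def build(words, depth=0):
    """Return a leaf (the word itself) once only one word remains, else a dict."""
    if len(words) == 1:
        return words[0]  # compacted leaf: the rest of the path is not expanded
    groups = {}
    for w in words:
        key = w[depth] if depth < len(w) else END
        groups.setdefault(key, []).append(w)
    return {key: build(group, depth + 1) for key, group in groups.items()}


def build_trie(words):
    return build(sorted(set(words)))  # duplicates would never separate


def contains(node, word):
    depth = 0
    while isinstance(node, dict):
        key = word[depth] if depth < len(word) else END
        if key not in node:
            return False
        node = node[key]
        depth += 1
    # Characters below the compacted leaf were never checked, so compare in full.
    return node == word


trie = build_trie(["an", "and", "ant", "bad", "bee"])
print(trie)
```

The final full comparison in `contains` is the price of compaction: the skipped characters are checked once at the leaf instead of one node at a time.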


Binary Trie

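Elements are 2, 5, 9, 17, 41, 45, 63.

Each level tests one bit of the 6-bit key, starting from the most significant bit; branch 0 is taken when the bit is 0 and branch 1 when it is 1:

root
  0 (<32)
    0 (<16)
      0 (<8)
        0 (<4): 2
        1 (>=4): 5
      1 (>=8): 9
    1 (>=16): 17
  1 (>=32)
    0 (<48)
      1 (>=40)
        0 (<44): 41
        1 (>=44): 45
    1 (>=48): 63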
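The binary trie for these elements can be sketched as nested tuples `(zero_branch, one_branch)`. The sketch assumes distinct keys that fit in 6 bits and stops branching once a subtree holds a single key; `BITS`, `bit`, `build`, and `contains` are names chosen here for illustration.

```python
BITS = 6  # assumed key width: all keys satisfy 0 <= key < 64


def bit(key, i):
    """Bit i of key, where bit 0 is the most significant of the BITS bits."""
    return (key >> (BITS - 1 - i)) & 1


def build(keys, depth=0):
    """Binary trie as nested tuples; a lone key becomes a leaf. Keys must be distinct."""
    if len(keys) == 1:
        return keys[0]
    zeros = [k for k in keys if bit(k, depth) == 0]
    ones = [k for k in keys if bit(k, depth) == 1]
    return (
        build(zeros, depth + 1) if zeros else None,
        build(ones, depth + 1) if ones else None,
    )


def contains(node, key):
    depth = 0
    while isinstance(node, tuple):
        node = node[bit(key, depth)]
        depth += 1
    return node == key  # an empty branch (None) never equals a key


trie = build([2, 5, 9, 17, 41, 45, 63])
print(trie)  # ((((2, 5), 9), 17), ((None, (41, 45)), 63))
```

The `None` entry is the empty 0 branch below the prefix 10: both 41 and 45 have a 1 in the third bit, so that node has only one child.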


PATRICIA Structure

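Code: 2:000010 5:000101 9:001001 17:010001 41:101001 45:101101 63:111111

[Figure: PATRICIA tree over these codes. The numbers 0 to 3 on the internal nodes give the bit position tested; the prefixes and leaves are:]

0xxxxx
  00xxxx
    000xxx
      0000xx: 2
      0001xx: 5
    001xxx: 9
  01xxxx: 17
1xxxxx
  10xxxx
    101xxx
      1010xx: 41
      1011xx: 45
  11xxxx: 63

Compression: the step from 10xxxx to 101xxx has only one branch, so it is compressed away.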
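A compressed binary trie in the PATRICIA style can be sketched as follows: each internal node records which bit it tests, and bit positions on which all remaining keys agree are skipped. This is a simplified sketch built from a fixed set of distinct 6-bit keys, not the textbook's implementation, and all names in it are illustrative.

```python
BITS = 6  # assumed key width: all keys satisfy 0 <= key < 64


def bit(key, i):
    """Bit i of key, where bit 0 is the most significant of the BITS bits."""
    return (key >> (BITS - 1 - i)) & 1


def build(keys, depth=0):
    """Internal node: (bit_index, zero_child, one_child); leaf: the key itself.

    keys must be non-empty and distinct.
    """
    if len(keys) == 1:
        return keys[0]
    # Compression: skip every bit on which all remaining keys agree.
    while len({bit(k, depth) for k in keys}) == 1:
        depth += 1
    zeros = [k for k in keys if bit(k, depth) == 0]
    ones = [k for k in keys if bit(k, depth) == 1]
    return (depth, build(zeros, depth + 1), build(ones, depth + 1))


def contains(node, key):
    while isinstance(node, tuple):
        index, zero, one = node
        node = one if bit(key, index) else zero
    return node == key  # skipped bits were not tested, so compare the whole key


def count_internal(node):
    if not isinstance(node, tuple):
        return 0
    return 1 + count_internal(node[1]) + count_internal(node[2])


tree = build([2, 5, 9, 17, 41, 45, 63])
print(tree)  # (0, (1, (2, (3, 2, 5), 9), 17), (1, (3, 41, 45), 63))
```

In the result, the node above 41 and 45 tests bit 3 directly, because bit 2 is skipped. Seven keys give six internal nodes, as expected for a full binary tree.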


Characteristics of PATRICIA Tree

• The compressed PATRICIA tree is a full binary tree
• Every internal node represents a 1-bit comparison
• Every internal node always has two children, because single-child nodes are compressed away
• The number of comparisons never exceeds the length of the key


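Suffix Trees

T = ababc

Suffixes in ascending order: ababc, abc, babc, bc, c

[Figure: Suffix Trie of T, with one character per edge and one leaf for each of the suffixes ababc, abc, babc, bc, c.]

[Figure: Suffix Tree of T, in which single-child paths are merged into one edge; the edges are labeled ab, b, c, and abc. The nodes that remain are explicit states; the positions merged into edges are implicit states.]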
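The relation between the suffix trie and the suffix tree can be checked with a short sketch. It builds the suffix trie as nested dictionaries and counts which nodes a suffix tree would keep (the root, the leaves, and the branching nodes). It assumes that no suffix is a prefix of another, which holds for T = ababc because the final character c occurs only once; the function names are illustrative.

```python
def build_suffix_trie(text):
    """Insert every suffix of text into a nested-dict trie, one character per edge."""
    root = {}
    for start in range(len(text)):
        node = root
        for ch in text[start:]:
            node = node.setdefault(ch, {})
    return root


def count_nodes(node):
    """All nodes of the suffix trie, including the root."""
    return 1 + sum(count_nodes(child) for child in node.values())


def count_explicit(node, is_root=True):
    """Nodes a suffix tree keeps: the root, leaves, and nodes with 2+ children."""
    keep = is_root or len(node) != 1
    return int(keep) + sum(count_explicit(c, False) for c in node.values())


def suffixes_in_order(node, prefix=""):
    """Read the leaves from left to right, visiting children in sorted order."""
    if not node:
        return [prefix]
    result = []
    for ch in sorted(node):
        result += suffixes_in_order(node[ch], prefix + ch)
    return result


trie = build_suffix_trie("ababc")
print(suffixes_in_order(trie))  # ['ababc', 'abc', 'babc', 'bc', 'c']
print(count_nodes(trie), count_explicit(trie))  # 13 8
```

For this text, 13 trie nodes shrink to 8 explicit states; the other 5 are the implicit states along merged edges. Building the trie this way takes quadratic time and space, so it is only a teaching aid.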


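Suffix Array

Text and positions:

  position:  0 1 2 3 4 5 6 7 8 9
  character: M A L A Y A L A M $

Sorted suffixes (the terminator $ is ordered after every letter here), with the length of the longest common prefix (lcp) shared with the next suffix:

  start  suffix       lcp
  5      ALAM$        3
  1      ALAYALAM$    1
  7      AM$          1
  3      AYALAM$      0
  6      LAM$         2
  2      LAYALAM$     0
  0      MALAYALAM$   1
  8      M$           0
  4      YALAM$       0
  9      $            -

Suffix array: 5 1 7 3 6 2 0 8 4 9

The longest common prefix array

• Suffix 5 and suffix 1 share "ALA"
• Suffix 1 and suffix 7 share "A"
• Each lcp value compares two adjacent suffixes in the sorted order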
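The two arrays can be reproduced with a naive sketch. It sorts the suffixes directly, which is simple but far slower than the specialized construction algorithms, and it orders `$` after every letter to match the slide. The function names are illustrative.

```python
def sort_key(suffix):
    # Assumption taken from the slide's ordering: '$' sorts after every letter.
    return [(1, 0) if ch == "$" else (0, ord(ch)) for ch in suffix]


def suffix_array(text):
    """Start positions of all suffixes, in sorted order of the suffixes."""
    return sorted(range(len(text)), key=lambda i: sort_key(text[i:]))


def lcp_array(text, sa):
    """lcp[i] = length of the common prefix of the suffixes at sa[i] and sa[i + 1]."""

    def common(a, b):
        n = 0
        while a + n < len(text) and b + n < len(text) and text[a + n] == text[b + n]:
            n += 1
        return n

    return [common(sa[i], sa[i + 1]) for i in range(len(sa) - 1)]


text = "MALAYALAM$"
sa = suffix_array(text)
lcp = lcp_array(text, sa)
print(sa)   # [5, 1, 7, 3, 6, 2, 0, 8, 4, 9]
print(lcp)  # [3, 1, 1, 0, 2, 0, 1, 0, 0]
```

The lcp list has one entry fewer than the suffix array, which corresponds to the "-" in the last position on the slide.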


[Figure: the suffix array (SA) 5 1 7 3 6 2 0 8 4 9 and lcp array 3 1 1 0 2 0 1 0 0 - for MALAYALAM$, annotated with groups of adjacent suffixes and their common prefix lengths D = 3, D = 1, D = 0, and D = 2; one group is labeled "LA".]
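One reading of the D values is that adjacent suffixes are grouped by how long a prefix they share. Under that assumption, the sketch below takes the SA and lcp values from the slide as literals and finds the maximal runs of adjacent suffixes sharing at least d characters, as well as the longest repeated substring. The function names are hypothetical.

```python
TEXT = "MALAYALAM$"
SA = [5, 1, 7, 3, 6, 2, 0, 8, 4, 9]   # suffix array from the slide
LCP = [3, 1, 1, 0, 2, 0, 1, 0, 0]     # lcp of each suffix with the next one


def groups(sa, lcp, d):
    """Maximal runs of adjacent suffixes sharing a prefix of length >= d (d >= 1)."""
    found = []
    run = [sa[0]]
    for i, value in enumerate(lcp):
        if value >= d:
            run.append(sa[i + 1])
        else:
            if len(run) > 1:
                found.append(run)
            run = [sa[i + 1]]
    if len(run) > 1:
        found.append(run)
    return found


def longest_repeated_substring(text, sa, lcp):
    """The largest lcp value marks a substring that occurs at least twice."""
    i = max(range(len(lcp)), key=lcp.__getitem__)
    return text[sa[i]:sa[i] + lcp[i]]


print(groups(SA, LCP, 3))  # [[5, 1]]
print(groups(SA, LCP, 2))  # [[5, 1], [6, 2]]
print(groups(SA, LCP, 1))  # [[5, 1, 7, 3], [6, 2], [0, 8]]
print(longest_repeated_substring(TEXT, SA, LCP))  # ALA
```

The group [6, 2] for d = 2 consists of the two suffixes that start with "LA", and the group [5, 1] for d = 3 of those that start with "ALA".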


Discussions

• Can a trie handle Chinese characters? What about the PATRICIA structure?
• Read related material on suffix arrays and suffix trees, and think about their applications.


Data Structures and Algorithms

Thanks

National Elaborate Course (only available for IPs in China): http://www.jpk.pku.edu.cn/pkujpk/course/sjjg/

Ming Zhang, Tengjiao Wang and Haiyan Zhao, Higher Education Press, 2008.6 (awarded as the "Eleventh Five-Year" national planning textbook)