Chapter 5. Probabilistic Models of Pronunciation and Spelling

May 4, 2007

Minho Kim, Artificial Intelligence Laboratory, Pusan National University

Text: Speech and Language Processing, pp. 141-189

Outline

Introduction
5.1 Dealing with Spelling Errors
5.2 Spelling Error Patterns
5.3 Detecting Non-Word Errors
5.4 Probabilistic Models
5.5 Applying the Bayesian Method to Spelling
5.6 Minimum Edit Distance
5.7 English Pronunciation Variation
5.8 The Bayesian Method for Pronunciation
5.9 Weighted Automata
5.10 Pronunciation in Humans

Introduction

This chapter introduces the problems of detecting and correcting spelling errors and summarizes typical human spelling error patterns.

The essential probabilistic architecture:
- Bayes' rule
- Noisy channel model

The essential algorithm: dynamic programming
- Viterbi algorithm
- Minimum edit distance algorithm
- Forward algorithm

Weighted automaton

5.1 Dealing with Spelling Errors (1/2)

The detection and correction of spelling errors is an integral part of modern word processors.

The same problem arises in applications in which even the individual letters aren't guaranteed to be accurately identified:
- Optical character recognition (OCR)
- On-line handwriting recognition

This chapter focuses on the detection and correction of spelling errors, mainly in typed text.

OCR systems often misread "D" as "O" or "ri" as "n", producing "misspelled" words like dension for derision.

5.1 Dealing with Spelling Errors (2/2)

Kukich (1992) breaks the field down into three increasingly broader problems:
- non-word error detection (graffe for giraffe)
- isolated-word error correction (correcting graffe to giraffe)
- context-dependent error detection and correction
  - there for three, dessert for desert, piece for peace

5.2 Spelling Error Patterns (1/2)

Single-error misspellings (Damerau, 1964):
- insertion: mistyping the as ther
- deletion: mistyping the as th
- substitution: mistyping the as thw
- transposition: mistyping the as hte

Kukich (1992) breaks down human typing errors into two classes:
- Typographic errors (spell as speel)
- Cognitive errors (separate as seperate)
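
The four single-error operations can be illustrated with a short Python sketch that enumerates every string one edit away from a word. The function name `single_edits` and the lowercase ASCII alphabet are assumptions made for this illustration; they are not taken from the slides.

```python
import string


def single_edits(word, alphabet=string.ascii_lowercase):
    """Return all strings one single-error edit away from `word`, grouped by operation.

    Hypothetical helper for illustration; assumes a lowercase ASCII alphabet.
    """
    # Every way of cutting the word into a left part and a right part.
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    return {
        # Insert one extra letter at any position.
        "insertion": {a + c + b for a, b in splits for c in alphabet},
        # Drop one letter.
        "deletion": {a + b[1:] for a, b in splits if b},
        # Replace one letter with a different letter.
        "substitution": {
            a + c + b[1:] for a, b in splits if b for c in alphabet if c != b[0]
        },
        # Swap two adjacent, distinct letters.
        "transposition": {
            a + b[1] + b[0] + b[2:]
            for a, b in splits
            if len(b) > 1 and b[0] != b[1]
        },
    }


edits = single_edits("the")
print("ther" in edits["insertion"])      # the -> ther
print("th" in edits["deletion"])         # the -> th
print("thw" in edits["substitution"])    # the -> thw
print("hte" in edits["transposition"])   # the -> hte
```

Each of the four example misspellings of the listed above falls into exactly the operation class named on the slide.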

5.2 Spelling Error Patterns (2/2)

OCR errors are usually grouped into five classes:
- substitutions (e → c)
- multi-substitutions (m → rn, he → b)
- space deletions or insertions
- failures (u → ~)
- framing errors

5.3 Detecting Non-word Errors

Non-word errors in text are detected by looking each word up in a dictionary.
- One argument is that such dictionaries would need to be kept small,
- because large dictionaries contain very rare words that resemble misspellings of other words.
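
A minimal sketch of dictionary-based non-word detection follows. The tokenizer, the function name `find_non_words`, and the toy dictionary are all assumptions for illustration; a real system would use a much larger word list and proper tokenization.

```python
import re


def find_non_words(text, dictionary):
    """Return the tokens of `text` that are not in `dictionary` (hypothetical helper)."""
    # Crude tokenizer: lowercase the text and keep runs of ASCII letters.
    tokens = re.findall(r"[a-z]+", text.lower())
    return [token for token in tokens if token not in dictionary]


# Toy dictionary, made up for this example.
toy_dictionary = {"the", "giraffe", "ate", "leaves"}

print(find_non_words("The graffe ate the leaves", toy_dictionary))  # ['graffe']
```

The sketch also shows the trade-off on dictionary size: any rare word added to `toy_dictionary` is never flagged, even when it was typed by mistake for a more common word.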

5.4 Probabilistic Models (1/3)

The intuition of the noisy channel model is to treat the surface form as an instance of the lexical form that has been passed through a noisy communication channel.

The goal is to build a model of the channel so that we can figure out how it modified this "true" word and recover it.

Sources of noise:
- variation in pronunciation
- variation in the realization of phones
- acoustic variation due to the channel

5.4 Probabilistic Models (2/3)

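Given a string of phones (say [ni]), which word corresponds to this string of phones?
- consider all possible words
- choose the word for which P(word | observation) is highest

$\hat{w} = \operatorname*{argmax}_{w \in V} P(w \mid O)$ (5.1)

- $\hat{w}$: our estimate of the correct w
- V: the set of all possible words
- O: the observation sequence [ni]
- $\operatorname*{argmax}_x f(x)$: the x such that f(x) is maximized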

5.4 Probabilistic Models (3/3)

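Bayes' rule:

$P(x \mid y) = \dfrac{P(y \mid x)\,P(x)}{P(y)}$ (5.2)

Substituting (5.2) into (5.1) gives (5.3):

$\hat{w} = \operatorname*{argmax}_{w \in V} \dfrac{P(O \mid w)\,P(w)}{P(O)}$ (5.3)

We can ignore P(O). Why? P(O) is the same for every candidate word w, so dropping it does not change which w maximizes the expression:

$\hat{w} = \operatorname*{argmax}_{w \in V} P(O \mid w)\,P(w)$ (5.4)

- P(w) is called the prior probability
- P(O|w) is called the likelihood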
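
The Bayesian choice in (5.4) can be sketched in a few lines of Python. The probability values below are made up purely for illustration, and the function names are hypothetical. The second function divides by an estimate of P(O), obtained here by summing over the candidate list (an assumption that the candidates are exhaustive), to show that normalizing does not change the winner.

```python
def score(observation, word, prior, likelihood):
    """Unnormalized posterior P(O|w) * P(w) for one candidate word."""
    return likelihood[(observation, word)] * prior[word]


def best_word(observation, candidates, prior, likelihood):
    """Return the candidate maximizing P(O|w) * P(w), as in equation (5.4)."""
    return max(candidates, key=lambda w: score(observation, w, prior, likelihood))


def best_word_normalized(observation, candidates, prior, likelihood):
    """Same choice, but dividing every score by P(O), as in equation (5.3)."""
    # P(O) is approximated by summing over the candidates (illustrative assumption).
    p_o = sum(score(observation, w, prior, likelihood) for w in candidates)
    return max(
        candidates,
        key=lambda w: score(observation, w, prior, likelihood) / p_o,
    )


# Made-up numbers, for illustration only.
prior = {"new": 0.001, "knee": 0.00002}
likelihood = {("ni", "new"): 0.4, ("ni", "knee"): 1.0}
candidates = ["new", "knee"]

print(best_word("ni", candidates, prior, likelihood))
print(best_word_normalized("ni", candidates, prior, likelihood))
```

Both calls pick the same word, because dividing every score by the same positive constant preserves their ordering.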

5.5 Applying the Bayesian Method to Spelling (1/5)

5.5 Applying the Bayesian Method to Spelling (2/5)

5.5 Applying the Bayesian Method to Spelling (3/5)

P(acress | across) is estimated from the number of times that e was substituted for o in some large corpus of errors.

Confusion matrix
- a square 26 × 26 table
- each cell holds the number of times one letter was incorrectly used instead of another
- the cell [o,e] in a substitution confusion matrix
  - the count of times e was substituted for o

5.5 Applying the Bayesian Method to Spelling (4/5)

del[x,y] contains the number of times in the training set that the characters xy in the correct word were typed as x

ins[x,y] contains the number of times in the training set that the character x in the correct word was typed as xy

sub[x,y] contains the number of times in the training set that x was typed as y

trans[x,y] contains the number of times in the training set that xy was typed as yx
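
As a rough sketch of how such matrices could be filled, the Python below classifies a (correct, typed) pair that differs by one single-error edit and tallies it under the matching matrix. The function names, the use of `#` for the start of a word, and the four training pairs are assumptions for illustration only.

```python
from collections import Counter


def classify_single_edit(correct, typed):
    """Return (matrix, x, y) for a single-error misspelling, or None otherwise.

    Hypothetical helper; '#' stands for the start of the word.
    """
    # Find the first position where the two strings differ.
    i = 0
    while i < min(len(correct), len(typed)) and correct[i] == typed[i]:
        i += 1
    prev = correct[i - 1] if i > 0 else "#"
    # del[x,y]: the characters xy in the correct word were typed as x.
    if len(typed) == len(correct) - 1 and correct[:i] + correct[i + 1:] == typed:
        return ("del", prev, correct[i])
    # ins[x,y]: the character x in the correct word was typed as xy.
    if len(typed) == len(correct) + 1 and typed[:i] + typed[i + 1:] == correct:
        return ("ins", prev, typed[i])
    if len(typed) == len(correct) and i < len(correct):
        # sub[x,y]: x was typed as y.
        if correct[i + 1:] == typed[i + 1:]:
            return ("sub", correct[i], typed[i])
        # trans[x,y]: xy was typed as yx.
        if (i + 1 < len(correct)
                and correct[i] == typed[i + 1]
                and correct[i + 1] == typed[i]
                and correct[i + 2:] == typed[i + 2:]):
            return ("trans", correct[i], correct[i + 1])
    return None


def tally(pairs):
    """Count single-error edits over (correct, typed) training pairs."""
    counts = Counter()
    for correct, typed in pairs:
        edit = classify_single_edit(correct, typed)
        if edit is not None:
            counts[edit] += 1
    return counts


# Tiny made-up training set.
pairs = [("the", "ther"), ("the", "th"), ("the", "thw"), ("the", "hte")]
for key, count in tally(pairs).items():
    print(key, count)
```

When several analyses are possible (for example, a doubled letter), this sketch simply takes the first position at which the strings differ; that tie-breaking choice is an assumption of the example.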

5.5 Applying the Bayesian Method to Spelling (5/5)

5.6 Minimum Edit Distance (1/6)

String distance: some metric of how alike two strings are to each other.

Minimum edit distance: the minimum number of editing operations needed to transform one string into another.
- operations: insertion, deletion, substitution

For example, the gap between intention and execution is five operations.
- it can be represented as a trace, an alignment, or an operation list (Figure 5.4)

5.6 Minimum Edit Distance (2/6)

5.6 Minimum Edit Distance (3/6)

Levenshtein distance
- assign a particular cost or weight to each of the operations
- simplest weighting: each of the three operations has a cost of 1
  - the Levenshtein distance between intention and execution is 5
- alternate version: a substitution has a cost of 2 (why? a substitution can be represented as one deletion plus one insertion)

The minimum edit distance is computed by dynamic programming.
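
A minimal dynamic-programming sketch of minimum edit distance in Python follows. The function name and the cost parameters are choices made for this example; setting `sub_cost=2` gives the alternate version in which a substitution costs as much as a deletion plus an insertion.

```python
def min_edit_distance(source, target, ins_cost=1, del_cost=1, sub_cost=1):
    """Minimum cost of turning `source` into `target` (illustrative sketch)."""
    n, m = len(source), len(target)
    # dist[i][j] = cost of transforming source[:i] into target[:j].
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = dist[i - 1][0] + del_cost      # delete everything
    for j in range(1, m + 1):
        dist[0][j] = dist[0][j - 1] + ins_cost      # insert everything
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            same = source[i - 1] == target[j - 1]
            dist[i][j] = min(
                dist[i - 1][j] + del_cost,                        # deletion
                dist[i][j - 1] + ins_cost,                        # insertion
                dist[i - 1][j - 1] + (0 if same else sub_cost),   # match or substitution
            )
    return dist[n][m]


print(min_edit_distance("intention", "execution"))              # 5
print(min_edit_distance("intention", "execution", sub_cost=2))  # 8
```

Each cell is filled from three already-computed neighbours, which is the sense in which the large problem is solved by combining solutions to subproblems.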

5.6 Minimum Edit Distance (4/6)

Dynamic programming
- a large problem can be solved by properly combining the solutions to various subproblems
- minimum edit distance for spelling error correction
- Viterbi and forward for speech recognition
- CYK and Earley for parsing

5.6 Minimum Edit Distance (5/6)

5.6 Minimum Edit Distance (6/6)

5.8 The Bayesian Method for Pronunciation (1/6)

The Bayesian algorithm can be used to solve what is often called the pronunciation subproblem in speech recognition.

Example: [ni] occurs after the word I at the beginning of a sentence.
- an investigation of the Switchboard corpus produces a total of 7 words that can be pronounced [ni]: the, neat, need, new, knee, to, you (see Chapter 4)

Two components:
- candidate generation
- candidate scoring

5.8 The Bayesian Method for Pronunciation (2/6)

Speech recognizers often use an alternative architecture, trading storage for speed.

Each pronunciation is expanded in advance with all possible variants, which are then pre-stored with their scores.

Thus there is no need for candidate generation: the string [ni] is simply stored with the list of words that can generate it.
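
The pre-stored architecture can be sketched as a reverse index built once, ahead of time, from each word's pronunciation variants. The function name `build_surface_index`, the phone spellings, and every score below are made up for illustration and do not come from the Switchboard corpus.

```python
from collections import defaultdict


def build_surface_index(variants):
    """Map each surface pronunciation to the (word, score) pairs that can generate it.

    `variants` maps word -> {surface pronunciation: P(surface | word)}.
    Hypothetical helper for illustration.
    """
    index = defaultdict(list)
    for word, forms in variants.items():
        for surface, prob in forms.items():
            index[surface].append((word, prob))
    return dict(index)


# Made-up variants and scores.
variants = {
    "knee": {"ni": 1.0},
    "new": {"nuw": 0.6, "ni": 0.4},
    "need": {"nid": 0.9, "ni": 0.1},
}

index = build_surface_index(variants)
print(index["ni"])  # candidates for [ni], found by a single lookup
```

At recognition time the candidate list for an observed string is a single dictionary lookup, at the cost of storing every expanded variant.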

5.8 The Bayesian Method for Pronunciation (3/6)

Candidates are scored with $\hat{w} = \operatorname*{argmax}_{w} P(y \mid w)\,P(w)$, where
- y represents the sequence of phones
- w represents the candidate word

It turns out that confusion matrices don't do as well for pronunciation:
- the changes in pronunciation between a lexical and a surface form are much greater
- probabilistic models of pronunciation variation include a lot more factors than a simple confusion matrix can include

One simple way to generate pronunciation likelihoods is via probabilistic rules.

5.8 The Bayesian Method for Pronunciation (4/6)

A word-initial [ð] becomes [n] if the preceding word ended in [n] or sometimes [m].

The probability of this rule is estimated as ncount / envcount, where
- ncount: number of times lexical [ð] is realized word-initially by surface [n] when the previous word ends in a nasal
- envcount: total number of times lexical [ð] occurs when the previous word ends in a nasal
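
The ratio of the two counts can be sketched in Python as follows. The token format, the ARPAbet-style label "dh" for the lexical dental fricative, and the four example tokens are all assumptions for illustration, not corpus data.

```python
def rule_probability(tokens):
    """Estimate the rule probability as ncount / envcount.

    Each token is (previous_word_ends_in_nasal, lexical_first_phone, surface_first_phone).
    Hypothetical format for illustration.
    """
    # envcount: word-initial lexical "dh" after a nasal, whatever its surface form.
    envcount = sum(1 for nasal, lexical, _ in tokens if nasal and lexical == "dh")
    # ncount: the subset of those tokens realized on the surface as "n".
    ncount = sum(
        1
        for nasal, lexical, surface in tokens
        if nasal and lexical == "dh" and surface == "n"
    )
    return ncount / envcount if envcount else 0.0


# Made-up tokens.
tokens = [
    (True, "dh", "n"),    # rule environment, rule applied
    (True, "dh", "dh"),   # rule environment, rule not applied
    (True, "dh", "dh"),
    (True, "dh", "dh"),
    (False, "dh", "dh"),  # previous word does not end in a nasal: ignored
    (True, "t", "t"),     # not a lexical "dh": ignored
]

print(rule_probability(tokens))  # 0.25
```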

5.8 The Bayesian Method for Pronunciation (5/6)

5.8 The Bayesian Method for Pronunciation (6/6)

Decision Tree Models of Pronunciation Variation

5.9 Weighted Automata (1/12)

Weighted automata
- a simple augmentation of the finite automaton
- each arc is associated with a probability
- the probabilities on all the arcs leaving a node must sum to 1
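
A weighted automaton can be sketched in Python as a dictionary from each state to its outgoing arcs. The state names, the arc probabilities, and both helper functions are hypothetical and chosen only to illustrate the sum-to-1 constraint and the probability of a path.

```python
import math


def arcs_sum_to_one(automaton, tol=1e-9):
    """Check that the arcs leaving every non-final state sum to 1."""
    return all(
        math.isclose(sum(arcs.values()), 1.0, abs_tol=tol)
        for arcs in automaton.values()
        if arcs  # final states have no outgoing arcs
    )


def path_probability(automaton, path):
    """Multiply the arc probabilities along a sequence of states."""
    prob = 1.0
    for src, dst in zip(path, path[1:]):
        prob *= automaton[src].get(dst, 0.0)  # a missing arc has probability 0
    return prob


# Made-up automaton: state -> {next state: arc probability}.
automaton = {
    "start": {"n": 1.0},
    "n": {"iy": 0.4, "uw": 0.6},
    "iy": {"end": 1.0},
    "uw": {"end": 1.0},
    "end": {},
}

print(arcs_sum_to_one(automaton))
print(path_probability(automaton, ["start", "n", "iy", "end"]))
```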

5.9 Weighted Automata (2/12)

5.9 Weighted Automata (3/12)

5.9 Weighted Automata (4/12)

5.9 Weighted Automata (5/12)

5.9 Weighted Automata (6/12)

5.9 Weighted Automata (7/12)

5.9 Weighted Automata (8/12)

5.9 Weighted Automata (9/12)

5.9 Weighted Automata (10/12)

5.9 Weighted Automata (11/12)

5.9 Weighted Automata (12/12)
