Learn Question Focus and Dependency Relations from Web Search Results for Question Classification

Wen-Hsiang Lu (盧文祥) [email protected]
Web Mining and Multilingual Knowledge System Laboratory, Department of Computer Science and Information Engineering, National Cheng Kung University
WMMKS Lab

Transcript of Learn

Page 1: Learn

112/04/19 1

Learn Question Focus and Dependency Relations from Web Search Results for Question Classification

Wen-Hsiang Lu (盧文祥) [email protected]

Web Mining and Multilingual Knowledge System Laboratory, Department of Computer Science and Information Engineering, National Cheng Kung University

WMMKS Lab

Page 2: Learn


Research Interest

Web Mining

Natural Language Processing

Information Retrieval

Page 3: Learn


Unknown Term Translation & Cross-Language Information Retrieval: A Multi-Stage Translation Extraction Method for Unknown Terms Using Web Search Results

Question Answering & Machine Translation: Using Web Search Results to Learn Question Focus and Dependency Relations for Question Classification; Using Phrase and Fluency to Improve Statistical Machine Translation

User Modeling & Web Search: Learning Question Structure based on Website Link Structure to Improve Natural Language Search; Improving Short-Query Web Search based on User Goal Identification

Cross-Language Medical Information Retrieval: MMODE (http://mmode.no-ip.org/)


Research Issues

Page 4: Learn


Jacob's syndrome (雅各氏症候群)

Page 5: Learn


Introduction, Related Work, Approach, Experiment, Conclusion, Future Work


Outline

Page 6: Learn


Introduction, Related Work, Approach, Experiment, Conclusion, Future Work


Outline

Page 7: Learn


Question Answering (QA) System

1. Question Analysis: question classification and keyword extraction.

2. Document Retrieval: retrieve related documents.

3. Answer Extraction: extract the exact answer.
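The three stages above can be sketched as a toy pipeline. Every function body here is an illustrative placeholder (simple keyword matching on a one-document corpus), not the system described in this talk:

```python
# Minimal sketch of a three-stage QA pipeline (illustrative only).

def analyze_question(question: str) -> dict:
    """Question Analysis: classify the question and extract keywords."""
    stop = {"who", "what", "when", "where", "why", "is", "the"}
    keywords = [w for w in question.rstrip("?").split() if w.lower() not in stop]
    qtype = "Person" if question.lower().startswith("who") else "Unknown"
    return {"type": qtype, "keywords": keywords}

def retrieve_documents(keywords, corpus):
    """Document Retrieval: return documents sharing at least one keyword."""
    return [d for d in corpus if any(k.lower() in d.lower() for k in keywords)]

def extract_answer(qtype, docs):
    """Answer Extraction: pick an exact answer span (placeholder: first word)."""
    return docs[0].split()[0] if docs else None

corpus = ["Armstrong was the first person to walk on the Moon."]
analysis = analyze_question("Who was the first person to walk on the Moon?")
docs = retrieve_documents(analysis["keywords"], corpus)
print(analysis["type"], extract_answer(analysis["type"], docs))  # Person Armstrong
```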

Page 8: Learn


Motivation (1/3)

Importance of Question Classification: reported by Dan Moldovan [Dan Moldovan 2000].

Page 9: Learn


Motivation (2/3)

Rule-based Question Classification: a manual and impractical method.

Machine Learning-based Question Classification, e.g., the Support Vector Machine (SVM):

Needs a large amount of training data. Too many features may introduce noise.

Page 10: Learn


Motivation (3/3)

Propose a new method for question classification.

Observe useful features of questions.

Solve the problem of insufficient training data.

Page 11: Learn


Idea of Approach (1/4)

Many questions have ambiguous question words (e.g., "what").

Importance of Question Focus (QF): use QF identification for question classification.

Page 12: Learn


If we do not have enough information to identify the question type from the QF alone, we also use dependency features.

[Figure: a question is decomposed into its QF and dependency features (dependency verb, dependency quantifier, dependency noun), which are linked to the question type through (unigram) and (bigram) semantic dependency relations.]

Idea of Approach (2/4)

Page 13: Learn


Example

Idea of Approach (3/4)

Page 14: Learn


Use QF and dependency features to classify questions. Learn QF and other dependency features from the Web. Propose a Semantic Dependency Relation Model (SDRM).

Idea of Approach (4/4)

Page 15: Learn


Introduction, Related Work, Approach, Experiment, Conclusion, Future Work


Outline

Page 16: Learn


[Richard F. E. Sutcliffe 2005] [Kui-Lam Kwok 2005] [Ellen Riloff 2000]

Rule-based Question Classification

5W (Who, When, Where, What, Why): Who → Person. When → Time. Where → Location. What → difficult type (ambiguous). Why → Reason.

Page 17: Learn


Several methods based on SVMs [Zhang, 2003; Suzuki, 2003; Day, 2005].

Machine Learning-based Question Classification

[Figure: question feature vector → SVM with KDAG kernel → question type]

Page 18: Learn


Use a Web search engine to identify the question type [Solorio, 2004].

“Who is the President of the French Republic?”

Web-based Question Classification

Page 19: Learn


Language Model for Question Classification [Li, 2002]

Too many features may introduce noise.

Statistics-based Question Classification

Page 20: Learn


Introduction, Related Work, Approach, Experiment, Conclusion, Future Work


Outline

Page 21: Learn


Architecture of Question Classification

Page 22: Learn


Six question types: Person, Location, Organization, Number, Date, Artifact.

Question Type

Page 23: Learn


We define 17 basic rules for simple questions.

Basic Classification Rules

Page 24: Learn


Architecture for Learning Dependency Features

Extracting Dependency Features Algorithm

Learning Semantic Dependency Features (1/3)

Page 25: Learn


Architecture for Learning Dependency Features

Learning Semantic Dependency Features (2/3)

Page 26: Learn


Extracting Dependency Features Algorithm

Learning Semantic Dependency Features (3/3)


Page 27: Learn


Question Focus Identification Algorithm (1/2)

Algorithm
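The QF identification algorithm itself appears on the slide only as a figure. A minimal heuristic sketch, assuming the QF is the first noun after the question word and using a tiny hypothetical noun lexicon (a real system would use a POS tagger):

```python
# Hypothetical QF identification: take the first noun-like token that
# follows the question word. QUESTION_WORDS and NOUNS are toy lexicons.
QUESTION_WORDS = {"which", "what", "who", "where", "when", "how"}
NOUNS = {"city", "country", "player", "company", "year", "president"}

def identify_qf(question: str):
    tokens = question.rstrip("?.").lower().split()
    for i, tok in enumerate(tokens):
        if tok in QUESTION_WORDS:
            # scan rightward for the first known noun: that is the QF
            for t in tokens[i + 1:]:
                if t in NOUNS:
                    return t
    return None  # no question word or no recognizable noun

print(identify_qf("Which city hosted the 2004 Olympic Games?"))  # city
```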

Page 28: Learn


Example

Question Focus Identification Algorithm (2/2)

Page 29: Learn


Unigram-SDRM

Bigram-SDRM

Semantic Dependency Relation Model (SDRM) (1/12)

Page 30: Learn


Unigram-SDRM

Estimating P(C|Q) directly would need many training questions.

Semantic Dependency Relation Model (SDRM) (2/12)

[Figure: Q = question, C = question type; the direct model P(C|Q).]

Page 31: Learn


P(DC|C): collect related Web search results for each question type.

P(Q|DC): use DC to determine the question type.

Unigram-SDRM

Semantic Dependency Relation Model (SDRM) (3/12)

[Figure: for each question type C, a collection DC of Web search results is built (P(DC|C)); the question Q is then scored against DC (P(Q|DC)).]

Page 32: Learn


Unigram-SDRM

Semantic Dependency Relation Model (SDRM) (4/12)

Page 33: Learn


Unigram-SDRM

Semantic Dependency Relation Model (SDRM) (5/12)

Q = {QF, QD}, QD = {DV, DQ, DN}.

DV: Dependency Verb; DQ: Dependency Quantifier; DN: Dependency Noun.

Page 34: Learn


DV = {dv1, dv2, ⋯, dvi}, DQ = {dq1, dq2, ⋯, dqj}, DN = {dn1, dn2, ⋯, dnk}.

Unigram-SDRM

Semantic Dependency Relation Model (SDRM) (6/12)
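The slide's equations are embedded in images. A reconstruction of the Unigram-SDRM consistent with the decomposition above (Bayes' rule, the search-result collection DC, and term-independence assumptions) might read:

```latex
P(C \mid Q) \;\propto\; P(C)\, P(D_C \mid C)\, P(Q \mid D_C),
\quad\text{with}\quad
P(Q \mid D_C) = P(QF \mid D_C)
  \prod_{i} P(dv_i \mid D_C)
  \prod_{j} P(dq_j \mid D_C)
  \prod_{k} P(dn_k \mid D_C).
```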

Page 35: Learn


Parameters: P(DC|C), P(QF|DC), P(dv|DC), P(dq|DC), P(dn|DC).

Parameter Estimation of Unigram-SDRM

Semantic Dependency Relation Model (SDRM) (7/12)

N(QF): the number of occurrences of the QF. NQF(DC): the total number of all QFs collected from the search results.
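The slide defines these counts, but the estimation formula itself is an image. A count-based sketch of P(QF|DC), with add-one smoothing as an assumed choice (the conclusion later notes the smoothing method needs improvement):

```python
from collections import Counter

# Sketch: estimate P(qf | D_C) from QF counts observed in the search
# results for one class C. Add-one smoothing is an assumption; the slide
# only defines the raw counts N(QF) and N_QF(D_C).
def estimate_p_qf(qf_counts: Counter, vocab_size: int):
    total = sum(qf_counts.values())            # N_QF(D_C)
    def p(qf: str) -> float:                   # smoothed P(qf | D_C)
        return (qf_counts[qf] + 1) / (total + vocab_size)
    return p

counts = Counter({"city": 3, "player": 1})     # toy counts for one class
p = estimate_p_qf(counts, vocab_size=5)
print(round(p("city"), 3), round(p("year"), 3))  # prints 0.444 0.111
```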

Page 36: Learn


Semantic Dependency Relation Model (SDRM) (8/12)

Parameter Estimation of Unigram-SDRM

Page 37: Learn


Bigram-SDRM

Semantic Dependency Relation Model (SDRM) (9/12)

Page 38: Learn


Bigram-SDRM

Semantic Dependency Relation Model (SDRM) (10/12)

Page 39: Learn


Parameter Estimation of Bigram-SDRM

P(DC|C): the same as in Unigram-SDRM. P(QF|DC): the same as in Unigram-SDRM. New parameters: P(dV|QF,DC), P(dQ|QF,DC), P(dN|QF,DC).

Nsentence(dv,QF): the number of sentences containing both dv and QF. Nsentence(QF): the total number of sentences containing QF.

Semantic Dependency Relation Model (SDRM) (11/12)
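From the sentence counts defined above, the bigram parameters would be estimated as (and analogously for dq and dn):

```latex
P(d_v \mid QF, D_C) = \frac{N_{\text{sentence}}(d_v, QF)}{N_{\text{sentence}}(QF)}
```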

Page 40: Learn


Parameter Estimation of Bigram-SDRM

Semantic Dependency Relation Model (SDRM) (12/12)

Page 41: Learn


Introduction, Related Work, Approach, Experiment, Conclusion, Future Work


Outline

Page 42: Learn


Experiment

SDRM Performance Evaluation:

Unigram-SDRM vs. Bigram-SDRM.

Combination with different weights.

SDRM vs. Language Model:

Use questions as training data.

Use the Web as training data.

Questions vs. Web.

Page 43: Learn


Collect questions from NTCIR-5 CLQA. 4-fold cross-validation.

Experimental Data

Page 44: Learn


Result

Unigram-SDRM vs. Bigram-SDRM

Page 45: Learn


Example

For the unigram model: "人" (person), "創下" (set [a record]), and "駕駛" (drive/pilot) are trained successfully.

For the bigram model: "人_創下" is not trained successfully.

Unigram-SDRM vs. Bigram-SDRM (2/2)

Page 46: Learn


Different weights for different features

α: The weight of QF, β: The weight of dV, γ: The weight of dQ, δ: The weight of dN.

Combination with different weights (1/3)
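The combined score itself is shown only as an image on the slide. Given the four weights defined above, one plausible weighted (log-linear) combination of the unigram features would be:

```latex
\text{score}(C \mid Q) = \alpha \log P(QF \mid D_C)
 + \beta \sum_{i} \log P(dv_i \mid D_C)
 + \gamma \sum_{j} \log P(dq_j \mid D_C)
 + \delta \sum_{k} \log P(dn_k \mid D_C)
```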

Page 47: Learn


Comparison of 4 dependency features

Combination with different weights (2/3)

Page 48: Learn


16 experiments. Best weighting: 0.23 (QF), 0.29 (DV), 0.48 (DQ). The weights are derived as follows, e.g., for QF and DV:

α (weight of QF) = (1-0.77) / [(1-0.77)+(1-0.71)]

β (weight of DV) = (1-0.71) / [(1-0.77)+(1-0.71)]

Combination with different weights (3/3)
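The slide's weight normalization for QF and DV can be reproduced numerically (0.77 and 0.71 are the per-feature figures shown on the slide):

```python
# Normalize two feature weights as on the slide:
# alpha = (1-0.77) / [(1-0.77) + (1-0.71)], beta analogously.
e_qf, e_dv = 1 - 0.77, 1 - 0.71       # 0.23 and 0.29
alpha = e_qf / (e_qf + e_dv)          # weight of QF
beta = e_dv / (e_qf + e_dv)           # weight of DV
print(round(alpha, 3), round(beta, 3))  # prints 0.442 0.558
```

The two weights sum to 1 by construction, so they can be used directly to interpolate the two feature scores.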

Page 49: Learn


Result

Use questions as training data (1/2)

Page 50: Learn


Example

Use questions as training data (2/2)

For the LM: "網球選手" (tennis player) and "選手為" are not trained successfully.

For the SDRM: "選手" (player) and "奪得" (win) are trained successfully.

Page 51: Learn


Result

Use Web search results as training data (1/2)

Page 52: Learn


Example

For the LM: "何國" (which country) is not trained successfully.

For the SDRM: "國" (country) and "設於" (located in) are trained successfully.

Use Web search results as training data (2/2)

Page 53: Learn


Result

Question vs. Web (1/3)

Trained question: the LM has training data covering the question's QF. Untrained question: it does not.

Page 54: Learn


Example of trained question

Question vs. Web (2/3)

For the LM: "何地" (where) is trained successfully.

For the SDRM: "地" (place) and "舉行" (hold) are trained successfully, but these terms are also trained on other types.

Page 55: Learn


Example of untrained question

Question vs. Web (3/3)

For the LM: "女星" (actress) and "獲得" (win) are not trained successfully.

For the SDRM: "女星" (actress) and "獲得" (win) are trained successfully.

Page 56: Learn


Conclusion

Conclusion: We propose a new model, SDRM, which uses the question focus and dependency features for question classification. Web search results are used as training data to solve the problem of insufficient training data.

Discussion: We need to enhance our learning method and its performance, and we need a better smoothing method.

Page 57: Learn


Future Work

Enhance the performance of the learning method.

Consider the importance of each feature in the question.

Question focus and dependency features may be used in other processing steps of question answering systems.

Page 58: Learn


Thank You
