A Structured Model for Joint Learning of Argument Roles and Predicate Senses

A Structured Model for Joint Learning of Argument Roles and Predicate Senses. Yotaro Watanabe, Masayuki Asahara, Yuji Matsumoto. ACL 2010, Uppsala, Sweden, July 12, 2010. Tohoku University, Nara Institute of Science and Technology.

Page 1

A Structured Model for Joint Learning of Argument Roles and Predicate Senses

Yotaro Watanabe, Masayuki Asahara, Yuji Matsumoto

ACL 2010, Uppsala, Sweden, July 12, 2010

Tohoku University, Nara Institute of Science and Technology

Page 2

Predicate-Argument Structure Analysis (Semantic Role Labeling)

Task of analyzing predicates and their arguments
– A predicate represents a state or an event, and its arguments have relations to the predicate
– Each argument has a particular semantic role (Agent, Theme, etc.)

In recent years, predicate sense disambiguation has been included in predicate-argument structure analysis [Surdeanu+ 08, Hajič+ 09]
– ‘sell.01’ means that ‘sold’ is an instance of the first sense of ‘sell’

Important for many NLP applications
– MT, QA, RTE, etc.

[Figure: the sentence "The luxury auto maker last year sold 1,214 cars in the U.S." annotated with the predicate senses maker.01 and sell.01 and the role labels Product, Agent, Agent, Temporal, Theme, and Location]

Page 3

Two Types of Dependencies of Elements in Predicate-Argument Structures

(1) Inter-dependencies between a predicate and its arguments
– A1: car => we can infer that the correct sense is drive.01

(2) Non-local dependencies among arguments
• Two or more arguments of a predicate do not share the same role
• Basically, the obligatory roles of the predicate should appear in the sentence

drive.01: drive a vehicle (A0: driver, A1: vehicle)

drive.02: cause to move (A0: driver, A1: things in motion)

[Figure: dependency tree of "Paul drove his car" (SBJ, NMOD, OBJ), with the labels drive.01, A0, and A1]

In order to realize robust predicate-argument structure analysis, it is necessary to deal with both types of dependencies

Page 4

Previous Work

(1) Non-local dependencies among arguments: Re-ranking [Johansson and Nugues 2008, etc.]
• Generate N-best assignments of argument roles, obtain global features for each assignment, and finally select the argmax using the re-ranker
• Cannot explicitly capture inter-dependencies between a predicate and its arguments

(2) Inter-dependencies between a predicate and its arguments: Markov Logic Networks [Meza-Ruiz and Riedel 2009, etc.]
• Learn and classify predicate senses and argument roles jointly
• MLNs cannot deal with particular types of global features

Currently, no existing (discriminative) approach sufficiently handles both types of dependencies

Page 5

Previous Work

We propose a structured model that can capture both types of dependencies simultaneously

Page 6

The proposed model

[Figure: dependency tree of "Paul drove his car" with the relations SBJ, NMOD, and OBJ]

drive.01: drive a vehicle (A0: driver, A1: vehicle)

drive.02: cause to move (A0: driver, A1: things in motion)

Page 7

The proposed model

Expand the possible labels of predicate senses and argument roles

[Figure: "Paul drove his car" (SBJ, NMOD, OBJ); candidate sense labels drive.01 and drive.02 for "drove"; candidate role labels A0, A1, …, NONE for "Paul" and for "car"]

Page 8

The proposed model

We use four types of factors, which score the labels of elements in predicate-argument structures

Page 9

The proposed model

These factors are defined by linear models
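The slide's formula is not reproduced here. As a hedged sketch only, a linear factor of this kind is usually written as an inner product of a weight vector and a feature vector. The symbols below (w, Φ, x for the sentence, p for a sense label, a for an argument, r for a role label, A for a full role assignment) are assumed notation and are not taken from the slide.

```latex
\begin{align*}
F_P(x, p)       &= \mathbf{w} \cdot \Phi_P(x, p) \\
F_A(x, a, r)    &= \mathbf{w} \cdot \Phi_A(x, a, r) \\
F_{PA}(x, p, a, r) &= \mathbf{w} \cdot \Phi_{PA}(x, p, a, r) \\
F_G(x, p, A)    &= \mathbf{w} \cdot \Phi_G(x, p, A)
\end{align*}
```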

Page 10

The proposed model

Use a factor (FP) which scores the sense labels of the predicate

[Figure: factor FP attached to "drove", scoring the sense labels drive.01 and drive.02 (scores shown: 1.4754, 0.7268)]

Page 11

The proposed model

Use a factor (FA) which scores the role labels of each argument

[Figure: factor FA attached to "Paul" and to "car", scoring the role labels A0, A1, …, NONE (scores shown: 1.784, 0.238, -1.665, -1.235, 0.876, -1.482)]

Page 12

The proposed model

Add a factor (FPA) which scores label pairs of a predicate sense and a semantic role of an argument

[Figure: factor FPA connecting the sense labels drive.01 and drive.02 with the role labels of an argument (scores shown: 0.764, 0.261)]

Page 13

The proposed model

Add a factor (FG) which captures the plausibility of the whole predicate-argument structure (uses global features)

[Figure: factor FG over the whole structure; the feature "A0,drive01,A1…" is shown with the score 1.865]

Page 14

The proposed model

The predicate ‘drive’ has both of its obligatory roles, A0 and A1 => the weight that corresponds to this feature is high, so FG assigns a higher score to this structure
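A minimal sketch of how such global features could be computed from a complete assignment, assuming that A0 and A1 are the obligatory roles of both senses of "drive" (as the slide's frames suggest). The feature names and the `FRAMES` table are hypothetical and only illustrate the two non-local properties mentioned earlier: obligatory roles should be present, and roles should not be duplicated.

```python
from collections import Counter

# Hypothetical frame table: obligatory roles per predicate sense (assumed).
FRAMES = {
    "drive.01": {"A0", "A1"},
    "drive.02": {"A0", "A1"},
}


def global_features(sense, roles):
    """Return illustrative global features for one complete assignment.

    sense: predicate sense label, e.g. "drive.01"
    roles: role label of each argument candidate, e.g. ["A0", "A1"]
    """
    assigned = [r for r in roles if r != "NONE"]
    counts = Counter(assigned)
    return {
        # True when every obligatory role of the sense has been assigned.
        "all_obligatory_roles_present": FRAMES[sense] <= set(assigned),
        # True when some role is assigned to more than one argument.
        "has_duplicate_role": any(c > 1 for c in counts.values()),
    }


good = global_features("drive.01", ["A0", "A1"])
bad = global_features("drive.01", ["A0", "A0"])
```

Features like these cannot be computed from one argument in isolation, which is why they are handled by a separate factor.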

Page 15

The proposed model

The proposed model combines these types of factors

[Figure: all four factors FP, FA, FPA, and FG on "Paul drove his car", with the scores 1.4754, 0.7268 (FP); 1.784, 0.238, -1.665, -1.235, 0.876, -1.482 (FA); 0.764, 0.425 (FPA); and "A0,drive01,A1…" 1.865 (FG)]

Page 16

The proposed model

The proposed model combines these types of factors

The highest-scoring assignment is returned by the proposed model

[Figure: the selected assignment drive.01 for "drove", A0 for "Paul", and A1 for "car"]
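A minimal sketch of combining the four factors and returning the highest-scoring assignment. It assumes that the factor scores are simply summed, and it uses exhaustive enumeration in place of the N-best inference used by the actual model. The numbers are borrowed from the slides, but which number belongs to which label is an assumption made for illustration, and the keys of `F_PA` are hypothetical.

```python
from itertools import product

SENSES = ["drive.01", "drive.02"]
ROLES = ["A0", "A1", "NONE"]
ARGS = ["Paul", "car"]

# Assumed mapping of the slide's scores to labels (illustration only).
F_P = {"drive.01": 1.4754, "drive.02": 0.7268}
F_A = {
    ("Paul", "A0"): 1.784, ("Paul", "A1"): 0.238, ("Paul", "NONE"): -1.665,
    ("car", "A0"): -1.235, ("car", "A1"): 0.876, ("car", "NONE"): -1.482,
}
# Hypothetical sense/role pairs; unlisted pairs score 0.
F_PA = {("drive.01", "car", "A1"): 0.764, ("drive.02", "car", "A1"): 0.425}
# Global factor: scores one whole structure; unlisted structures score 0.
F_G = {("drive.01", ("A0", "A1")): 1.865}


def total_score(sense, roles):
    """Sum all factor scores for one (sense, role assignment) candidate."""
    score = F_P[sense]
    for arg, role in zip(ARGS, roles):
        score += F_A[(arg, role)]
        score += F_PA.get((sense, arg, role), 0.0)
    score += F_G.get((sense, roles), 0.0)
    return score


def best_assignment():
    """Enumerate every candidate and return the argmax."""
    candidates = product(SENSES, product(ROLES, repeat=len(ARGS)))
    return max(candidates, key=lambda c: total_score(*c))


best_sense, best_roles = best_assignment()
print(best_sense, best_roles)
```

Exhaustive enumeration is only practical for a toy example like this one; with many arguments and roles the candidate space grows exponentially.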

Page 17

Dealing with global (non-local) features

We adopt the fundamental idea of [Kazama and Torisawa 2007]
– Features are divided into local features and global features
– Inference: N-best based approach
(1) Generate N-best assignments using only local features
(2) Obtain global features in the N-best assignments
(3) Select the argmax
– Learning: train parameters with two margin constraints
• All: train parameters so as to ensure a sufficient margin using all features (both local features and global features)
• Local only: when the constraint All is satisfied, train parameters so as to ensure a sufficient margin using only local features
• K&T proposed a Margin-Perceptron Learning Algorithm
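The three inference steps can be sketched as follows. This is a minimal illustration, not the authors' implementation: the N-best list is obtained by brute-force enumeration (a real system would use a proper N-best search), and the local scores and the `dup_penalty` global scorer are hypothetical.

```python
import heapq
from itertools import product


def infer(local_scores, global_score, n=3):
    """N-best inference: rank by local score, then rescore with global score.

    local_scores: one {label: score} dict per element to be labeled
    global_score: function mapping a full assignment (tuple) to a score
    """
    def local_total(assignment):
        return sum(s[label] for s, label in zip(local_scores, assignment))

    # (1) Generate the N-best assignments using only local scores.
    all_assignments = product(*(list(s) for s in local_scores))
    n_best = heapq.nlargest(n, all_assignments, key=local_total)
    # (2) Add the global score of each candidate, (3) select the argmax.
    return max(n_best, key=lambda a: local_total(a) + global_score(a))


# Hypothetical local scores for two arguments.
local = [{"A0": 1.0, "A1": 0.2}, {"A0": 0.6, "A1": 0.5}]


def dup_penalty(assignment):
    """Hypothetical global feature: penalize a role that is used twice."""
    return -1.0 if len(set(assignment)) < len(assignment) else 0.0


print(infer(local, dup_penalty, n=3))  # global score changes the winner
print(infer(local, dup_penalty, n=1))  # only the local best is considered
```

With `n=1` the global score cannot change the outcome, which shows why the quality of the N-best list matters for this style of inference.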

Page 18

Inference and Learning Algorithm of the Proposed Model

Inference: generate N-best assignments for each predicate sense

Learning: the online Passive-Aggressive Algorithm [Crammer 2006]
• The parameters are trained by solving the optimization problem used in PA with the two margin constraints: All (local + global) and Local only

[Figure: two panels, (1) All (local + global) and (2) Local only, each showing the margin between the positive assignment and the other assignments]
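One possible reading of the learning step, sketched with a PA-I style update on sparse feature dictionaries. The function names, the `cost` arguments, and the feature names are hypothetical, and the exact optimization problem solved by the authors may differ. The sketch only follows the ordering stated on the previous slide: enforce the All constraint first, and enforce the Local only constraint when All is already satisfied.

```python
def pa_step(w, gold_feats, rival_feats, cost, C=1.0):
    """One PA-I update; returns True if the margin constraint was violated."""
    diff = dict(gold_feats)
    for name, value in rival_feats.items():
        diff[name] = diff.get(name, 0.0) - value
    margin = sum(w.get(name, 0.0) * value for name, value in diff.items())
    loss = max(0.0, cost - margin)
    norm_sq = sum(value * value for value in diff.values())
    if loss == 0.0 or norm_sq == 0.0:
        return False
    tau = min(C, loss / norm_sq)  # PA-I step size, capped by C
    for name, value in diff.items():
        w[name] = w.get(name, 0.0) + tau * value
    return True


def two_constraint_update(w, gold, rival_all, rival_local, cost_all, cost_local):
    """gold and rivals are (local_feats, global_feats) pairs of dicts."""
    merge = lambda pair: {**pair[0], **pair[1]}
    # (1) All: margin measured with local and global features together.
    if pa_step(w, merge(gold), merge(rival_all), cost_all):
        return "all"
    # (2) Local only: checked only when the All constraint is satisfied.
    if pa_step(w, gold[0], rival_local[0], cost_local):
        return "local"
    return "none"


# Hypothetical features: "a"/"b" are local, "g"/"g2" are global.
w = {}
gold = ({"a": 1.0}, {"g": 1.0})
rival = ({"b": 1.0}, {"g2": 1.0})
steps = [two_constraint_update(w, gold, rival, rival, 1.0, 1.0) for _ in range(3)]
print(steps, w)
```

In this toy run the first step repairs the All constraint, the second repairs the Local only constraint, and the third finds both satisfied.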

Page 19

Results on the CoNLL-2009 ST Dataset (average)

System        Feature selection   Overall (Sem. F1)   WSD (Acc.)   SRL (Lab. F1)
FP+FA         no                  79.17               89.65        72.20
FP+FA+FPA     no                  79.58               89.78        72.74
FP+FA+FG      no                  80.42               89.83        74.11
ALL           no                  80.75               90.15        74.46
Björkelund    yes                 80.80
Zhao          yes                 80.47
Meza-Ruiz     no                  77.46

[Figure: factor graph with a sense node (FP), role nodes role1, role2, …, roleN (FA), and the factors FPA and FG]

The best performance is obtained by using all the factors

Our model achieved results competitive with the top system in the CoNLL-2009 Shared Task without any feature selection procedure

Page 20

Results on the CoNLL-2009 ST Dataset (average)

By adding the two types of factors FPA and FG, we obtained performance improvements in both tasks (predicate sense disambiguation and argument role labeling)

=> Succeeded in joint learning
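The size of these improvements can be read off the results table. The short sketch below recomputes the gains over the FP+FA baseline from the reported figures; the `RESULTS` layout and the `gains` helper are only illustrative.

```python
# Reported scores: (Overall Sem. F1, WSD Acc., SRL Lab. F1)
RESULTS = {
    "FP+FA": (79.17, 89.65, 72.20),
    "FP+FA+FPA": (79.58, 89.78, 72.74),
    "FP+FA+FG": (80.42, 89.83, 74.11),
    "ALL": (80.75, 90.15, 74.46),
}


def gains(system, baseline="FP+FA"):
    """Return the per-metric difference between a system and the baseline."""
    return tuple(
        round(s - b, 2) for s, b in zip(RESULTS[system], RESULTS[baseline])
    )


for name in ("FP+FA+FPA", "FP+FA+FG", "ALL"):
    print(name, gains(name))
```

Using all factors gives +1.58 overall Sem. F1, +0.50 WSD accuracy, and +2.26 SRL labeled F1 over FP+FA.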

Page 21

Summary

We proposed a structured model that can capture two types of dependencies
(1) Non-local dependencies among arguments
(2) Inter-dependencies between a predicate and its arguments

The proposed model achieved results competitive with state-of-the-art SRL systems without any feature selection procedure

By adding two types of factors, we obtained performance improvements on both predicate sense disambiguation and argument role labeling
=> succeeded in joint learning

Future Work
– exploiting unlabeled data (unsupervised or semi-supervised predicate-argument structure analysis)