INSTITUTE OF COMPUTING TECHNOLOGY Bagging-based System Combination for Domain Adaptation Linfeng...

Bagging-based System Combination for Domain

Adaptation

Linfeng Song, Haitao Mi, Yajuan Lü and Qun Liu

Institute of Computing Technology

Chinese Academy of Sciences

INSTITUTE OF COMPUTING TECHNOLOGY

An Example

Initial MT system

An Example

Development setA:90% B:10%

Initial MT system Tuned MT system that fits domain A

The translation styles of A and B

are quite different

An Example

Test setA:10% B:90%

An Example

Test setA:10% B:90%

The translation style fits A, but we mainly want to translate B

Traditional Methods

Monolingual data with domain annotation

Traditional Methods

Monolingual data with domain annotation

Domain recognizer

Traditional Methods

Bilingual training data

Traditional Methods

Domain recognizer

training data : domain A

training data : domain B

Traditional Methods

Domain recognizer

training data : domain A

training data : domain B

MT system domain A

MT system domain B

Traditional Methods

Test set

Traditional Methods

Domain recognizer

Test set

Test set domain A

Test set domain B

Traditional Methods

The translation result

MT system domain A

MT system domain B

Test set domain A

Test set domain B

The translation result domain A

The translation result domain B

The merits

Simple and effective

Fits Human’s intuition

The drawbacks

Classification Error (CE) Especially for unsupervised methods

Supervised methods can make CE low, yet requiring annotation data limits its usage

Our motivation

Jump out of the alley of doing adaptation directly

Statistics methods (such as Bagging) can help.

The general framework of Bagging

Preliminary

General framework of Bagging

Training set D

Training set D1 Training set D2 Training set D3 ……

C2 C3 ……

C1 C2 C3 ……

Test sample

C1 C2 C3 ……

Test sample

Result of C1 Result of C2 Result of C3 ……

Voting result

Our method

Training

A,A,A,B,B

Suppose there is a development set

For simplicity, there are only 5 sentences, 3 belong A, 2 belong B

Training

A,A,A,B,B

A,B,B,B,B

A,A,B,B,B

A,A,A,B,B

A,A,A,A,B

……

We bootstrap N new development

Training

A,A,A,B,B

A,B,B,B,B

A,A,B,B,B

A,A,A,B,B

A,A,A,A,B

MT system-1

……

MT system-2

MT system-3

MT system-4

MT system-5

……

For each set, a subsystem is tuned

Decoding For simplicity, Suppose only 2 subsystem has

been tuned

Subsystem-1W:<-0.8,0.2>

Decoding

Now a sentence “A B” needs a translation

Decoding

a b; <0.2, 0.2>a c; <0.2, 0.3>

a b; <0.2, 0.2>a b; <0.1, 0.3>a d; <0.3, 0.4>

After translation, each system generate its N-

best candidate

Decoding

a b; <0.1, 0.2>a b; <0.1, 0.3>a c; <0.2, 0.3>a d; <0.3, 0.4>

Fuse these N-best lists and eliminate deductions

a b; <0.2, 0.2>a c; <0.2, 0.3>

a b; <0.2, 0.2>a b; <0.1, 0.3>a d; <0.3, 0.4>

Decoding

a b; <0.1, 0.2>a b; <0.1, 0.3>a c; <0.2, 0.3>a d; <0.3, 0.4>

a b; <0.2, 0.2>a c; <0.2, 0.3>

a b; <0.2, 0.2>a b; <0.1, 0.3>a d; <0.3, 0.4>

Candidates are identical only if their target strings

and feature values are entirely equal

Decoding

Calculate the voting score

a b; <0.2, 0.2>a b; <0.1, 0.3>a c; <0.2, 0.3>a d; <0.3, 0.4>

ttcfeatcscorefinal

a b; <0.2, 0.2>; -0.16a b; <0.1, 0.3>; +0.04a c; <0.2, 0.3>; -0.1a d; <0.3, 0.4>; -0.18

S represent the number of subsystems

Decoding

The one with the highest score

a b; <0.2, 0.2>a b; <0.1, 0.3>a c; <0.2, 0.3>a d; <0.3, 0.4>

a b; <0.2, 0.2>; -0.16a b; <0.1, 0.3>; +0.04a c; <0.2, 0.3>; -0.1a d; <0.3, 0.4>; -0.18

ttcfeatcscorefinal

Decoding

The one with the highest score

a b; <0.2, 0.2>a b; <0.1, 0.3>a c; <0.2, 0.3>a d; <0.3, 0.4>

a b; <0.2, 0.2>; -0.16a b; <0.1, 0.3>; +0.04a c; <0.2, 0.3>; -0.1a d; <0.3, 0.4>; -0.18

Since subsystems are different copies of the same model and share unique training

data, calibration is unnecessary

ttcfeatcscorefinal

Experiments

Basic Setups

Data: NTCIR9 Chinese-English patent corpus 1k sentence pairs as development set Another 1k pairs as test set The remains are used for training

System: hierarchical phrase based model

Alignment: GIZA++ grow-diag-final

Effectiveness : Show and Prove

Tune 30 subsystems using Bagging

Tune 30 subsystems with random initial weight

Evaluate the fusion results of the first N (N=5,10, 15, 20, 30) subsystems of both and compare

Results: 1-best

1 5 10 15 20 3031.00

31.7331.8

31.08 31.11 31.1331.17

31.23 31.2

baggingrandom

Number of subsystem

Results: 1-best

1 5 10 15 20 3031.00

31.7331.8

31.08 31.11 31.1331.17

31.23 31.2

baggingrandom

Number of subsystem

Results: Oracle

1 5 10 15 20 3036.00

42.2742.52 42.74 42.96

38.3538.67 38.82 39.04 39.25

baggingrandom

Number of subsystem

Results: Oracle

1 5 10 15 20 3036.00

42.2742.52 42.74 42.96

38.3538.67 38.82 39.04 39.25

baggingrandom

Number of subsystem

Compare with traditional methods

Evaluate a supervised method For tackling data sparsity only operate on

development set and test set

Evaluate a unsupervised method Similar to Yamada (2007) To avoid data sparsity, only LM specific

Results

baseline bagging supervise unsupervise31.00

1-best

Conclusions

Propose a bagging-based method to address multi-domain translation problem.

Experiments shows that: Bagging is effective for domain adaptation

problem Our method surpass baseline explicitly, and is

even better than some traditional methods.

Thank you for listeningAnd any questions?

INSTITUTE OF COMPUTING TECHNOLOGY Bagging-based System Combination for Domain Adaptation Linfeng...

Documents

Transcript of INSTITUTE OF COMPUTING TECHNOLOGY Bagging-based System Combination for Domain Adaptation Linfeng...

On Design of an Effective AI Agent for StarCraft · 2014-11-04 · Poznan University of Technology Faculty of Computing Institute of Computing Science On Design of an Effective AI

Cloud Computing woensdag 20 november 2013. Wat is Cloud Computing ? Cloud Computing.

Computing for the Masses 为人民计算 Zhiwei Xu 徐志伟 Information Science Advisory Committee, NSFC Institute of Computing Technology (ICT) Chinese Academy of Sciences.

Department of Electrical Engineering Indian Institute of ...home.iitk.ac.in/~ynsingh/seminars/MobComp.pdf · EE/ACES, IIT Kanpur ... Issues in mobile computing networks • Actual

Daniel A. G. Manzato and Nelson L. S. da Fonseca Institute of Computing, State University of Campinas Campinas, Brazil speaker: 吳麟佑.

Building a Fault-Aware Computing Environment for High End Computing Zhiling Lan Illinois Institute of Technology (Dept. of CS) In collaboration with Xian-He.

แนวทางการให้บริการ Cloud Computing (Cloud Service Provider ... · นิยาม Cloud Computing National Institute of Standard and Technology

Master Thesis v101 - Institute for Computing and ... · Master ThesisMaster Thesis “Risico-analyse van het Landelijk Schakelpunt, hart van de informatievoorziening in de zorg.”

Rekenen - Institute for Computing and Information Sciences · 2017-08-30 · Rekenen noodzakelijke voorkennis voor alle -studenten 1. natuurlijke getallen 2 2. rationale getallen

Software Engineering Issues for Ubiquitous Computing Author: Gregory D. Abowd, Georgia Institute of Technology CSCI 599 Week 4 Paper 3 September 18 2001.

Computing Periods 2018 - Junhee Cho · 2019. 1. 13. · Computing Periods Junhee Cho Advisor: Martin Ziegler A dissertation submitted to the faculty of Korea Advanced Institute of

1. Mobile Computing - Introduction Mobile Computing Lecture

INDEX Cloud Computing CLOUD COMPUTING

Cloud-Sicherheit: Schutzmaßnahmen unter der Lupe · DESffSANS INSTITUTE 1 Cloud-Sicherheit: Schutzmaßnahmen unter der Lupe Kurzfassung Die Nutzung von Cloud Computing-Services nimmt

Cloud computing, Grid Computing, Virtualization

Photodissociation and Photoionization Mechanisms in Lanthanide-based Fluorinated β-diketonate MOCVD Precursors Jiangchao CHEN, Robert J. WITTE, Yajuan.

Open Research Challenges in Service-Oriented Computing Schahram Dustdar Distributed Systems Group Institute of Information Systems TU Wien

Review and Analysis of China Workshop on Machine Translation …nlp.ict.ac.cn/~hengyu/papers/amta.pdf · Institute of Computing Technology, Chinese Academy of Sciences 2 University

Ninghui Sun Institute of Computing Technology Chinese Academy of Sciences May 18th, 2006. Beijing, ITER Supercomputer in China: Dawning’s Experience.

Future of Private Banking - CFA Institute · Nieuwe technologie voor de fin. markt High Performance Computing AI / Machine Learning ... Omnichannel insight, advice & guidance digital