Transition-Based Dependency Parsing with Stack Long Short-Term Memory

Chris Dyer, Miguel Ballesteros, Wang Ling, Austin Matthews, Noah A. Smith

Presenter: Hayashi (Tokyo Institute of Technology)

2015/08/24 ACL Reading Group @ Suzukake

Transition-Based Dependency Parsing with Stack Long Short-Term Memory

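• Dependency parsing
• This talk is about the classifier inside a transition-based parser
  – Extending the neural network classifier to an LSTM improved accuracy
• Key points
  1. Type of dependency parsing: Transition-Based Parsing
  2. Model: with Stack Long Short-Term Memory
  3. Input: Composition Functions
  – Baseline
    • Classifier changed from an SVM to a NN [Chen & Manning 2014]
      – Implemented in the Stanford Parser
      – http://cs.stanford.edu/~danqi/papers/emnlp2014_slides.pdf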

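Transition-based parsing

• Push words onto a stack and let a classifier decide the next action
  – This work uses Arc-Standard [Nivre 2004]
  – The parser state is a triple: stack, buffer, action history
  – Shift: move a word from the buffer onto the stack
  – Arc: take words off the stack and draw an arc between them
    • Left-Arc, Right-Arc (whether the dependency points left or right) + the relation between head and dependent
  – See other slides for details
    • Noji's slides "Recent Transition-Based Parsing" (最近のTransition based parsing) http://www.slideshare.net/nozyh/transition-based-parsing
    • Slides for the baseline http://cs.stanford.edu/~danqi/papers/emnlp2014_slides.pdf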
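The Shift and Arc actions can be sketched as operations on the (stack, buffer, action history) triple. This is a minimal illustration, not the parser from the paper: the class and function names are made up for this sketch, the "det" label is an assumed relation name, and no check is made that an action is valid in the current state.

```python
from dataclasses import dataclass, field


@dataclass
class ParserState:
    """Arc-standard configuration: the (stack, buffer, action history) triple."""
    stack: list = field(default_factory=list)
    buffer: list = field(default_factory=list)
    history: list = field(default_factory=list)
    arcs: list = field(default_factory=list)  # (head, relation, dependent)


def shift(state):
    """Move the front word of the buffer onto the stack."""
    state.stack.append(state.buffer.pop(0))
    state.history.append("SHIFT")


def left_arc(state, relation):
    """The top of the stack becomes the head of the item below it."""
    head = state.stack.pop()
    dependent = state.stack.pop()
    state.stack.append(head)
    state.arcs.append((head, relation, dependent))
    state.history.append(f"LEFT-ARC({relation})")


def right_arc(state, relation):
    """The item below the top becomes the head of the top of the stack."""
    dependent = state.stack.pop()
    head = state.stack[-1]
    state.arcs.append((head, relation, dependent))
    state.history.append(f"RIGHT-ARC({relation})")


def run_example():
    state = ParserState(buffer=["an", "overhasty", "decision"])
    shift(state)             # stack: [an]
    shift(state)             # stack: [an, overhasty]
    shift(state)             # stack: [an, overhasty, decision]
    left_arc(state, "amod")  # overhasty attaches to decision
    left_arc(state, "det")   # an attaches to decision ("det" is an assumed label)
    return state
```

Calling `run_example()` leaves only "decision" on the stack, with both arcs recorded in `arcs` and all five actions in `history`.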

Model overview

1. Stack Long Short-Term Memory

[Figure: model overview with the stack, the buffer, and the action history]

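Idea

• Take the parser's entire current state into account
  1. Information on every word in the stack and in the buffer
  2. The entire action history
    – Stack Long Short-Term Memories
      • State of the stack
      • State of the buffer
      • State of the action history
    – Represent each of these states as a vector
  3. Every dependency that exists inside the stack
    – Composition Functions: covered later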

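Long Short-Term Memories

• A kind of Recurrent Neural Network (RNN)
  – Characterized by hidden-to-hidden connections (the past history is also used)
  – A plain RNN can fail to train well when the sequence gets long
  – An LSTM replaces the hidden units of the RNN

From Tokui's (PFI) slides: http://www.slideshare.net/beam2d/pfi-seminar-20141030rnn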

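Unrolling the LSTM

• It can be viewed statically
  – Output n reflects inputs 1…n-1 as well as input n

[Figure: LSTM unrolled over time, with inputs 1, 2, 3, …, n and outputs 1, 2, 3, …, n]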
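The unrolled view can be made concrete with a toy scalar LSTM cell. This is a sketch under assumptions: it uses the standard gate equations without peephole connections, which may differ in detail from the variant in the paper, and the weights are arbitrary illustrative numbers, not trained values.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, w):
    """One step of a scalar LSTM cell (standard gates, no peepholes)."""
    i = sigmoid(w["ix"] * x + w["ih"] * h_prev + w["ib"])    # input gate
    f = sigmoid(w["fx"] * x + w["fh"] * h_prev + w["fb"])    # forget gate
    o = sigmoid(w["ox"] * x + w["oh"] * h_prev + w["ob"])    # output gate
    g = math.tanh(w["gx"] * x + w["gh"] * h_prev + w["gb"])  # candidate cell value
    c = f * c_prev + i * g   # new memory cell
    h = o * math.tanh(c)     # new output
    return h, c


def unroll(inputs, w):
    """Run the cell over the whole sequence and return every output."""
    h, c = 0.0, 0.0
    outputs = []
    for x in inputs:
        h, c = lstm_step(x, h, c, w)
        outputs.append(h)
    return outputs


# Arbitrary illustrative weights (hypothetical, not trained values).
WEIGHTS = {"ix": 0.5, "ih": 0.1, "ib": 0.0,
           "fx": 0.3, "fh": 0.2, "fb": 1.0,
           "ox": 0.4, "oh": 0.1, "ob": 0.0,
           "gx": 0.9, "gh": 0.3, "gb": 0.0}
```

Changing only the first input changes the last output of `unroll`, because the earlier inputs reach it through `h` and `c`.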

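Unrolled LSTM & the figure from the paper

[Figure: the unrolled LSTM (inputs 1…n, outputs 1…n) shown next to the model figure from the paper]

• The two look somewhat alike
• In other words, the states of the stack, the buffer, and the action history are each represented with an LSTM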

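1. Model details

Representing the state with LSTMs

Differences from a conventional LSTM (RNN)

[Figure labels: pop "Control"; push "Control"]

• Operations inside the stack and buffer (Shift example)
  – Operations on the stack and the buffer are themselves pops and pushes
  – The action history only ever receives pushes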

When a word is popped

• The state inside the stack / buffer changes
  – The history through the popped word can no longer be followed
    • The output layer to use has to be indicated explicitly: the stack pointer

POP: What is the next input? The output? The history so far?

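• Where the output is read from depends on the stack / buffer operation
  – Example 1: word x1 is popped
    • Move the pointer to y0, which does not take x1 into account

[Figure: on Pop, the stack pointer moves from y1 back to y0]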

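• Example 2: word x2 is pushed
  – The LSTM state for x1 is ignored; the pointer moves to y2, which is computed from the state that reflects x0 (connected through the hidden layer)
  – In this way the LSTM itself gets stacked
    • Stack-LSTM

[Figure: on Push, the new state branches from y0 and the stack pointer moves to it]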
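The stack-pointer mechanics can be sketched as a small data structure. This is an illustration only: `StackRNN` and `toy_cell` are names made up here, and the cell is a stand-in that just records which inputs a state depends on, where a real model would run an LSTM step.

```python
class StackRNN:
    """Stack-structured recurrent network: computed states are never erased,
    and a stack pointer selects the state the next push continues from."""

    def __init__(self, cell, initial_state):
        self.cell = cell               # function: (input, previous state) -> new state
        self.states = [initial_state]  # every state ever computed
        self.parents = [None]          # index of the state each one was computed from
        self.pointer = 0               # the stack pointer

    def push(self, x):
        """Compute a new state from the one under the pointer and point at it."""
        new_state = self.cell(x, self.states[self.pointer])
        self.states.append(new_state)
        self.parents.append(self.pointer)
        self.pointer = len(self.states) - 1

    def pop(self):
        """Move the pointer back to the parent state; nothing is deleted."""
        if self.parents[self.pointer] is None:
            raise IndexError("pop from empty stack")
        self.pointer = self.parents[self.pointer]

    def summary(self):
        """State under the stack pointer: the encoding of the current stack."""
        return self.states[self.pointer]


def toy_cell(x, previous):
    # Stand-in for an LSTM step: records which inputs the state depends on.
    return previous + (x,)
```

For example, pushing "an" and "overhasty", popping once, and then pushing "decision" gives `summary() == ("an", "decision")`: the new state continues from "an" and has no connection to "overhasty", whose state is still stored but no longer reachable from the pointer.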

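Stack pointer: summary

1. "an" and "overhasty" are held in the stack
2. "overhasty" is popped: stack pointer → output layer for "an"
3. "decision" is pushed: stack pointer → output layer for the pushed word ("decision")
  – There is no connection between "overhasty" and "decision"
  – Note: the operation is still stored in the action history as Left-arc(amod)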

2. Input

Input vectors for the buffer and the stack

• Combine three vectors
  – Word type (w)
    • Stanford Dependency treebank
  – Neural language model (W_LM)
    • Structured skip n-gram [Ling+ 2015]
    • Gigaword corpus
  – POS tag (t)
    • POS tagger
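One way to combine the three vectors is to concatenate them and pass the result through an affine map and a ReLU. The slide only says the vectors are combined, so the exact form below, the parameter names `V` and `b`, and the toy numbers are assumptions of this sketch.

```python
def relu(vector):
    return [max(0.0, v) for v in vector]


def affine(matrix, vector, bias):
    """Compute matrix * vector + bias with plain lists."""
    return [sum(m * v for m, v in zip(row, vector)) + b
            for row, b in zip(matrix, bias)]


def token_input(w, w_lm, t, V, b):
    """Concatenate the three embeddings, then apply an affine map and a ReLU."""
    concatenated = w + w_lm + t  # list concatenation: [w; w_lm; t]
    return relu(affine(V, concatenated, b))


# Toy numbers (hypothetical): 1-dim word type, 2-dim LM embedding, 1-dim POS tag.
w = [1.0]
w_lm = [2.0, -1.0]
t = [0.5]
V = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 1.0, 0.0],
     [0.0, 0.0, 0.0, -2.0]]
b = [0.0, 0.0, 0.5]
x = token_input(w, w_lm, t, V, b)  # [1.0, 1.0, 0.0]
```

The last component is clipped to 0.0 by the ReLU because its pre-activation value is -0.5.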

Out-of-vocabulary words

• Some words appear in the neural language model's data but never in the parser's training data
  – How they are handled (UNK)
  – During training, each singleton word type is stochastically (p = 0.5) replaced with the UNK token at every iteration
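The stochastic handling of singletons can be sketched as follows, assuming the replacement target is a single unknown-word token. The `<UNK>` string and the function names are hypothetical choices for this illustration.

```python
import random
from collections import Counter

UNK = "<UNK>"  # hypothetical spelling of the unknown-word token


def find_singletons(sentences):
    """Word types that occur exactly once in the training sentences."""
    counts = Counter(word for sentence in sentences for word in sentence)
    return {word for word, n in counts.items() if n == 1}


def replace_singletons(sentence, singletons, p=0.5, rng=random):
    """Copy the sentence, swapping each singleton for UNK with probability p."""
    return [UNK if word in singletons and rng.random() < p else word
            for word in sentence]


train = [["an", "overhasty", "decision"],
         ["an", "easy", "decision"]]
singletons = find_singletons(train)  # {"overhasty", "easy"}

# Re-sample the replacements at every training iteration.
for iteration in range(3):
    noisy = [replace_singletons(sentence, singletons) for sentence in train]
```

Because the replacement is re-drawn each iteration, the model sees every singleton both as itself and as UNK over the course of training, so the UNK vector gets trained.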

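Composition Functions

• Turn the subtree produced by an action into a vector
  – Based on the idea of Recursive Neural Networks [Socher 2014]
  – Convert the subtree into a (head, modifier, relation) triple
  – Build the vector bottom-up from these triples
    • Used as input to the action-history S-LSTM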
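The bottom-up construction can be sketched with a single composition step that is applied repeatedly. The form c = tanh(U[h; d; r] + e) is assumed here, and the parameter names `U` and `e`, the embeddings, and the "det" label are illustrative values only.

```python
import math


def compose(head, modifier, relation, U, e):
    """Return c = tanh(U [head; modifier; relation] + e) as a list."""
    concatenated = head + modifier + relation
    return [math.tanh(sum(u * v for u, v in zip(row, concatenated)) + bias)
            for row, bias in zip(U, e)]


# Toy 2-dimensional embeddings (hypothetical values).
decision = [1.0, 0.0]
overhasty = [0.0, 1.0]
an = [1.0, 1.0]
amod = [1.0, 0.0]
det = [0.0, 1.0]   # "det" is an assumed relation label

U = [[0.5, 0.0, 0.5, 0.0, 0.1, 0.0],
     [0.0, 0.5, 0.0, 0.5, 0.0, 0.1]]
e = [0.0, 0.0]

# Bottom-up: first attach "overhasty" to "decision", then attach "an"
# to the vector that already represents that subtree.
subtree = compose(decision, overhasty, amod, U, e)
phrase = compose(subtree, an, det, U, e)
```

Since the output has the same dimension as a word vector, a composed subtree can act as the head (or modifier) of the next composition.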

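3. Output

• (Broadly) a three-layer neural network
  – Layer 1: S-LSTM outputs (Stack: S, Buffer: b, Action: a)
  – Layer 2: hidden layer (ReLU)
  – Layer 3: softmax (over actions)
• What we want: the action sequence
• Trained with backpropagation so that [equation: probability of the action sequence] is maximized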
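The three layers can be sketched as one function from the three S-LSTM outputs to a distribution over actions. All parameter names (`W`, `d`, `G`, `q`) and the toy numbers are assumptions of this sketch, and for brevity it does not mask out actions that are invalid in the current parser state.

```python
import math


def relu_layer(W, d, vector):
    """Hidden layer: ReLU(W * vector + d)."""
    return [max(0.0, sum(w * v for w, v in zip(row, vector)) + bias)
            for row, bias in zip(W, d)]


def softmax(scores):
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [x / total for x in exps]


def action_distribution(s, b, a, W, d, G, q, actions):
    """Layer 1: S-LSTM outputs s, b, a. Layer 2: ReLU. Layer 3: softmax."""
    hidden = relu_layer(W, d, s + b + a)
    scores = [sum(g * h for g, h in zip(row, hidden)) + bias
              for row, bias in zip(G, q)]
    return dict(zip(actions, softmax(scores)))


def sequence_log_likelihood(step_distributions, gold_actions):
    """Training objective: log-probability of the gold action sequence."""
    return sum(math.log(dist[action])
               for dist, action in zip(step_distributions, gold_actions))


# Toy parameters (hypothetical): 1-dim summaries, 2 hidden units, 3 actions.
ACTIONS = ["SHIFT", "LEFT-ARC", "RIGHT-ARC"]
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
d = [0.0, 0.0]
G = [[2.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
q = [0.0, 0.0, 0.0]
dist = action_distribution([1.0], [0.0], [0.0], W, d, G, q, ACTIONS)
```

With these numbers SHIFT gets the highest probability; training would adjust the parameters by backpropagation to raise `sequence_log_likelihood` on the gold action sequences.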

Experimental setup

• NN dimensions and hyperparameters
  – Chosen by intuition; tuning them is left as future work
• Data
  – English
    • Stanford Dependency treebank
    • Stanford Tagger
    • English Gigaword corpus
  – Chinese
    • Penn Chinese Treebank
    • POS tags provided with the Penn Chinese Treebank
    • Chinese Gigaword corpus

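Results

• Results look good for both languages; adding beam search brings little further gain
• Ablation: remove one feature at a time and compare
  – Proposed method [S-LSTM]
  – - POS
  – - Pretraining
  – - Composition Functions
  – S-RNN [LSTM replaced with an RNN]
  – Baseline [Chen & Manning]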

Summary

• Outperforms existing dependency parsing methods
  1. Type of dependency parsing: Transition-Based Parsing
  2. Model: with Stack Long Short-Term Memory
  3. Input: Token Embedding & OOVs, Composition Functions
• Future work
  – Various extensions
    • Unsupervised dependency parsing
    • Neural Turing Machine
