Deep Learning intro. - Kangwon
cs.kangwon.ac.kr/.../12_deeplearning_intro.pdf · 2016-06-17

Transcript of "Deep Learning intro." (31 pages)

Page 1:

π‘ π‘–π‘”π‘šπ‘Ž 𝜢

Deep Learning intro.

π‘ π‘–π‘”π‘šπ‘Ž 𝜢

2016.01.02.

Page 2:

Outline

Natural Language Processing (NLP)

Representation and Processing

Deep Learning Models

Page 3:

Natural Language Processing

Page 4:

Natural Language Processing (NLP)

• Answering
• Search
• Reasoning
• Dialogue

Language understanding / Language generation / Applications
• Intelligent robots
• Information retrieval
• Machine translation
• Document summarization
• Questions
• Word understanding
• Semantic understanding
• Intent recognition

Page 5:

Representation and Processing

Page 6:

Representation in mathematics

[Figure: real-world objects (images) are each mapped to a point in a vector space,
e.g. <0.156, 0.421, 0.954, …>, <0.096, 0.510, 0.991, …>, <…, 7.451, 21.45, 8.999>]

Real World → Vector Space

https://www.google.com/imghp?hl=ko

Page 7:

Duck vs. rabbit

Page 8:

Camouflage

Page 9:

Neural Network in Humans

https://uncyclopedia.kr/wiki/%EB%87%8C

Neural network
• Pattern recognition
• Multi-layer
• Human: 10 layers

"I see a lion"

Page 10:

Neural Network

Vector representation

Pattern of layers

+ Learning

Page 11:

Pattern of layers

Deep learning: automatic pattern combination
Why do we say "deep"?

[Figure: a stack of fully connected layers with n units per layer and m layers]
• Connection links: (n × n) × (m − 1)
• Automatic combination
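The link count above can be checked with a minimal numpy sketch; the width n = 4, depth m = 3, and random weights are illustrative assumptions, not values from the slides.

```python
import numpy as np

n, m = 4, 3                                # assumed: n units per layer, m layers
rng = np.random.default_rng(0)

# One fully connected (n x n) weight matrix between each adjacent pair of layers,
# so there are m - 1 of them.
weights = [rng.normal(size=(n, n)) for _ in range(m - 1)]

total_links = sum(W.size for W in weights)
print(total_links, (n * n) * (m - 1))      # 32 32: (n x n) x (m - 1) connection links

def forward(x, weights):
    """Pass a vector through the stack, applying a non-linearity at every layer."""
    h = x
    for W in weights:
        h = np.tanh(W @ h)
    return h

print(forward(rng.normal(size=n), weights))
```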

Page 12:

How to use layers?

• Input: a vector
• Output: a real number or a class (vector)
• Vector representation: "one-hot"

Page 13:

Vector representation

[Symbol] → [Text representation] → [One-hot representation] → [Symbol representation]
(lion image) → "Lion" → <0, 0, 0, 0, 0, 1, 0, 0, 0, 0, …> → <1.45, 75.12, 0.425, 0.953, …>
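A minimal sketch of the one-hot step, assuming a small made-up vocabulary in which "lion" happens to sit at index 5, so its one-hot vector has a single 1 at that position:

```python
import numpy as np

# Hypothetical vocabulary; the index of each word is arbitrary but fixed.
vocab = ["dog", "cat", "wolf", "mouse", "tiger", "lion"]

def one_hot(word):
    """Symbolic word -> one-hot vector: a single 1 at the word's index."""
    v = np.zeros(len(vocab))
    v[vocab.index(word)] = 1.0
    return v

print(one_hot("lion"))   # [0. 0. 0. 0. 0. 1.]
```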

Page 14:

Jung, DEEP LEARNING FOR KOREAN NLP

Page 15:

How to map a symbol to a one-hot vector

[Symbolic words]   [One-hot]
Lion               <0, 0, 1, 0, 0>
Big cat            <0, 1, 0, 0, 1>

With an AND operation, the two words do not match at all.
∴ we need a symbolic vector representation

Page 16:

How to map a symbol to a one-hot vector

Words: Lion, Big cat, Tiger, Dog, Wolf, Mouse

[One-hot]            [Symbolic vectors] (from an NNLM)
<0, 0, 1, 0, 0>      <1.45, 75.12, 0.425, 0.953, …>
<0, 1, 0, 0, 1>      <1.78, 61.11, 0.611, 2.011, …>

∴ [Symbolic representation]: use cosine similarity
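A minimal sketch of the comparison, with made-up dense vectors; the numbers below are illustrative, not the output of a trained NNLM.

```python
import numpy as np

def cosine_similarity(a, b):
    """cos(a, b) = a.b / (|a| |b|)"""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical dense word vectors (values are illustrative only).
lion    = np.array([1.45, 75.12, 0.425, 0.953])
big_cat = np.array([1.78, 61.11, 0.611, 2.011])
mouse   = np.array([9.02,  3.41, 7.850, 0.120])

print(cosine_similarity(lion, big_cat))   # close to 1: similar words
print(cosine_similarity(lion, mouse))     # much smaller: less similar

# One-hot vectors of two different words always give similarity 0.
print(cosine_similarity(np.array([0.0, 0, 1, 0, 0]), np.array([0.0, 1, 0, 0, 1])))
```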

Page 17:

Neural Network Language Model

Feed-forward NN: a parametric estimator, overall parameter set θ = (C, w)

• One-hot representation: [0 1 0 0 0 0 0 0 0 0]
• Lookup table: word embedding
• Non-linear projection: activation function
• Normalized weights: softmax (length: n)
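A minimal numpy sketch of this forward pass, in the spirit of a Bengio-style NNLM but without the direct input-to-output connection W; all sizes and the random initialization are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
V, m, h, n = 10, 5, 8, 3        # assumed: vocab size, embedding dim, hidden units, context length

# Parameter names follow the slide on page 18: C is the lookup table,
# H maps the concatenated context embeddings to the hidden layer, U maps hidden to output.
C = rng.normal(size=(V, m))
H = rng.normal(size=(h, n * m)); d = np.zeros(h)
U = rng.normal(size=(V, h));     b = np.zeros(V)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def nnlm_forward(context_word_ids):
    """P(next word | previous n words): lookup -> tanh projection -> softmax."""
    x = np.concatenate([C[i] for i in context_word_ids])  # lookup table (word embeddings)
    a = np.tanh(d + H @ x)                                # non-linear projection
    return softmax(b + U @ a)                             # normalized over the vocabulary

p = nnlm_forward([1, 4, 7])     # probabilities for all V words (untrained, so arbitrary)
print(p.shape, p.sum())         # (10,) 1.0 (up to float error)
```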

Page 18:

Neural Network Language Model

Training: choose θ to maximize the log-likelihood

    L = \max_{\theta} \frac{1}{T} \sum_{t} \log f(w_t, w_{t-1}, \ldots, w_{t-n+1})

Parameters
• h: the number of hidden units
• m: the number of features per word
• b: the output biases
• d: the hidden-layer biases
• U: hidden-to-output weights
• W: input-to-output weights
• H: input-to-hidden weights
• C: word features (lookup table)
• θ = (b, d, W, U, H, C)

Page 19:

NNLM for Korean

Leeck, Korean Dependency Parsing Using Deep Learning (딥러닝을 이용한 한국어 의존 구문 분석)

Page 20:

Deep Learning Models

Page 21:

Deep Learning Models

"강대주변에스타벅스위치가어디야?" ("Where is the Starbucks near Kangwon University?")
• 강대/NNG 주변/NNG 에/JX 스타벅스/NNG …

Feed-forward Neural Network (FFNN)

[Figure: an FFNN reads a window of morpheme/POS inputs W_t (강대/NNG, 주변/NNG, 에/JX, …)
and predicts an output Y; variants shown: 1-FFNN, 2-FFNN, 3-FFNN]
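A minimal sketch of such a window-based feed-forward tagger; the morpheme list, tag set, sizes, and the untrained random weights are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
morphs = ["강대", "주변", "에", "스타벅스", "위치", "가", "어디", "야"]
tags = ["B", "I", "O"]
emb_dim, window, hidden = 6, 3, 16

E = rng.normal(size=(len(morphs), emb_dim))                 # morpheme embeddings
W1 = rng.normal(size=(hidden, window * emb_dim)); b1 = np.zeros(hidden)
W2 = rng.normal(size=(len(tags), hidden));        b2 = np.zeros(len(tags))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def ffnn_tag(window_ids):
    """Predict the tag of the centre morpheme from a 3-morpheme window."""
    x = np.concatenate([E[i] for i in window_ids])          # concatenated window of embeddings
    h = np.tanh(W1 @ x + b1)                                # hidden layer
    return tags[int(np.argmax(softmax(W2 @ h + b2)))]

sent = [0, 1, 2, 3, 4]                                      # 강대 주변 에 스타벅스 위치
padded = [0] + sent + [0]                                   # naive padding: reuse id 0 at the edges
print([ffnn_tag(padded[i:i + window]) for i in range(len(sent))])  # untrained, so tags are arbitrary
```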

Page 22:

Deep Learning Models

"강대주변에스타벅스위치가어디야?"
• Y_text: [강대주변에스타벅스위치], [어디]
• Y_tags: [B I I I I], [B]

Recurrent Neural Network (RNN)

[Figure: an RNN, unfolded over time, reads each input W_t and emits a tag Y:
강대→B, 주변→I, 에→I, 스타벅스→I, 위치→I]
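A minimal sketch of the unfolded recurrence with a plain (Elman-style) RNN cell; dimensions and random weights are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
in_dim, hid_dim, n_tags = 6, 8, 3        # assumed sizes
Wx = rng.normal(size=(hid_dim, in_dim))
Wh = rng.normal(size=(hid_dim, hid_dim))
Wy = rng.normal(size=(n_tags, hid_dim))

def rnn_tag(xs):
    """Unfold over time: each hidden state depends on the current input and the previous state."""
    h = np.zeros(hid_dim)
    ys = []
    for x in xs:                           # one step per morpheme (강대, 주변, 에, ...)
        h = np.tanh(Wx @ x + Wh @ h)       # recurrent update
        ys.append(int(np.argmax(Wy @ h)))  # tag index (B/I/...) at this step
    return ys

sentence = [rng.normal(size=in_dim) for _ in range(5)]   # 5 morpheme embeddings
print(rnn_tag(sentence))                                  # untrained, so tags are arbitrary
```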

Page 23:

Deep Learning Models

"강대주변에스타벅스위치가어디야?"
• Y_text: [강대주변에스타벅스위치], [어디]
• Y_tags: [B I I I I], [B]

Long Short-Term Memory RNN (LSTM-RNN)
• Using gate matrices (LSTM or GRU)

[Figure: an LSTM-RNN, unfolded over time, tags the same sequence:
강대→B, 주변→I, 에→I, 스타벅스→I, 위치→I]
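A minimal sketch of the gating that an LSTM step adds on top of the plain RNN step above; dimensions and random weights are again assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W, U, b):
    """One LSTM step: input, forget and output gates control what the cell state keeps."""
    z = W @ x + U @ h + b                       # all four gate pre-activations at once
    i, f, o, g = np.split(z, 4)
    i, f, o, g = sigmoid(i), sigmoid(f), sigmoid(o), np.tanh(g)
    c_new = f * c + i * g                       # gated update of the cell state
    h_new = o * np.tanh(c_new)                  # gated output
    return h_new, c_new

rng = np.random.default_rng(0)
in_dim, hid_dim = 6, 8
W = rng.normal(size=(4 * hid_dim, in_dim))
U = rng.normal(size=(4 * hid_dim, hid_dim))
b = np.zeros(4 * hid_dim)

h = c = np.zeros(hid_dim)
for x in [rng.normal(size=in_dim) for _ in range(5)]:   # one step per morpheme
    h, c = lstm_step(x, h, c, W, U, b)
print(h)
```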

Page 24:

Deep Learning Models

"강대주변에스타벅스위치가어디야?"
• Y_text: [강대주변에스타벅스위치], [어디]
• Y_tags: [B I I I I], [B]

LSTM-RNN CRF
• Using gate matrices (LSTM or GRU)
• Viterbi or beam search over the output tags

[Figure: an LSTM-RNN with a CRF output layer tags the same sequence:
강대→B, 주변→I, 에→I, 스타벅스→I, 위치→I]
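A minimal sketch of Viterbi decoding over per-step tag scores with a CRF-style transition matrix; the emission and transition scores below are made up so that the best path comes out as B I I I I.

```python
import numpy as np

tags = ["B", "I"]

def viterbi(emissions, transitions):
    """Find the highest-scoring tag sequence given per-step emission scores
    and tag-to-tag transition scores (both in log space)."""
    T, K = emissions.shape
    score = emissions[0].copy()
    back = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        cand = score[:, None] + transitions + emissions[t]   # K x K: previous tag -> current tag
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0)
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):                            # follow the back-pointers
        path.append(int(back[t][path[-1]]))
    return [tags[k] for k in reversed(path)]

# Made-up emission scores for 5 steps (강대 주변 에 스타벅스 위치) over tags B/I,
# and a transition matrix that discourages I -> B.
emissions = np.array([[2.0, 0.1], [0.3, 1.5], [0.2, 1.8], [0.4, 1.2], [0.1, 1.9]])
transitions = np.array([[0.0, 0.5], [-1.0, 0.5]])
print(viterbi(emissions, transitions))   # ['B', 'I', 'I', 'I', 'I']
```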

Page 25:

Deep Learning Models

"강대주변에스타벅스위치가어디야?"
• Y_text: [강대주변에스타벅스위치], [어디]
• Y_tags: [B I I I I], [B]

Bidirectional LSTM-RNN CRF (Bi-LSTM-RNN CRF)
• Using gate matrices (LSTM or GRU)
• Viterbi or beam search over the output tags

[Figure: forward and backward passes over 강대, 주변, 에, 스타벅스, 위치,
combined and decoded into the tags B, I, I, I, I]
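A minimal sketch of the bidirectional part: one recurrence runs left-to-right, another right-to-left, and their hidden states are concatenated at every step before the CRF layer. Plain RNN cells stand in for LSTMs here; sizes and weights are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
in_dim, hid = 6, 8
Wf, Uf = rng.normal(size=(hid, in_dim)), rng.normal(size=(hid, hid))   # forward cell
Wb, Ub = rng.normal(size=(hid, in_dim)), rng.normal(size=(hid, hid))   # backward cell

def run(xs, W, U):
    h, out = np.zeros(hid), []
    for x in xs:
        h = np.tanh(W @ x + U @ h)
        out.append(h)
    return out

xs = [rng.normal(size=in_dim) for _ in range(5)]             # 강대 주변 에 스타벅스 위치
fwd = run(xs, Wf, Uf)
bwd = run(xs[::-1], Wb, Ub)[::-1]                             # right-to-left, then re-align
states = [np.concatenate([f, b]) for f, b in zip(fwd, bwd)]   # per-step inputs to the CRF layer
print(len(states), states[0].shape)                           # 5 (16,)
```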

Page 26:

Deep Learning Models

Sequence-to-sequence model
• Two different LSTMs: an input-sentence LSTM and an output-sentence LSTM
• Using a shallow LSTM
• Reversing the input sentence
• Training: decoding & rescoring
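A minimal sketch of the two-network idea: an encoder reads the reversed input sentence into a single vector, and a separate decoder generates the output from it. Plain RNN cells are used instead of LSTMs; vocabulary ids and weights are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
V, emb, hid = 8, 6, 10                                   # assumed vocab size and dims
E = rng.normal(size=(V, emb))                            # toy embeddings
We, Ue = rng.normal(size=(hid, emb)), rng.normal(size=(hid, hid))   # encoder RNN
Wd, Ud = rng.normal(size=(hid, emb)), rng.normal(size=(hid, hid))   # decoder RNN
Wy = rng.normal(size=(V, hid))

def encode(src_ids):
    h = np.zeros(hid)
    for i in reversed(src_ids):          # reverse the input sentence, as on the slide
        h = np.tanh(We @ E[i] + Ue @ h)
    return h

def decode(h, bos=0, eos=7, max_len=10):
    out, prev = [], bos
    while len(out) < max_len:
        h = np.tanh(Wd @ E[prev] + Ud @ h)
        prev = int(np.argmax(Wy @ h))    # greedy decoding (beam search is also possible)
        out.append(prev)
        if prev == eos:
            break
    return out

print(decode(encode([1, 2, 3, 4])))      # untrained, so the output ids are arbitrary
```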

Page 27:

Deep Learning Models

Encoder-Decoder Architecture

Page 28:

Deep Learning Models

Pointer Networks
• A deep learning model based on seq2seq and the attention mechanism
• A model whose outputs are positions (indices) in the input sequence
• X = {A:0, B:1, C:2, D:3, <EOS>:4}
• Y = {3, 2, 0, 4}

[Figure: encoding "A B C D <EOS>", then decoding "D C A <EOS>" by pointing to input positions]
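A minimal sketch of the pointing mechanism: at each decoding step, attention scores over the encoder states are used directly as the output distribution, so each prediction is an index into the input. The encoder and decoder states below are random stand-ins, not a trained model.

```python
import numpy as np

rng = np.random.default_rng(0)
X = ["A", "B", "C", "D", "<EOS>"]
hid = 8
enc = rng.normal(size=(len(X), hid))        # stand-in encoder states, one per input symbol
W1, W2 = rng.normal(size=(hid, hid)), rng.normal(size=(hid, hid))
v = rng.normal(size=hid)

def point(dec_state):
    """Attention scores over the input positions *are* the output distribution."""
    scores = np.array([v @ np.tanh(W1 @ e + W2 @ dec_state) for e in enc])
    return int(np.argmax(scores))           # an index into X, e.g. 3 -> "D"

dec_states = rng.normal(size=(4, hid))      # stand-in decoder states for 4 output steps
print([point(s) for s in dec_states])       # with training, this would be Y = [3, 2, 0, 4]
```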

Page 29:

Deep Learning Models

Siamese Neural Network
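The slide gives only the name, so the sketch below shows the general idea as commonly described: two inputs pass through the same encoder with shared weights and are compared with a similarity function. The encoder and the inputs are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
in_dim, hid = 6, 8
W = rng.normal(size=(hid, in_dim))           # one shared set of weights for both branches

def encode(x):
    """The same network is applied to both inputs: that is what makes it 'Siamese'."""
    return np.tanh(W @ x)

def similarity(x1, x2):
    a, b = encode(x1), encode(x2)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

x1, x2 = rng.normal(size=in_dim), rng.normal(size=in_dim)
print(similarity(x1, x2))                    # trained to be high for matching pairs
```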

Page 30:

References

Jung, DEEP LEARNING FOR KOREAN NLP

Lee, Korean Dependency Parsing Using Deep Learning (딥러닝을 이용한 한국어 의존 구문 분석)

Park, Pointer Networks for Coreference Resolution

Park, Bi-LSTM-RNN CRF for Mention Detection

Page 31:

Q&A

Thank you. (감사합니다.)

박천음, 최수길, 박찬민, 최재혁, 홍다솔
Sigma α, Kangwon National University

Email: [email protected]