TensorFlow Deep Learning Quick-Start Class: Natural Language Processing Applications

Transcript of TensorFlow Deep Learning Quick-Start Class: Natural Language Processing Applications

Page 1

TensorFlow Deep Learning Quick-Start Class

4. Natural Language Processing Applications

By Mark Chang

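Page 2

• Introduction to natural language processing
• The word2vec neural network
• Semantic operations in practice

Page 3

Introduction to natural language processing

Page 4

Natural language processing
• Natural language processing is a branch of artificial intelligence and linguistics.
– It studies how to process and make use of natural language.
• Natural language understanding systems
– Convert natural language into a form that computers can process easily.
• Natural language generation systems
– Convert data from computer programs into natural language.
• https://zh.wikipedia.org/wiki/%E8%87%AA%E7%84%B6%E8%AF%AD%E8%A8%80%E5%A4%84%E7%90%86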

Page 5

Semantic understanding

https://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf

Page 6

Machine translation

http://arxiv.org/abs/1409.0473

Page 7

Poetry generation

http://emnlp2014.org/papers/pdf/EMNLP2014074.pdf

Page 8

Image caption generation

http://arxiv.org/pdf/1411.4555v2.pdf

Page 9

Visual question answering

http://arxiv.org/pdf/1505.00468v6.pdf

Page 10

The word2vec neural network

Page 11

The meaning of words

• The meaning of a word can be inferred from its context.

"dog" and "cat" have similar meanings, because they appear in similar contexts:

The dog run. A cat run. A dog sleep. The cat sleep. A dog bark. The cat meows.

Page 12

Semantic vectors

The dog run. A cat run. A dog sleep. The cat sleep. A dog bark. The cat meows.

       the    a    run    sleep    bark    meow
dog      1    2      2        2       1       0
cat      2    1      2        2       0       1
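One way to obtain count vectors of this kind is to tally, for each target word, the other words that share a sentence with it. The following is only a minimal sketch: the helper name `cooccurrence_counts` is hypothetical, "meows" is mapped to "meow" by hand, and only words within the same sentence are counted. The slide does not state its counting convention, so the numbers printed here need not match the table exactly.

```python
from collections import Counter

# Toy corpus from the slide, one sentence per entry.
SENTENCES = [
    "The dog run", "A cat run", "A dog sleep",
    "The cat sleep", "A dog bark", "The cat meows",
]

def cooccurrence_counts(sentences, target):
    """Count words that share a sentence with `target` (hypothetical helper)."""
    counts = Counter()
    for sentence in sentences:
        # Lower-case and map "meows" to "meow" so it matches the column name.
        words = [w.lower() for w in sentence.split()]
        words = ["meow" if w == "meows" else w for w in words]
        if target in words:
            counts.update(w for w in words if w != target)
    return counts

columns = ["the", "a", "run", "sleep", "bark", "meow"]
dog = cooccurrence_counts(SENTENCES, "dog")
cat = cooccurrence_counts(SENTENCES, "cat")
print([dog[c] for c in columns])
print([cat[c] for c in columns])
```

Each printed list is one row of a word-by-context count matrix, in the column order `the, a, run, sleep, bark, meow`.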

Page 13

Semantic vectors

dog (1, 2, ..., xn)

cat (2, 1, ..., xn)

car (0, 0, ..., xn)

Page 14

Semantic vector similarity
• The cosine similarity of A and B is:

\[ \frac{A \cdot B}{|A|\,|B|} \]

dog (a1, a2, ..., an)

cat (b1, b2, ..., bn)

The cosine similarity of dog and cat is:

\[ \frac{a_1 b_1 + a_2 b_2 + \dots + a_n b_n}{\sqrt{a_1^2 + a_2^2 + \dots + a_n^2}\,\sqrt{b_1^2 + b_2^2 + \dots + b_n^2}} \]
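As a worked instance of the formula, the sketch below computes the cosine similarity of the dog and cat count vectors (1, 2, 2, 2, 1, 0) and (2, 1, 2, 2, 0, 1) from the semantic vector slide. The function name `cosine_similarity` is an illustrative choice, not something taken from the class notebook.

```python
import numpy as np

def cosine_similarity(a, b):
    """Return A . B / (|A| |B|) for two 1-D vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Count vectors over the columns (the, a, run, sleep, bark, meow).
dog = [1, 2, 2, 2, 1, 0]
cat = [2, 1, 2, 2, 0, 1]
similarity = cosine_similarity(dog, cat)
print(round(similarity, 3))  # 12 / 14, about 0.857
```

The dot product is 12 and both vectors have length sqrt(14), so the similarity is 12/14: the two words are close, but not identical.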

Page 15

Adding and subtracting semantic vectors

Woman + King - Man = Queen

[Figure: the vectors Woman, Queen, Man, and King. The offset King - Man appears twice: once from Man to King and once from Woman to Queen.]
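The arithmetic can be tried on toy data. In this sketch every vector is a made-up 2-D value chosen so that the analogy holds exactly (they are not trained embeddings), and `nearest` is a hypothetical helper that returns the word with the highest cosine similarity to a query vector.

```python
import numpy as np

# Hypothetical 2-D vectors: axis 0 loosely encodes "royalty", axis 1 "female".
vectors = {
    "man":   np.array([1.0, 0.0]),
    "woman": np.array([1.0, 1.0]),
    "king":  np.array([3.0, 0.0]),
    "queen": np.array([3.0, 1.0]),
    "apple": np.array([0.0, 3.0]),
}

def nearest(query, exclude):
    """Return the word whose vector is most cosine-similar to `query`."""
    best_word, best_score = None, -np.inf
    for word, vec in vectors.items():
        if word in exclude:
            continue  # do not return one of the words used in the query
        score = np.dot(query, vec) / (np.linalg.norm(query) * np.linalg.norm(vec))
        if score > best_score:
            best_word, best_score = word, score
    return best_word

query = vectors["woman"] + vectors["king"] - vectors["man"]
answer = nearest(query, exclude={"woman", "king", "man"})
print(answer)
```

With real embeddings the result of the arithmetic rarely lands exactly on a word vector, which is why a nearest-neighbour search is used rather than an equality test.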

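Page 16

Semantic vectors have too many dimensions

dog: (x1 = the, x2 = a, ..., xn)

The dimensionality of a semantic vector equals the total vocabulary size.

[Figure: the vector for "dog" drawn as a column of components x1, x2, x3, x4, ..., xn.]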

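Page 17

The word2vec neural network

[Figure: the word "dog" as a one-hot encoding (1, 0, 0, 0) is fed into the word2vec neural network, which outputs a compressed semantic vector (1.2, 0.7, 0.5).]

Page 18

One-Hot Encoding

[Figure: input units dog, cat, run, fly; the unit for "dog" is set to 1.]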

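Page 19

Initialize Weights

[Figure: a network with input units dog, cat, run, fly and output units dog, cat, run, fly.]

\[
W = \begin{bmatrix}
w_{11} & w_{12} & w_{13} \\
w_{21} & w_{22} & w_{23} \\
w_{31} & w_{32} & w_{33} \\
w_{41} & w_{42} & w_{43}
\end{bmatrix}
\qquad
V = \begin{bmatrix}
v_{11} & v_{12} & v_{13} \\
v_{21} & v_{22} & v_{23} \\
v_{31} & v_{32} & v_{33} \\
v_{41} & v_{42} & v_{43}
\end{bmatrix}
\]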

Page 20

Compressing the semantic vector

[Figure: the high-dimensional one-hot input for "dog" is mapped to the low-dimensional vector (v11, v12, v13).]
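The compression step can be read as a matrix product: multiplying a one-hot vector by the weight matrix V simply selects one row of V. The sketch below illustrates this with a randomly filled 4 x 3 matrix; the values, the word order, and the helper `one_hot` are assumptions made for illustration.

```python
import numpy as np

words = ["dog", "cat", "run", "fly"]
rng = np.random.default_rng(0)
# Hypothetical 4 x 3 weight matrix V: one 3-dimensional row per word.
V = rng.uniform(-1.0, 1.0, size=(4, 3))

def one_hot(word):
    """Return a vector with a 1 at the word's position and 0 elsewhere."""
    vec = np.zeros(len(words))
    vec[words.index(word)] = 1.0
    return vec

dog_one_hot = one_hot("dog")   # [1, 0, 0, 0]
compressed = dog_one_hot @ V   # selects row 0 of V, i.e. (v11, v12, v13)
print(compressed)
```

Because the product only picks out a row, practical implementations skip the multiplication and index the matrix directly.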

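Page 21

Compressed Vectors

[Figure: input units dog, cat, run, fly and output units dog, cat, run, fly. The input words dog and cat have vectors (v11, v12, v13) and (v21, v22, v23); the output words run and fly have vectors (w31, w32, w33) and (w41, w42, w43).]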

Page 22

Context Word

[Figure: the input "dog" is set to 1 and selects V1 = (v11, v12, v13); the output "run" has weights W3 = (w31, w32, w33).]

\[ V_1 \cdot W_3 = v_{11} w_{31} + v_{12} w_{32} + v_{13} w_{33} \]

\[ \frac{1}{1 + e^{-V_1 \cdot W_3}} \approx 1 \]

Page 23

Context Word

[Figure: the input "cat" selects V2 = (v21, v22, v23); the output "run" has weights W3 = (w31, w32, w33).]

\[ V_2 \cdot W_3 = v_{21} w_{31} + v_{22} w_{32} + v_{23} w_{33} \]

\[ \frac{1}{1 + e^{-V_2 \cdot W_3}} \approx 1 \]

Page 24

Non-context Word

[Figure: the input "dog" is set to 1 and selects V1 = (v11, v12, v13); the output "fly" has weights W4 = (w41, w42, w43).]

\[ V_1 \cdot W_4 = v_{11} w_{41} + v_{12} w_{42} + v_{13} w_{43} \]

\[ \frac{1}{1 + e^{-V_1 \cdot W_4}} \approx 0 \]

Page 25

Non-context Word

[Figure: the input "cat" is set to 1 and selects V2 = (v21, v22, v23); the output "fly" has weights W4 = (w41, w42, w43).]

\[ V_2 \cdot W_4 = v_{21} w_{41} + v_{22} w_{42} + v_{23} w_{43} \]

\[ \frac{1}{1 + e^{-V_2 \cdot W_4}} \approx 0 \]
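The four cases above share one computation: take the dot product of an input vector and an output vector, then pass it through the sigmoid. The sketch below uses hand-picked, hypothetical vector values (not trained weights) so that context pairs score near 1 and non-context pairs near 0.

```python
import numpy as np

def sigmoid(x):
    """Logistic function 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical values chosen so that the example behaves like the slides.
V1 = np.array([2.0, 1.0, 0.0])    # input vector for "dog"
V2 = np.array([1.5, 1.5, 0.0])    # input vector for "cat"
W3 = np.array([2.0, 2.0, 0.0])    # output vector for "run" (context word)
W4 = np.array([-2.0, -2.0, 0.0])  # output vector for "fly" (non-context word)

for name, v in [("dog", V1), ("cat", V2)]:
    p_run = sigmoid(np.dot(v, W3))  # close to 1: "run" is a context word
    p_fly = sigmoid(np.dot(v, W4))  # close to 0: "fly" is not
    print(name, round(float(p_run), 3), round(float(p_fly), 3))
```

Training adjusts V and W so that these outputs move toward 1 for observed (word, context) pairs and toward 0 for the others, which pushes dog and cat toward similar vectors.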

Page 26

Result

[Figure: the trained network, with input vectors (v11, v12, v13) for dog and (v21, v22, v23) for cat, and output vectors (w31, w32, w33) for run and (w41, w42, w43) for fly.]

Page 27

Semantic operations in practice

Page 28

Semantic operations in practice: https://github.com/ckmarkoh/ntc_deeplearning_tensorflow/blob/master/sec4/semantics.ipynb

Page 29

Training data

anarchism originated as a term of abuse first used against early working class radicals including the diggers of the english revolution and the sans culottes of the french revolution whilst the term is still used in a pejorative way to describe any act that used violent means to destroy the organization of society it has also been taken up as a positive label by self defined anarchists the word anarchism is derived from the greek without archons ruler chief king anarchism as a political philosophy is the belief that rulers are unnecessary and should be abolished although there are differing interpretations of what this means anarchism also refers to related social movements that advocate the elimination of authoritarian institutions particularly the state the word anarchy as most anarchists use it does not imply chaos nihilism or anomie but rather a harmonious anti authoritarian society in place of what

Page 30

Preprocessing

The raw text is split into a list of words:

anarchism originated as a term of abuse first used against early working class radicals including the diggers of the english revolution and the sans culottes of the french revolution whilst the term is still used in a pejorative way to describe any act that used violent means to destroy the organization of society it has also been taken up …

['anarchism', 'originated', 'as', 'a', 'term', 'of', 'abuse', 'first', 'used', 'against', 'early', 'working', 'class', 'radicals', 'including', 'the', 'diggers', 'of', 'the', 'english', 'revolution', 'and', 'the', 'sans', 'culottes', 'of', 'the', 'french', 'revolution', 'whilst', 'the', 'term', 'is', 'still', 'used', 'in', 'a', 'pejorative', 'way', 'to', 'describe', 'any', 'act', 'that', 'used', 'violent', 'means', 'to', 'destroy', 'the', ... ]

Page 31

Preprocessing

Build a dictionary from word frequencies:

{"UNK": 0, "the": 1, "of": 2, "and": 3, "one": 4, "in": 5, "a": 6, "to": 7, "zero": 8, "nine": 9, ... }

    # Dictionary size
    vocabulary_size = 50000

Replace words that are not in the dictionary with UNK:

'the', 'english', 'revolution', 'and', 'the', 'sans', 'culottes', 'of', 'the', 'french', 'revolution', ...

'the', 'english', 'revolution', 'and', 'the', 'sans', UNK, 'of', 'the', 'french', 'revolution', ...

Convert each word to its code in the dictionary:

1, 103, 855, 3, 1, 15068, 0, 2, 1, 151, 855, ...
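The dictionary-building and encoding steps on this slide can be sketched roughly as follows. The helper `build_dataset`, the tiny word list, and the very small `vocabulary_size` are illustrative assumptions; the notebook linked earlier may implement these steps differently.

```python
from collections import Counter

def build_dataset(words, vocabulary_size):
    """Map words to integer codes by frequency; rare words become UNK (code 0)."""
    # Reserve code 0 for UNK, then number the most frequent words from 1.
    most_common = Counter(words).most_common(vocabulary_size - 1)
    dictionary = {"UNK": 0}
    for word, _ in most_common:
        dictionary[word] = len(dictionary)
    # Words outside the dictionary fall back to the UNK code.
    data = [dictionary.get(word, 0) for word in words]
    reverse_dictionary = {code: word for word, code in dictionary.items()}
    return data, dictionary, reverse_dictionary

words = ("the english revolution and the sans culottes "
         "of the french revolution").split()
data, dictionary, reverse_dictionary = build_dataset(words, vocabulary_size=4)
print(dictionary)
print(data)
```

With a dictionary of only four entries, most words in this short list are encoded as 0, the same way "culottes" becomes UNK on the slide.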

Page 32

Preprocessing

5239, 3084, 12, 6, 195, 2, 3137, 46, 59, 156, 128, 742, 477, 10572, 134, 1, 27549, 2, 1, 103, 855, 3, 1, 15068, 0, 2, 1, 151, 855, …

input    output
3084     5239
3084     12
12       3084
12       6
6        12
6        195
195      6
195      2

[Figure: the pair (input 3084, output 5239) is fed to word2vec.]

Page 33

Preprocessing

5239, 3084, 12, 6, 195, 2, 3137, 46, 59, 156, 128, 742, 477, 10572, 134, 1, 27549, 2, 1, 103, 855, 3, 1, 15068, 0, 2, 1, 151, 855, …

    generate_batch(batch_size=8, num_skips=2, skip_window=1)

input     3084    3084    12      12    6     6      195    195
output    5239    12      3084    6     12    195    6      2

• batch_size: the number of (input, output) pairs in one batch (8 here).
• num_skips: the number of output words generated for each input word (2 here).
• skip_window: how many words to the left and to the right of the input word count as its context (1 here).
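The pairing shown on this slide can be reproduced with a short deterministic sketch. `skipgram_pairs` is a hypothetical helper, not the notebook's `generate_batch`: it takes no `batch_size` or `num_skips`, and simply pairs every word with all neighbours inside the window, in left-to-right order. An actual batch generator may sample or order the context words differently.

```python
def skipgram_pairs(data, skip_window):
    """Return (inputs, labels) pairing each word with its neighbours."""
    inputs, labels = [], []
    # Only words with a full window on both sides are used as inputs.
    for i in range(skip_window, len(data) - skip_window):
        for offset in range(-skip_window, skip_window + 1):
            if offset == 0:
                continue  # skip the centre word itself
            inputs.append(data[i])
            labels.append(data[i + offset])
    return inputs, labels

data = [5239, 3084, 12, 6, 195, 2]
inputs, labels = skipgram_pairs(data, skip_window=1)
print(inputs)  # [3084, 3084, 12, 12, 6, 6, 195, 195]
print(labels)  # [5239, 12, 3084, 6, 12, 195, 6, 2]
```

With `skip_window=1` each input word has exactly two neighbours, which corresponds to `num_skips=2`, and the first six codes of the data yield the eight pairs of one batch.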

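Page 34

Computational Graph

    import math
    import tensorflow as tf

    # batch_size, vocabulary_size, embedding_size and num_sampled are
    # assumed to be defined beforehand.
    train_inputs = tf.placeholder(tf.int32, shape=[batch_size])
    train_labels = tf.placeholder(tf.int32, shape=[batch_size, 1])

    with tf.device('/cpu:0'):
        # Embedding matrix: one row per vocabulary word.
        embeddings = tf.Variable(
            tf.random_uniform([vocabulary_size, embedding_size], -1.0, 1.0))
        embed = tf.nn.embedding_lookup(embeddings, train_inputs)
        # Output-side weights and biases used by the NCE loss.
        nce_weights = tf.Variable(
            tf.truncated_normal([vocabulary_size, embedding_size],
                                stddev=1.0 / math.sqrt(embedding_size)))
        nce_biases = tf.Variable(tf.zeros([vocabulary_size]))

    # Keyword arguments avoid relying on the positional order of `labels`
    # and `inputs`, which differs between TensorFlow releases.
    loss = tf.reduce_mean(
        tf.nn.nce_loss(weights=nce_weights, biases=nce_biases,
                       labels=train_labels, inputs=embed,
                       num_sampled=num_sampled, num_classes=vocabulary_size))

    optimizer = tf.train.GradientDescentOptimizer(1.0).minimize(loss)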

Page 35

Device

    with tf.device('/cpu:0'):

The part of the computational graph defined inside this block runs on the CPU.

At the time of this class, TensorFlow did not support running embedding_lookup on the GPU, so it has to be placed on the CPU.

Page 36

Inputs & Outputs

    train_inputs = tf.placeholder(tf.int32, shape=[batch_size])
    train_labels = tf.placeholder(tf.int32, shape=[batch_size, 1])

[Figure: train_inputs and train_labels are fed to word2vec.]

train_inputs: 3084, 3084, 12, 12, 6, 6, 195, 195

train_labels: 5239, 12, 3084, 6, 12, 195, 6, 2

Page 37

Embedding Lookup

    embeddings = tf.Variable(
        tf.random_uniform([vocabulary_size, embedding_size], -1.0, 1.0))
    embed = tf.nn.embedding_lookup(embeddings, train_inputs)

[Figure: embedding_lookup uses each index in train_inputs (for example, 2) to select the corresponding row of embeddings.]
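Conceptually, the lookup is row indexing into the embedding matrix. The NumPy sketch below mirrors that idea with tiny, made-up sizes and indices; it is an analogy for what the lookup returns, not TensorFlow code.

```python
import numpy as np

vocabulary_size, embedding_size = 5, 3   # tiny hypothetical sizes
rng = np.random.default_rng(0)
embeddings = rng.uniform(-1.0, 1.0, size=(vocabulary_size, embedding_size))

train_inputs = np.array([2, 2, 4, 0])    # hypothetical word indices
embed = embeddings[train_inputs]         # one row per index, repeats allowed

print(embed.shape)  # (4, 3)
```

The result has one row per entry of `train_inputs`, so a batch of word codes becomes a batch of embedding vectors of shape `[batch_size, embedding_size]`.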

Page 38

NCE Weights
• NCE: Noise Contrastive Estimation

    nce_weights = tf.Variable(
        tf.truncated_normal([vocabulary_size, embedding_size],
                            stddev=1.0 / math.sqrt(embedding_size)))
    nce_biases = tf.Variable(tf.zeros([vocabulary_size]))

[Figure: nce_weights and nce_biases on the output side of the network.]

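Page 39

NCE Loss

    # Keyword arguments avoid relying on the positional order of `labels`
    # and `inputs`, which differs between TensorFlow releases.
    loss = tf.reduce_mean(
        tf.nn.nce_loss(weights=nce_weights, biases=nce_biases,
                       labels=train_labels, inputs=embed,
                       num_sampled=num_sampled, num_classes=vocabulary_size))

Positive

[Figure: the input vector V2 = (v21, v22, v23) paired with the output weights W3 = (w31, w32, w33).]

\[ \frac{1}{1 + e^{-V_2 \cdot W_3}} \approx 1 \]

Negative

[Figure: the input vector V2 = (v21, v22, v23) paired with the output weights W4 = (w41, w42, w43).]

\[ \frac{1}{1 + e^{-V_2 \cdot W_4}} \approx 0 \]

\[ \text{cost} = \log\left(\frac{1}{1 + e^{-v_I^{T} w_{pos}}}\right) + \sum_{neg} \log\left(1 - \frac{1}{1 + e^{-v_I^{T} w_{neg}}}\right) \]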
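The expression above can be evaluated directly for one input vector, one positive output vector, and a few negative ones. The sketch below is a simplified illustration with made-up vectors, not a reimplementation of `tf.nn.nce_loss`, which does more (for example, it draws the negative samples itself). One assumption is made about the sign: the expression as written grows as the model improves, so the hypothetical helper `negative_sampling_cost` returns its negative, making lower values better.

```python
import numpy as np

def log_sigmoid(x):
    """Numerically stable log(1 / (1 + exp(-x)))."""
    return -np.logaddexp(0.0, -x)

def negative_sampling_cost(v_in, w_pos, w_negs):
    """Negative of the slide's expression, so that lower is better."""
    positive = log_sigmoid(np.dot(w_pos, v_in))
    # log(1 - sigmoid(x)) equals log_sigmoid(-x).
    negative = np.sum(log_sigmoid(-np.dot(w_negs, v_in)))
    return float(-(positive + negative))

# Hypothetical vectors for illustration only.
v_in = np.array([1.0, 2.0, 0.5])
w_pos = np.array([1.0, 1.0, 0.0])
w_negs = np.array([[-1.0, -1.0, 0.0], [0.0, -2.0, 1.0]])

good = negative_sampling_cost(v_in, w_pos, w_negs)
bad = negative_sampling_cost(v_in, -w_pos, -w_negs)
print(good, bad)
```

`good` is small because the positive pair has a large dot product and the negative pairs have very negative ones; flipping the signs of the output vectors gives the much larger `bad`.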

Page 40

Train

    feed_dict = {train_inputs: batch_inputs,
                 train_labels: batch_labels}
    _, loss_val = session.run([optimizer, loss], feed_dict=feed_dict)

[Figure: batch_inputs and batch_labels are fed into the graph, which returns loss_val.]

batch_inputs: 3084, 3084, 12, 12, 6, 6, 195, 195

batch_labels: 5239, 12, 3084, 6, 12, 195, 6, 2

Page 41

Result

final_embeddings

    array([[-0.02782757, -0.16879494, -0.06111901, ..., -0.25700757, -0.07137159,  0.0191142 ],
           [-0.00155336, -0.00928817, -0.0535327 , ..., -0.23261793, -0.13980433,  0.18055709],
           [ 0.02576068, -0.06805354, -0.03688766, ..., -0.15378961,  0.00459271,  0.0717089 ],
           ...,
           [ 0.01061165, -0.09820389, -0.09913248, ...,  0.00818674, -0.12992384,  0.05826835],
           [ 0.0849214 , -0.14137401,  0.09674817, ...,  0.04111136, -0.05420518, -0.01920278],
           [ 0.08318492, -0.08202577,  0.11284919, ...,  0.03887166,  0.01556483,  0.12496017]],
          dtype=float32)

Page 42

Visualization

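Page 43

Most Similar Words

    import numpy as np

    def get_most_similar(word, top=10):
        # Look up the word's row index; fail clearly if it is unknown.
        wid = dictionary.get(word, -1)
        if wid < 0:
            raise KeyError(word)
        # Dot product of the word's embedding with every embedding.
        result = np.dot(final_embeddings[wid:wid + 1, :], final_embeddings.T)
        # Sort the indices by descending similarity and print the top ones.
        result = result[0].argsort().tolist()
        result.reverse()
        for idx in result[:top]:
            print(reverse_dictionary[idx])

    get_most_similar("one")

Output:

    one
    six
    two
    four
    seven
    three
    ...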

Page 44

About the instructor

Mark Chang

• Email: ckmarkoh at gmail dot com
• Blog: http://cpmarkchang.logdown.com
• Github: https://github.com/ckmarkoh
• Facebook: https://www.facebook.com/ckmarkoh.chang
• Slideshare: http://www.slideshare.net/ckmarkohchang
• Linkedin: https://www.linkedin.com/pub/mark-chang/85/25b/847