Paper introduction: Value Iteration Networks (teamLab study session)
Uploaded by ryo-yamamoto
Transcript of Paper introduction: Value Iteration Networks (teamLab study session)
Value Iteration Networks
Aviv Tamar, Yi Wu, Garrett Thomas, Sergey Levine and Pieter Abbeel @ UC Berkeley
NIPS 2016
2017/03/09
2
Q
https://www.slideshare.net/yamaryox/20160421-70945023
3
Q Q
Q
Q
4
Q reactive
5
CNN
6
Value Iteration Networks
7
s: x, y; s → φ(s); a; r
π(φ(s), a)
8
Value Iteration
[Figure: (s), followed by repeated max operations (max, max, ...)]
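The value iteration procedure named on this slide can be sketched as follows. This is a minimal illustration, not code from the paper or the slides: it assumes a deterministic, 4-connected grid world in which the reward is received on entering a cell and the goal is terminal. The function name `value_iteration` and its parameters are hypothetical.

```python
def value_iteration(reward, goal, gamma=0.9, iterations=50):
    """Tabular value iteration on a deterministic 4-connected grid.

    Assumptions (not from the slides): reward[y][x] is the reward for
    entering cell (x, y), None marks an obstacle, and the goal cell is
    terminal, so its value stays at 0.
    """
    height, width = len(reward), len(reward[0])
    moves = [(0, -1), (0, 1), (-1, 0), (1, 0)]  # up, down, left, right
    V = [[0.0] * width for _ in range(height)]
    for _ in range(iterations):
        new_V = [[0.0] * width for _ in range(height)]
        for y in range(height):
            for x in range(width):
                if reward[y][x] is None or (x, y) == goal:
                    continue  # obstacles and the terminal goal keep value 0
                q_values = []
                for dx, dy in moves:
                    nx, ny = x + dx, y + dy
                    inside = 0 <= nx < width and 0 <= ny < height
                    if not inside or reward[ny][nx] is None:
                        continue  # moves off the grid or into obstacles are illegal
                    # Q(s, a) = r(s') + gamma * V(s')
                    q_values.append(reward[ny][nx] + gamma * V[ny][nx])
                # V(s) = max_a Q(s, a)
                new_V[y][x] = max(q_values) if q_values else 0.0
        V = new_V
    return V


# Example: 3x3 grid, obstacle in the centre, goal in the bottom-right corner.
grid = [[0.0, 0.0, 0.0],
        [0.0, None, 0.0],
        [0.0, 0.0, 1.0]]
values = value_iteration(grid, goal=(2, 2))
for row in values:
    print(row)
```

Under these assumptions the value of a cell decays by a factor of gamma for each extra step it lies from the goal, so moving greedily towards the highest-valued neighbour follows a shortest path.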
Value Iteration Networks
10
[Figure: (s), followed by repeated max operations (max, max, ...), shown twice alongside CNN]
Value Iteration Networks
[Figure: (s), CNN, Conv, max & softmax]
13
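The "Conv" and "max" stages on this slide can be read as one value iteration step: a convolution over the reward and value maps produces one Q channel per action, and a max over the action channel produces the next value map. The sketch below illustrates that reading in plain Python. It is not the authors' implementation: in a VIN the kernels are learned by back-propagation, whereas here they are set by hand (an assumption made purely for illustration) so that each action copies from one neighbouring cell. The names `conv2d_same`, `shift_kernel`, `vi_module`, `REWARD_KERNELS` and `VALUE_KERNELS` are hypothetical, and the softmax stage is omitted.

```python
GAMMA = 0.9  # discount factor, chosen for illustration


def conv2d_same(image, kernel):
    """Cross-correlate a 2D list with an odd-sized kernel, zero padding."""
    h, w = len(image), len(image[0])
    k = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in range(-k, k + 1):
                for dx in range(-k, k + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        total += kernel[dy + k][dx + k] * image[yy][xx]
            out[y][x] = total
    return out


def shift_kernel(dy, dx, weight):
    """3x3 kernel that reads the neighbour at offset (dy, dx), scaled by weight."""
    kernel = [[0.0] * 3 for _ in range(3)]
    kernel[dy + 1][dx + 1] = weight
    return kernel


# Hand-set kernels for up, down, left, right (learned in an actual VIN).
OFFSETS = [(-1, 0), (1, 0), (0, -1), (0, 1)]
REWARD_KERNELS = [shift_kernel(dy, dx, 1.0) for dy, dx in OFFSETS]
VALUE_KERNELS = [shift_kernel(dy, dx, GAMMA) for dy, dx in OFFSETS]


def vi_module(reward_map, reward_kernels, value_kernels, K):
    """Apply K rounds of (convolution, then max over the action channel)."""
    h, w = len(reward_map), len(reward_map[0])
    V = [[0.0] * w for _ in range(h)]
    for _ in range(K):
        Q = []  # one Q map per action
        for w_r, w_v in zip(reward_kernels, value_kernels):
            q_r = conv2d_same(reward_map, w_r)
            q_v = conv2d_same(V, w_v)
            Q.append([[q_r[y][x] + q_v[y][x] for x in range(w)]
                      for y in range(h)])
        # Max over the action channel gives the next value map.
        V = [[max(q[y][x] for q in Q) for x in range(w)] for y in range(h)]
    return V


# Example: a 1x3 corridor with reward in the right-most cell, two iterations.
print(vi_module([[0.0, 0.0, 1.0]], REWARD_KERNELS, VALUE_KERNELS, K=2))
```

Because every step is a convolution or a max, the whole recurrence is differentiable almost everywhere. The sketch has no terminal state, so values keep growing as K increases.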
Value Iteration Networks
CNN, Back-Propagation
14
15
Grid-World
Mars Rover Navigation
Continuous Control
Grid-World: 8x8, 16x16, 28x28; 3x3; CNN conv1=3x3x150, conv2=3x3x1; 10, 20, 36; 5000; 7
CNN, FCN(NN)
17
Grid-World
18
Grid-World VIN
19
Grid-World
20
Grid-World
21
Mars Rover Navigation: 128x128108; CNN; 16x16; Conv(5x5x6), MaxPool(4x4), Conv(3x3x12), MaxPool(2x2), Conv(3x3x150), Conv(3x3x1); 10,000; 7
22
Mars Rover Navigation
23
Mars Rover Navigation
VIN: 84.8%
CNN: 90.3%
VIN
24
Continuous Control: (x, y, vx, vy); 16x16; 3x3; NN; CNN Conv1(3x3x150), Conv2(3x3x1); 20040
25
Continuous Control
[Figure: (s), CNN, Conv, max(5x5) & x 3]
26
Continuous Control CNN
27
Continuous Control
28
29
Value Iteration Networks
End-to-End
CNN
30
Web
VIN
31