Neural Network + TensorFlow Introductory Course


Agenda
- Y = WX + b
- Activation functions (Activator) and SoftMax
- Loss Function and Gradient Descent
- TensorFlow basics
- Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN/LSTM) in TensorFlow

Blue Brain Project (Henry Markram): "I wanted to model the brain because we didn't understand it. The best way to figure out how something works is to try to build it from scratch."

in vivo

in vitro

in silico

http://seedmagazine.com/content/article/out_of_the_blue/P1/


EU Human Brain Project

Launched in October 2013 as a roughly ten-year project with a budget of about 1.2 billion euros.

The EU Human Brain Project is one of the EU's large-scale 21st-century ICT flagship projects.

EU Human Brain Project: funded by the European Commission (EC).

The project has been controversial: roughly 150 researchers publicly criticized its direction under Henry Markram.

NIH Human Connectome Project

http://www.nih.gov/news/health/sep2010/nimh-15.htm (announced September 2010; roughly $40 million in funding)

NIH Human Connectome Project

http://www.humanconnectomeproject.org/

US BRAIN Initiative: the NIH proposed $4.5 billion in funding over 12 years.

2014/06/05 http://1.usa.gov/1pIIhvx

1mm5610060

http://www.tmd.ac.jp/artsci/biol/textlife/neuron.htm
http://bit.ly/1qR2Dmq

http://goo.gl/0lbzRg  C. elegans: 302 neurons and about 8,000 synapses; its complete wiring diagram was mapped (1987).

1960s-70s: the visual-cortex studies of David H. Hubel and Torsten Wiesel (begun around 1960).

Donald O. Hebb, The Organization of Behavior (1949)

https://goo.gl/2HsDwK

Donald O. Hebb (c. 1950)

http://kitsuon-kaizen.en.que.jp/hori/108.htm

http://blogs.yahoo.co.jp/yuyamichidori/11068629.html

Y = WX + b

10

10

A B C

Think of a yes/no decision made from several weighted factors. Let A be the total weight of the factors in favour, B the total weight of the factors against, and C a threshold: decide "yes" when A - B > C, i.e. when A - B - C > 0, and "no" when A - B < C, i.e. when A - B - C < 0.

All six inputs on (X1=X2=X3=X4=X5=X6=1; weight magnitudes 2, 3, 1, 3, 2, 2):
in favour 3+3+2 = 7, against 2+1+2 = 5, difference 7 - 5 = 2 > 1 (threshold), so the neuron fires.

X2 and X4 off (X1=1, X2=0, X3=1, X4=0, X5=1, X6=1):
in favour 0+0+2 = 2, against 2+1+2 = 5, difference 2 - 5 = -3 < 1, so the neuron does not fire.

X3 and X5 off (X1=1, X2=1, X3=0, X4=1, X5=0, X6=1):
in favour 3+3+0 = 6, against 2+0+2 = 4, difference 6 - 4 = 2 > 1, so the neuron fires.

In general, the inputs X1, X2, ..., X6 each take the value 0 or 1, and the neuron fires when W1X1+W2X2+W3X3+W4X4+W5X5+W6X6+b > 0. The weights Wi express how strongly each input pushes the decision one way or the other.

The bias b plays the role of the threshold: b = -C, here b = -1. With weights W1=-2, W2=3, W3=-1, W4=3, W5=2, W6=-2, the condition "A - B > C" becomes W1X1+W2X2+W3X3+W4X4+W5X5+W6X6+b > 0.

For X = [1, 0, 1, 0, 1, 1]:
W1X1+W2X2+W3X3+W4X4+W5X5+W6X6+b = (-2)×1 + 3×0 + (-1)×1 + 3×0 + 2×1 + (-2)×1 - 1 = -2 - 1 + 2 - 2 - 1 = -4 < 0, so the neuron does not fire.

(Inputs X1=1, X2=0, X3=1, X4=0, X5=1, X6=1; weights -2, 3, -1, 3, 2, -2.)

W = [-2, 3, -1, 3, 2, -2], b = -1, X = [1, 0, 1, 0, 1, 1]
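A minimal sketch of this firing rule, using the numbers from the slide above. NumPy is assumed here purely for illustration; the deck itself moves to TensorFlow later.

import numpy as np

W = np.array([-2, 3, -1, 3, 2, -2])   # weights
b = -1                                # bias (the threshold, negated)
X = np.array([1, 0, 1, 0, 1, 1])      # inputs, each 0 or 1

z = np.dot(W, X) + b                  # W1X1 + ... + W6X6 + b
print(z, z > 0)                       # -4 False: the neuron does not fire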

[Figure: a single neuron with six inputs X1-X6]

Does W1X1+W2X2+W3X3+W4X4+W5X5+W6X6+b > 0 hold?  W=[W1,W2,W3,W4,W5,W6], bias b, X=[X1,X2,X3,X4,X5,X6].


Does W1X1+W2X2+W3X3+W4X4+W5X5+W6X6+b > 0 hold? Gather the weights into the row vector W=[W1,W2,W3,W4,W5,W6] and the inputs into X=[X1,X2,X3,X4,X5,X6].

Transposing the row vector X gives the column vector XT.

W = [W1,W2,W3,W4,W5,W6], X = [X1,X2,X3,X4,X5,X6]:
W XT = [W1,W2,W3,W4,W5,W6] · [X1,X2,X3,X4,X5,X6]T = W1X1+W2X2+W3X3+W4X4+W5X5+W6X6


So the condition becomes: does WXT + b > 0 hold? With W=[W1,W2,W3,W4,W5,W6], bias b, and X=[X1,X2,X3,X4,X5,X6], the expression WX + b > 0 has the same shape as the familiar straight line y = ax + b.

[Figure: a neuron with three inputs X1-X3]

Example with three inputs: X=[X1,X2,X3], here X=[1,0,1].

Neuron 1: W=[2,-3,4], b=-4. Then 2×1+(-3)×0+4×1-4 = 2 > 0, so it fires.

Neuron 2: W=[-4,1,-5], b=5. Then (-4)×1+1×0+(-5)×1+5 = -4 < 0, so it does not fire.


Neuron 1 fires when W1XT + b1 > 0, neuron 2 when W2XT + b2 > 0, neuron 3 when W3XT + b3 > 0, where X=[X1,X2,X3], W1=[W11,W12,W13] with bias b1, W2=[W21,W22,W23] with bias b2, W3=[W31,W32,W33] with bias b3.
Neuron 1: W1XT + b1 = W11X1+W12X2+W13X3+b1
Neuron 2: W2XT + b2 = W21X1+W22X2+W23X3+b2
Neuron 3: W3XT + b3 = W31X1+W32X2+W33X3+b3


The three conditions W1XT + b1 > 0, W2XT + b2 > 0, W3XT + b3 > 0 can be evaluated at once. Stacking the weight rows into a matrix and the biases into a vector:
[[W11,W12,W13],
 [W21,W22,W23],
 [W31,W32,W33]] [X1,X2,X3]T + [b1,b2,b3]T
= [W11X1+W12X2+W13X3+b1,
   W21X1+W22X2+W23X3+b2,
   W31X1+W32X2+W33X3+b3]T

In short, the whole layer computes WXT + b, with W a matrix and b a vector (see the sketch below).
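A sketch of the same idea in NumPy: three neurons evaluated as one matrix product. The first two rows and biases are the neurons from the three-input example above; the third row and its bias are made up for illustration.

import numpy as np

W = np.array([[ 2, -3,  4],
              [-4,  1, -5],
              [ 1,  1,  1]])          # one row of weights per neuron (third row is hypothetical)
b = np.array([-4, 5, 0])              # one bias per neuron
X = np.array([1, 0, 1])               # input vector

Z = W @ X + b                         # [W1·X + b1, W2·X + b2, W3·X + b3]
print(Z)                              # [ 2 -4  2]
print(Z > 0)                          # [ True False  True]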

[Figure: a layer with six inputs X1-X6]

Now with six inputs: X=[X1,X2,X3,X4,X5,X6].


Neuron 1 fires when W1XT + b1 > 0, neuron 2 when W2XT + b2 > 0, ..., neuron 6 when W6XT + b6 > 0, with X=[X1,X2,X3,X4,X5,X6] and
W1=[W11,W12,W13,W14,W15,W16], b1
W2=[W21,W22,W23,W24,W25,W26], b2
W3=[W31,W32,W33,W34,W35,W36], b3
W4=[W41,W42,W43,W44,W45,W46], b4
W5=[W51,W52,W53,W54,W55,W56], b5
W6=[W61,W62,W63,W64,W65,W66], b6


Written out, the six neurons compute
W11X1+W12X2+W13X3+W14X4+W15X5+W16X6+b1
W21X1+W22X2+W23X3+W24X4+W25X5+W26X6+b2
W31X1+W32X2+W33X3+W34X4+W35X5+W36X6+b3
W41X1+W42X2+W43X3+W44X4+W45X5+W46X6+b4
W51X1+W52X2+W53X3+W54X4+W55X5+W56X6+b5
W61X1+W62X2+W63X3+W64X4+W65X5+W66X6+b6
which is exactly the matrix-vector product
[[W11,...,W16],
 [W21,...,W26],
 [W31,...,W36],
 [W41,...,W46],
 [W51,...,W56],
 [W61,...,W66]] [X1,X2,X3,X4,X5,X6]T + [b1,b2,b3,b4,b5,b6]T

WXT + b

Y = WXT + b

Y = WXT + b. If we regard X as a column vector from the start, X=[X1,X2,...,Xn]T, we can drop the transpose and simply write Y = WX + b.

Scalar case: with Y=y, X=x, W=a, the formula Y = WX + b is just y = ax + b. For example, W=2, b=3 gives y = 2x + 3, and W=4, b=0 gives y = 4x.

Vector case (W is a row vector):
Y=y, X=[x1,x2]T, W=[w1,w2], b=b0:  y = w1x1+w2x2+b0
Y=y, X=[c,d]T, W=[m,n], b=b:  y = mc+nd+b
Y=y, X=[1,0]T, W=[2,3], b=4:  y = 2×1+3×0+4 = 6
Y=y, X=[x1,x2,x3]T, W=[w1,w2,w3], b=b0:  y = w1x1+w2x2+w3x3+b0
Y=y, X=[x1,x2,x3,x4]T, W=[w1,w2,w3,w4], b=b0:  y = w1x1+w2x2+w3x3+w4x4+b0

Matrix case: W is a matrix, e.g. a 2×2 matrix [ [a,b], [c,d] ], a 3×3 matrix [ [d,e,f], [g,h,i], [j,k,l] ], and so on.
Y=[y1,y2]T, X=[x1,x2]T, W=[ [2,3], [4,5] ], b=[1,2]T. Then Y = WX + b means
y1 = 2x1+3x2+1
y2 = 4x1+5x2+2
Y=[y1,y2,y3]T, X=[x1,x2,x3]T, W=[ [2,3,-1], [4,-5,1], [1,2,3] ], b=[1,2,3]T. Then Y = WX + b means
y1 = 2x1+3x2-x3+1
y2 = 4x1-5x2+x3+2
y3 = x1+2x2+3x3+3

In Y = WX + b, W is in general a matrix: in the six-input, six-neuron example it is 6×6 (36 weights). With n = 100 inputs and 100 outputs, W already has 10,000 entries; for n inputs and m outputs, W is m×n. Layers can also be stacked, each layer n applying its own Wn and bn to the previous layer's output: Yn = WnX + bn.

Two equivalent conventions: Y = WXT + b (inputs as a column vector) and Y = XW + b (inputs kept as the row vector X=[X1,X2,...,Xn], with the weight matrix transposed, W → WT, and b unchanged as a vector). TensorFlow writes layers in the Y = XW + b form, so that is the form used from here on.
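A quick illustrative NumPy check that the two conventions give the same numbers once W is transposed. W and b are the 2×2 example above; the input values are made up.

import numpy as np

W = np.array([[2, 3], [4, 5]])
b = np.array([1, 2])
X = np.array([7, 9])                  # the same data, viewed as a column or a row vector

y_col = W @ X + b                     # Y = W X  + b  (column-vector form)
y_row = X @ W.T + b                   # Y = X W' + b  (row-vector, TensorFlow-style form)
print(y_col, y_row)                   # identical: [42 75] [42 75]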

Activation Functions (Activator) and SoftMax

Activation Functions (Activator)

So far a neuron simply fires when WX + b > 0. An activation function φ generalizes this: the neuron's output becomes φ(WX + b). The simplest choice is the step function:

φ(x) = 1 (x > 0), 0 (x ≤ 0)

Sigmoid (logistic function): a smooth version of the step function around x=0; it maps any input to a value between 0 and 1.

ReLU (rectified linear unit, or rectifier): outputs 0 for x ≤ 0 and x itself for x > 0. Unlike the sigmoid, its output is not squashed into [0, 1]; it has a kink at x=0 (Softplus is a smooth approximation of it).

tanh: shaped like the sigmoid but centered on 0, ranging from -1 to +1 (the sigmoid is centered on 1/2).
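A small NumPy sketch of these activation functions (NumPy and the sample inputs are assumptions for illustration only):

import numpy as np

def step(x):     return np.where(x > 0, 1.0, 0.0)    # fires / does not fire
def sigmoid(x):  return 1.0 / (1.0 + np.exp(-x))     # smooth, between 0 and 1
def relu(x):     return np.maximum(0.0, x)           # 0 for x<=0, x for x>0
def softplus(x): return np.log(1.0 + np.exp(x))      # smooth approximation of ReLU
# tanh is np.tanh; it ranges from -1 to +1

x = np.array([-4.0, 0.0, 1.0])
print(sigmoid(x))   # [0.018 0.5   0.731]
print(relu(x))      # [0.    0.    1.   ]
print(np.tanh(x))   # [-0.999 0.    0.762]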

WX + b = (-2)×1 + 3×0 + (-1)×1 + 3×0 + 2×1 + (-2)×1 - 1 = -2 - 1 + 2 - 2 - 1 = -4

φ(WX + b) = sigmoid(-4) ≈ 0

W = [-2, 3, -1, 3, 2, -2], b = -1, X = [1, 0, 1, 0, 1, 1]T, φ(x) = sigmoid(x)
(Inputs X1=1, X2=0, X3=1, X4=0, X5=1, X6=1; weights -2, 3, -1, 3, 2, -2.)

The output φ(WX + b) is approximately 0: the neuron does not fire.

WX + b = (-2)×1 + 3×1 + (-1)×0 + 3×1 + 2×0 + (-2)×1 - 1 = -2 + 3 + 3 - 2 - 1 = 1

φ(WX + b) = ReLU(1) = 1

(Inputs X1=1, X2=1, X3=0, X4=1, X5=0, X6=1; weights -2, 3, -1, 3, 2, -2.)

W = [-2, 3, -1, 3, 2, -2], b = -1, X = [1, 1, 0, 1, 0, 1]T, φ(x) = ReLU(x). The output φ(WX + b) = 1: the neuron fires.

[Figure: three neurons sharing the inputs X1-X3]

Neuron 1 outputs φ(W1X + b1), neuron 2 outputs φ(W2X + b2), neuron 3 outputs φ(W3X + b3), with X=[X1,X2,X3]T. In general, Yi = φ(WiX + bi).

W1=[W11,W12,W13] b1 W2=[W21,W22,W23] b2 W3=[W31,W32,W33] b3

φ( [[W11,W12,W13],
    [W21,W22,W23],
    [W31,W32,W33]] [X1,X2,X3]T + [b1,b2,b3]T )
= [φ(W11X1+W12X2+W13X3+b1),
   φ(W21X1+W22X2+W23X3+b2),
   φ(W31X1+W32X2+W33X3+b3)]T

The activation function φ is applied element-wise: in vector form the layer computes φ(WX + b).

A special activator used at the output layer: SoftMax.

For digit recognition the output layer has 10 neurons, one per digit 0-9. Softmax turns those 10 raw outputs into a probability distribution over the 10 classes.

softmax

Digit:  0 1 2 3 4 5 6 7 8 9
Output: 0 0 0 1 0 0 0 0 0 0   (the answer is "3")

softmax

Digit:  0 1 2 3 4 5 6 7 8 9
Output: 0 0 0 0 0 1 0 0 0 0   (the answer is "5")


Softmax outputs sum to 1, so they can be read as probabilities. If the softmax output for "3" is 0.75, then 1 - 0.75 = 0.25 is the probability assigned to everything other than 3; across all ten digits the softmax outputs add up to 1.

A One-Hot-Vector is a vector with a 1 in exactly one of its n positions and 0 everywhere else. The training labels for the ten digits are given as 10-element one-hot vectors, and the 10-element softmax output is compared against them.
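A small NumPy sketch of softmax (the ten raw scores below are made up; only the shape of the computation matters):

import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))          # subtract the max for numerical stability
    return e / np.sum(e)

scores = np.array([0.5, -1.2, 0.3, 2.0, 0.0, -0.7, 1.1, 0.2, -0.3, 0.4])
p = softmax(scores)
print(p.sum())       # 1.0: the outputs form a probability distribution
print(p.argmax())    # 3: the network's answer is the digit 3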

Loss Function Gradient Descent

Backpropagation: the algorithm used to learn W and b, popularized by Rumelhart et al. (1986).

Loss Function (also called Cost Function): a measure of how far the network's outputs are from the desired outputs. Training means finding the W and b that make the loss function small.

Gradient Descent: the loss function J depends on W and b. To minimize J we repeatedly nudge W and b in the direction of the negative gradient, using the partial derivatives ∂J/∂W and ∂J/∂b. Because J is built from φ(WX+b), these derivatives can be computed, so gradient descent applies.

In φ(WX + b), X is the data and W, b are the unknowns to be learned. Compare the simpler problem of fitting a line: in y = ax + b, the x and y values are data and a, b are the parameters to find.

y=ax+b

We want a and b such that the line y = ax + b fits the data: given points (x(i), y(i)), the prediction ax(i)+b should be close to y(i) for every i. The error at point i is the gap between ax(i)+b and y(i), and the loss adds up the squared gaps.

This loss is the quadratic cost: J(a, b) = (1/2m) Σi (ax(i)+b - y(i))².

:: a, b

:

Hypothesis: y(x) = ax + b. The parameters a and b determine the prediction; we look for the a, b that best fit the data by minimizing J(a, b), the loss viewed as a function of a and b. (Writing the parameters as θ0 and θ1, this is J(θ0, θ1).)

100

Gradient descentGradient descent TensorFlow Gradient descent

:

Gradient descent

xx

If the learning rate α is too small, gradient descent can be slow. If α is too large, gradient descent can overshoot the minimum; it may fail to converge, or even diverge.

Learning Rate
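A minimal sketch of gradient descent on the quadratic cost J(a, b) = (1/2m) Σ (ax+b - y)², fitting y = ax + b by hand in NumPy. The data and the learning rate alpha are made-up illustration values.

import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 1.0                      # the "true" line we hope to recover
a, b, alpha, m = 0.0, 0.0, 0.1, len(x)

for step in range(1000):
    err = a * x + b - y                # prediction errors
    a -= alpha * np.dot(err, x) / m    # a <- a - alpha * dJ/da
    b -= alpha * np.sum(err) / m       # b <- b - alpha * dJ/db
print(a, b)                            # close to 2.0 and 1.0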

Gradient descentgradient descent

Gradient descent on a network's cost C: every weight wk and bias bl is updated in the direction that decreases C.

Two common cost functions: the quadratic cost and the Cross Entropy.

[Figures: the loss surface J(θ0, θ1) and the path gradient descent takes down it]

http://arxiv.org/pdf/1210.0811v2.pdf (gradient descent method)

The softmax output over the ten digits (0-9) is compared with the correct label using cross entropy.

Cross entropy: H(p, q) = -Σx p(x)·log(q(x)), the sum running over the N possible values of x.

Example. p(x): p(1)=2/3, p(2)=1/4, p(3)=5/24, p(4)=5/24.  q(x): q(1)=1/8, q(2)=1/8, q(3)=5/8, q(4)=1/8.

x=1: (2/3)log(1/8) = -(2/3)log8 = -2log2
x=2: (1/4)log(1/8) = -(1/4)log8 = -(3/4)log2
x=3: (5/24)log(5/8) = (5/24)log5 - (15/24)log2
x=4: (5/24)log(1/8) = -(5/24)log8 = -(15/24)log2
Sum = -(48log2 + 18log2 - 5log5 + 15log2 + 15log2)/24 = -(96log2 - 5log5)/24

H(p, q) = -Σx p(x)log(q(x)) = (96log2 - 5log5)/24

p(x) (one-hot): p(1)=0, p(2)=1, p(3)=0, p(4)=0.  q(x): q(1)=2/3, q(2)=1/4, q(3)=5/24, q(4)=5/24.

x=1: 0·log(2/3),  x=2: 1·log(1/4),  x=3: 0·log(5/24),  x=4: 0·log(5/24)
log(1/4) = -log4 = -2log2, so H1(p, q) = 2log2

In H1 = -Σx p(x)log(q(x)), only the x=2 term survives, because p is one-hot with p(2)=1.

p(x): p(1)=0, p(2)=1, p(3)=0, p(4)=0.  q(x): q(1)=1/4, q(2)=2/3, q(3)=5/24, q(4)=5/24.

x=1: 0·log(1/4),  x=2: 1·log(2/3),  x=3: 0·log(5/24),  x=4: 0·log(5/24)
log(2/3) = log2 - log3

H2(p,q) = log3-log2

Again only the x=2 term of -Σx p(x)log(q(x)) survives, since p(2)=1. Because q(2)=2/3 is closer to the correct answer than 1/4 was, H2 = log3 - log2 is smaller than H1 = 2log2.

p(x): p(1)=0, p(2)=1, p(3)=0, p(4)=0.  q(x): q(1)=5/24, q(2)=2/3, q(3)=1/4, q(4)=5/24.

x=1: 0·log(5/24),  x=2: 1·log(2/3),  x=3: 0·log(1/4),  x=4: 0·log(5/24)
Only the log(2/3) term survives.

H3(p,q) = log3-log2

As before, only the x=2 term of -Σx p(x)log(q(x)) survives (p(2)=1), so H3 = H2.

p(x): p(1)=0, p(2)=1, p(3)=0, p(4)=0.  q(x): q(1)=0, q(2)=1, q(3)=0, q(4)=0.

x=1: 0·log0 = 0,  x=2: 1·log1 = 0,  x=3: 0·log0 = 0,  x=4: 0·log0 = 0

H4(p,q) = 0

H4 = -Σx p(x)log(q(x)) = 0: when the prediction q exactly matches the one-hot answer p (p(2)=1), the cross entropy reaches its minimum, zero.

Summary: when p(x) is a one-hot vector with p(i)=1, H(p, q) = -Σx p(x)log(q(x)) reduces to -log(q(i)). Since 0 ≤ q(i) ≤ 1 we have log(q(i)) ≤ 0, so H(p, q) ≥ 0. As q(i) → 1, -log(q(i)) → 0; as q(i) → 0, -log(q(i)) → ∞. The closer q(i) is to 1 (the correct answer), the smaller H(p, q).
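A small NumPy check of this cross entropy against the H1 and H2 examples above (NumPy is assumed only for illustration; the p and q values are the ones on these slides):

import numpy as np

def cross_entropy(p, q):
    return -np.sum(np.asarray(p) * np.log(np.asarray(q, dtype=float)))

p  = [0, 1, 0, 0]                          # one-hot: the correct answer is x = 2
q1 = [2/3, 1/4, 5/24, 5/24]                # prediction that favours the wrong class
q2 = [1/4, 2/3, 5/24, 5/24]                # prediction that favours the correct class

print(cross_entropy(p, q1), 2 * np.log(2))          # both ~1.386  (H1 = 2 log 2)
print(cross_entropy(p, q2), np.log(3) - np.log(2))  # both ~0.405  (H2 = log 3 - log 2)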

So, with p one-hot, making the cross entropy small forces q toward p.

A second form of cross entropy:
H(p, q) = -Σx { p(x)log(q(x)) + (1-p(x))log(1-q(x)) }
Here 0 ≤ p(x) ≤ 1 and 0 ≤ q(x) ≤ 1, hence also 0 ≤ 1-p(x) ≤ 1 and 0 ≤ 1-q(x) ≤ 1.

Since p(x) ≥ 0 and log(q(x)) ≤ 0, p(x)log(q(x)) ≤ 0. Likewise (1-p(x)) ≥ 0 and log(1-q(x)) ≤ 0, so (1-p(x))log(1-q(x)) ≤ 0. Therefore p(x)log(q(x)) + (1-p(x))log(1-q(x)) ≤ 0.

Hence H(p, q) ≥ 0.

H(p, q) = -Σx { p(x)log(q(x)) + (1-p(x))log(1-q(x)) }

p(x)p(x)

H(p, q) = -Σx { p(x)log(q(x)) + (1-p(x))log(1-q(x)) }

H(p) = -Σx p(x)log(p(x)) is the Shannon entropy (the binary entropy in the two-valued case); it is 0 exactly when every p(x) is 0 or 1.

The same example with the second form. p(x): p(1)=2/3, p(2)=1/4, p(3)=5/24, p(4)=5/24.  q(x): q(1)=1/8, q(2)=1/8, q(3)=5/8, q(4)=1/8.

x=1: (2/3)log(1/8) + (1/3)log(7/8) = (1/3)log7 - 3log2
x=2: (1/4)log(1/8) + (3/4)log(7/8) = (3/4)log7 - 3log2
x=3: (5/24)log(5/8) + (19/24)log(3/8) = (5/24)log5 + (19/24)log3 - 3log2
x=4: (5/24)log(1/8) + (19/24)log(7/8) = (19/24)log7 - 3log2
Sum = (45log7 + 5log5 + 19log3 - 288log2)/24
H(p, q) = -Sum = (288log2 - 45log7 - 5log5 - 19log3)/24

p(x): p(1)=0, p(2)=1, p(3)=0, p(4)=0.  q(x): q(1)=2/3, q(2)=1/4, q(3)=5/24, q(4)=5/24.

x=1: 0·log(2/3) + 1·log(1-2/3) = log(1/3) = -log3
x=2: 1·log(1/4) + 0·log(1-1/4) = log(1/4) = -2log2
x=3: 0·log(5/24) + 1·log(1-5/24) = log(19/24) = log19 - log24
x=4: 0·log(5/24) + 1·log(1-5/24) = log(19/24) = log19 - log24
Sum = -log3 - 2log2 + 2log19 - 2log24 = -log3 - 2log2 + 2log19 - 2(log3 + 3log2) = -3log3 - 8log2 + 2log19

H1(p,q) = 3log3+8log2-2log19

H1


H1

H(p, q) = -Σx { p(x)log(q(x)) + (1-p(x))log(1-q(x)) }, where p(x) is the one-hot target and q(x) is the prediction.

p(x): p(1)=0, p(2)=1, p(3)=0, p(4)=0.  q(x): q(1)=1/4, q(2)=2/3, q(3)=5/24, q(4)=5/24.

x=1: 0·log(1/4) + 1·log(1-1/4) = log(3/4) = log3 - 2log2
x=2: 1·log(2/3) + 0·log(1-2/3) = log(2/3) = log2 - log3
x=3: 0·log(5/24) + 1·log(1-5/24) = log(19/24) = log19 - log24
x=4: 0·log(5/24) + 1·log(1-5/24) = log(19/24) = log19 - log24
Sum = (log3 - 2log2) + (log2 - log3) + 2(log19 - log24) = -7log2 - 2log3 + 2log19

H2(p,q) = 7log2 + 2log3 - 2log19

H1 - H2 = (3log3 + 8log2 - 2log19) - (7log2 + 2log3 - 2log19) = log3 + log2 > 0

H2

p(x): p(1)=0, p(2)=1, p(3)=0, p(4)=0.  q(x): q(1)=5/24, q(2)=2/3, q(3)=1/4, q(4)=5/24.

x=1: 0·log(5/24) + 1·log(1-5/24) = log(19/24) = log19 - log24
x=2: 1·log(2/3) + 0·log(1-2/3) = log(2/3) = log2 - log3
x=3: 0·log(1/4) + 1·log(1-1/4) = log(3/4) = log3 - 2log2
x=4: 0·log(5/24) + 1·log(1-5/24) = log(19/24) = log19 - log24
These are the same four terms as in H2, just in a different order.

H3(p,q) = H2(p,q) = 7log2 + 2log3 - 2log19

H3

p(x): p(1)=0, p(2)=1, p(3)=0, p(4)=0.  q(x): q(1)=0, q(2)=3/4, q(3)=1/8, q(4)=1/8.

x=1: 0·log0 + 1·log(1-0) = 0
x=2: 1·log(3/4) + 0·log(1-3/4) = log(3/4) = log3 - 2log2
x=3: 0·log(1/8) + 1·log(1-1/8) = log(7/8) = log7 - 3log2
x=4: 0·log(1/8) + 1·log(1-1/8) = log(7/8) = log7 - 3log2
Sum = log3 - 2log2 + 2(log7 - 3log2) = log3 + 2log7 - 8log2

H4(p,q) = 8log2-2log7-log3

H3 - H4 = (7log2 + 2log3 - 2log19) - (8log2 - 2log7 - log3) = 3log3 + 2log7 - log2 - 2log19 > 0

It is positive because 3log3 + 2log7 = log(27×49) = log1323 while log2 + 2log19 = log(2×19²) = log722, and 1323 > 722. So H4 is smaller still.

p(x): p(1)=0, p(2)=1, p(3)=0, p(4)=0.  q(x): q(1)=0, q(2)=1, q(3)=0, q(4)=0.

x=1: 0·log0 + 1·log(1-0) = 0
x=2: 1·log1 + 0·log(1-1) = 0
x=3: 0·log0 + 1·log(1-0) = 0
x=4: 0·log0 + 1·log(1-0) = 0

H5(p,q) = 0

H5

TensorFlow --

TensorFlowTensorFlow

φ(WX + b), column-vector form: X is the column vector [X1,X2,X3]T, W is a 3×3 matrix, b is [b1,b2,b3]T.

φ(XW + b), row-vector form: X is the row vector [X1,X2,X3], W is 3×3, b is [b1,b2,b3]. TensorFlow computes φ(XW + b) in this form.

In TensorFlow you define X, W, and b as tensors, and the framework evaluates φ(XW + b) for you.

A layer with 4 inputs and 3 outputs: X = [X1,X2,X3,X4], W is a 4×3 matrix, b = [b1,b2,b3].

A layer with 2 inputs and 5 outputs: X = [X1,X2], W is 2×5, b = [b1,b2,b3,b4,b5]. In general, for n inputs and m outputs, X has n elements, W is n×m, and b has m elements (see the sketch below).
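A minimal sketch of such a layer in TensorFlow (1.x-style API, as used throughout this deck), assuming n = 4 inputs and m = 3 outputs:

import tensorflow as tf

X = tf.placeholder("float", [None, 4])     # a batch of 4-element inputs
W = tf.Variable(tf.zeros([4, 3]))          # n x m weight matrix
b = tf.Variable(tf.zeros([3]))             # one bias per output
Y = tf.nn.relu(tf.matmul(X, W) + b)        # phi(XW + b); output shape [None, 3]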

TensorFlowTensorFlow

[Figure: a fully connected (Full Connect) network with inputs X1-X4, hidden layers H1-H3 and M1-M4, and outputs Y1-Y2; every node in one layer is connected to every node in the next.]


[Figure: the same network written layer by layer: X → H → M → Y.]
Hidden layer H is computed from the input X with weights WH and bias bH; hidden layer M is computed from H with WM and bM; the output Y is computed from M with WY and bY. Each layer has its own W and b.


Layer sizes for X → H → M → Y with a 4-element input X: H has 3 units (WH is 4×3, bH has 3 elements), M has 4 units (WM is 3×4, bM has 4), Y has 2 units (WY is 4×2, bY has 2). A TensorFlow sketch follows.
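A sketch of this network in TensorFlow (1.x-style API). The layer sizes come from the slide above; the choice of ReLU for the hidden layers and of random-normal initial values are assumptions for illustration.

import tensorflow as tf

X  = tf.placeholder("float", [None, 4])
WH = tf.Variable(tf.random_normal([4, 3])); bH = tf.Variable(tf.zeros([3]))
WM = tf.Variable(tf.random_normal([3, 4])); bM = tf.Variable(tf.zeros([4]))
WY = tf.Variable(tf.random_normal([4, 2])); bY = tf.Variable(tf.zeros([2]))

H = tf.nn.relu(tf.matmul(X, WH) + bH)    # hidden layer H: 3 units
M = tf.nn.relu(tf.matmul(H, WM) + bM)    # hidden layer M: 4 units
Y = tf.matmul(M, WY) + bY                # output layer Y: 2 units (activation applied later)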

TensorFlow

784->8

MNIST example: the input is a 784-element vector (a 28×28 image flattened). Hidden layer: 15 units, WH is 784×15, bH has 15 elements. Output layer: 10 units, WO is 15×10, bO has 10 elements.

https://www.tensorflow.org/

Logit Layer, ReLU Layer

MatMul, BiasAdd

The logit layer's activator is Softmax; the ReLU layer's activator is ReLU.

GitHub

TensorFlow

An LSTM cell uses the sigmoid (σ) and tanh activations internally.

Chris Olah "Understanding LSTM Networks"http://colah.github.io/posts/2015-08-Understanding-LSTMs/

-- TensorFlow

Once more, as TensorFlow sees it:
A layer with 4 inputs and 3 outputs: X = [X1,X2,X3,X4], W is 4×3, b = [b1,b2,b3]; the output is a 3-element vector.
A layer with 2 inputs and 5 outputs: X = [X1,X2], W is 2×5, b = [b1,b2,b3,b4,b5]; the output is a 5-element vector.

TensorFlow represents all of these (X, W, b, and the outputs) as tensors: n-dimensional arrays.

A tensor's rank (also called order or degree) is its number of dimensions n. In Python, a rank-2 tensor can be written as a nested list:

t = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

This t has rank 2. A rank-2 tensor is indexed with two indices, t[i, j]; a rank-3 tensor with three, t[i, j, k].

In Python notation:
Rank 0 (scalar): s = 483
Rank 1 (vector): v = [1.1, 2.2, 3.3]
Rank 2 (matrix): m = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
Rank 3 (3-tensor): t = [[[2], [4], [6]], [[8], [10], [12]], [[14], [16], [18]]]
Rank n: an n-tensor, and so on.

TensorFlow

Rank | Shape            | Dimension number | Example
0    | []               | 0-D              | A 0-D tensor. A scalar.
1    | [D0]             | 1-D              | A 1-D tensor with shape [5].
2    | [D0, D1]         | 2-D              | A 2-D tensor with shape [3, 4].
3    | [D0, D1, D2]     | 3-D              | A 3-D tensor with shape [1, 4, 3].
n    | [D0, D1, ... Dn] | n-D              | A tensor with shape [D0, D1, ... Dn].
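A quick sketch (TensorFlow 1.x-style API) showing rank and shape for a few constant tensors; the values reuse the examples above.

import tensorflow as tf

s = tf.constant(483)                                  # rank 0, shape []
v = tf.constant([1.1, 2.2, 3.3])                      # rank 1, shape [3]
m = tf.constant([[1, 2, 3], [4, 5, 6], [7, 8, 9]])    # rank 2, shape [3, 3]

print(s.get_shape())   # ()
print(v.get_shape())   # (3,)
print(m.get_shape())   # (3, 3)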

TensorFlowTensorFlowTensorFlowTensorFlow

TensorFlowTensorFlow

TensorFlowXWb

5Wb3(XW+b)

WbBack PropagationVariableXWb

PythonTensorFlow tf TensorFlownnNeural Nettf.Variable tf.Variabletf.matmul + tf.nn.relu(XW+b)

Example: a single hidden layer.
# weights and biases for the layer
weights = tf.Variable( ... )   # the weight matrix W
biases = tf.Variable( ... )    # the bias vector b
# relu(XW + b): 'images' is the input X, 'hidden1' the layer output
hidden1 = tf.nn.relu(tf.matmul(images, weights) + biases)
Here images, weights, biases, and hidden1 are all tensors; weights and biases are Variables.

# first hidden layer
weights1 = tf.Variable( ... )
biases1 = tf.Variable( ... )
hidden1 = tf.nn.relu(tf.matmul(images, weights1) + biases1)

# second hidden layer: its input is hidden1, the output of the first
weights2 = tf.Variable( ... )
biases2 = tf.Variable( ... )
hidden2 = tf.nn.relu(tf.matmul(hidden1, weights2) + biases2)

Each layer has its own W (weights) and b (biases); images is the input.

[Figure: the two layers above drawn as a TensorFlow computation graph: images → (weights1, biases1, relu) → hidden1 → (weights2, biases2, relu) → hidden2]

# first hidden layer
weights1 = tf.Variable( ... )
biases1 = tf.Variable( ... )
hidden1 = tf.nn.relu(tf.matmul(images, weights1) + biases1)

# second hidden layer
weights2 = tf.Variable( ... )
biases2 = tf.Variable( ... )
hidden2 = tf.nn.relu(tf.matmul(hidden1, weights2) + biases2)

# output (logit) layer
weights3 = tf.Variable( ... )
biases3 = tf.Variable( ... )
logits = tf.matmul(hidden2, weights3) + biases3

Note that no activator is applied to the logit layer here; its activator (softmax) is applied later.

TensorFlowTensorFlowVariableTensorFlow

TensorFlow Variables keep their values across training steps and can be saved to and restored from disk (Save / Restore). TensorFlow also provides helper functions for creating their initial values.

Helpers such as tf.random_normal and tf.zeros build the initial values; you pass them the shape.
# Create two variables.
weights = tf.Variable(tf.random_normal([784, 200], stddev=0.35), name="weights")
biases = tf.Variable(tf.zeros([200]), name="biases")

# Create the variables.
weights = tf.Variable(tf.random_normal([784, 200], stddev=0.35), name="weights")
biases = tf.Variable(tf.zeros([200]), name="biases")
...
# Add an op to initialize the variables.
init_op = tf.initialize_all_variables()

# Later, when launching the model:
with tf.Session() as sess:
    # Run the init operation.
    sess.run(init_op)
    ...
The variables are actually initialized when the Session runs init_op: weights gets shape [784, 200] and biases gets shape [200].

# Create a variable with a random 784x200 initial value.
weights = tf.Variable(tf.random_normal([784, 200], stddev=0.35), name="weights")

# Create w2 with the same initial value as weights.
w2 = tf.Variable(weights.initialized_value(), name="w2")

# Create w_twice with twice the initial value of weights.
w_twice = tf.Variable(weights.initialized_value() * 2.0, name="w_twice")

Saving variables:
# Create some variables.
v1 = tf.Variable(..., name="v1")
v2 = tf.Variable(..., name="v2")
...

# Add an op to initialize the variables.
init_op = tf.initialize_all_variables()
# Add a Saver op used for both save and restore.
saver = tf.train.Saver()

# Later, launch the model, initialize it, do some work, and save the variables.
with tf.Session() as sess:
    sess.run(init_op)
    # ... do some work with the model ...
    # Save the variables to disk.
    save_path = saver.save(sess, "/tmp/model.ckpt")
    print("Model saved in file: %s" % save_path)

Restoring variables:
# Create the same variables.
v1 = tf.Variable(..., name="v1")
v2 = tf.Variable(..., name="v2")
...
# Add a Saver op used for both save and restore.
saver = tf.train.Saver()

# Later, launch the model and restore the saved variables.
with tf.Session() as sess:
    # Restore variables from disk (no init op is needed when restoring).
    saver.restore(sess, "/tmp/model.ckpt")
    print("Model restored")
    # ... do some work with the model ...

Training means running gradient descent on a loss function; two loss functions appear in the examples below. TensorFlow implements gradient descent in the GradientDescentOptimizer class, one of several Optimizer classes TensorFlow provides.

Example 1: the quadratic cost.
loss = tf.reduce_mean(tf.square(y - y_data))
optimizer = tf.train.GradientDescentOptimizer(0.5)
train = optimizer.minimize(loss)
The loss is the mean of the squared errors (y - y_data)², i.e. the quadratic cost C = (1/2m) Σ (y - y_data)² up to a constant factor; the argument 0.5 is the learning rate.

Example 2: the cross entropy.
y_ = tf.placeholder("float", [None, 10])
cross_entropy = -tf.reduce_sum(y_ * tf.log(y))
optimizer = tf.train.GradientDescentOptimizer(0.01)
train = optimizer.minimize(cross_entropy)
The loss is C = -Σ y_·log(y), the cross entropy between the one-hot labels y_ and the predictions y.

TensorFlow

Complete example: fitting a line y = W·x_data + b.
...
W = tf.Variable(tf.random_uniform([1], -1.0, 1.0))
b = tf.Variable(tf.zeros([1]))
y = W * x_data + b

# Minimize the mean squared errors.
loss = tf.reduce_mean(tf.square(y - y_data))
optimizer = tf.train.GradientDescentOptimizer(0.5)
train = optimizer.minimize(loss)

# Before starting, initialize the variables. We will 'run' this first.
init = tf.initialize_all_variables()

# Launch the graph.
sess = tf.Session()
sess.run(init)

# Fit the line.
for step in xrange(201):
    sess.run(train)
    if step % 20 == 0:
        print(step, sess.run(W), sess.run(b))


Complete example: MNIST softmax regression.
...
# Create the model
x = tf.placeholder("float", [None, 784])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
y = tf.nn.softmax(tf.matmul(x, W) + b)

# Define loss and optimizer
y_ = tf.placeholder("float", [None, 10])
cross_entropy = -tf.reduce_sum(y_ * tf.log(y))
train_step = tf.train.GradientDescentOptimizer(0.01) \
    .minimize(cross_entropy)

# Train
tf.initialize_all_variables().run()
for i in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    train_step.run({x: batch_xs, y_: batch_ys})

https://goo.gl/MwscZO


Convolutional Neural Networks in TensorFlow: instead of φ(XW + b), a convolutional layer computes φ(conv(X, W, ...) + b).

Filter W: [32,32,1,32]
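A minimal sketch of such a layer (TensorFlow 1.x-style API). The filter shape [5, 5, 1, 32] used here (5×5 patches, 1 input channel, 32 output channels), the 28×28 input size, and the stride/padding choices are all assumptions for illustration, not values taken from this deck.

import tensorflow as tf

X = tf.placeholder("float", [None, 28, 28, 1])           # a batch of 28x28 grayscale images
W = tf.Variable(tf.truncated_normal([5, 5, 1, 32], stddev=0.1))
b = tf.Variable(tf.zeros([32]))

conv = tf.nn.conv2d(X, W, strides=[1, 1, 1, 1], padding="SAME")
H = tf.nn.relu(conv + b)                                  # phi(conv(X, W) + b)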

Announcement: Neural Net hands-on session, 4/4 from 19:00, at Google.

(Google)

3/23-24: Google's GCP Next (Google Cloud Platform Next).