Backpropagation

Sargur Srihari

Topics in Backpropagation

1. Forward Propagation
2. Loss Function and Gradient Descent
3. Computing derivatives using chain rule
4. Computational graph for backpropagation
5. Backprop algorithm
6. The Jacobian matrix


A neural network with one hidden layer

[Figure: augmented network, with the biases absorbed as weights from additional units fixed at 1]

Number of weights in w:

T = (D+1)M + (M+1)K = M(D+K+1) + K

D input variables x_1, .., x_D.

M hidden unit activations:

a_j = \sum_{i=1}^{D} w_{ji}^{(1)} x_i + w_{j0}^{(1)},   where j = 1, .., M

Hidden unit activation functions: z_j = h(a_j)

K output activations:

a_k = \sum_{j=1}^{M} w_{kj}^{(2)} z_j + w_{k0}^{(2)},   where k = 1, .., K

Output activation functions: y_k = \sigma(a_k)

Putting the two stages together, the overall network function is

y_k(x, w) = \sigma\left( \sum_{j=1}^{M} w_{kj}^{(2)} \, h\left( \sum_{i=1}^{D} w_{ji}^{(1)} x_i + w_{j0}^{(1)} \right) + w_{k0}^{(2)} \right)
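As a concrete sketch of these equations, the forward pass can be coded directly from the sums above. The NumPy fragment below is illustrative only: the choices h = tanh, sigma = logistic sigmoid, and the random weight values are assumptions, not part of the slide.

    import numpy as np

    def forward(x, W1, b1, W2, b2):
        a = W1 @ x + b1                   # a_j = sum_i w_ji^(1) x_i + w_j0^(1)
        z = np.tanh(a)                    # z_j = h(a_j); tanh is an assumed choice of h
        a_out = W2 @ z + b2               # a_k = sum_j w_kj^(2) z_j + w_k0^(2)
        y = 1.0 / (1.0 + np.exp(-a_out))  # y_k = sigma(a_k); logistic sigmoid assumed
        return y

    D, M, K = 3, 4, 2                     # example sizes for input, hidden, output
    rng = np.random.default_rng(0)
    W1, b1 = rng.normal(size=(M, D)), np.zeros(M)  # (D+1)M first-layer weights
    W2, b2 = rng.normal(size=(K, M)), np.zeros(K)  # (M+1)K second-layer weights
    x = rng.normal(size=D)
    print(forward(x, W1, b1, W2, b2))

Counting the entries of W1, b1, W2, b2 recovers the total T = (D+1)M + (M+1)K given above.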


Matrix Multiplication: Forward Propagation

- Each layer is a function of the layer that preceded it:
- First layer: z = h(W^{(1)T} x + b^{(1)})
- Second layer: y = \sigma(W^{(2)T} z + b^{(2)})
- Note that W is a matrix rather than a vector

- Example with D=3, M=3:

x = [x_1, x_2, x_3]^T

W_1^{(1)} = [W_{11}, W_{12}, W_{13}]^T,  W_2^{(1)} = [W_{21}, W_{22}, W_{23}]^T,  W_3^{(1)} = [W_{31}, W_{32}, W_{33}]^T

W_1^{(2)} = [W_{11}, W_{12}, W_{13}]^T,  W_2^{(2)} = [W_{21}, W_{22}, W_{23}]^T,  W_3^{(2)} = [W_{31}, W_{32}, W_{33}]^T

[Figure: the first network layer and the network layer output, written in matrix multiplication notation]
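In code, each layer is then a single matrix multiply plus a bias. A minimal sketch of the D=3, M=3 example follows; the weight values and activation functions are assumed for illustration, and each row of W1 here holds one vector W_j^{(1)}, so W1 @ x plays the role of W^{(1)T} x on the slide.

    import numpy as np

    x = np.array([0.5, -1.0, 2.0])           # x = [x1, x2, x3]^T (example values)
    W1 = np.array([[0.1, 0.2, 0.3],          # rows: W_1^(1), W_2^(1), W_3^(1)
                   [0.4, 0.5, 0.6],
                   [0.7, 0.8, 0.9]])
    b1 = np.zeros(3)
    W2 = np.eye(3)                           # rows: W_1^(2), W_2^(2), W_3^(2)
    b2 = np.zeros(3)

    z = np.tanh(W1 @ x + b1)                 # first layer:  z = h(W^(1)T x + b^(1))
    y = 1.0 / (1.0 + np.exp(-(W2 @ z + b2))) # second layer: y = sigma(W^(2)T z + b^(2))
    print(z)
    print(y)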


Loss and Regularization

y = f(x, w)

E = \frac{1}{N} \sum_{i=1}^{N} E_i\left( f(x^{(i)}, w), t_i \right)

[Figure: the forward pass computes the output y, the per-example loss E_i, and the regularizer R(W); the backward pass computes the gradient of E_i + R]
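A sketch of how this objective might be assembled and differentiated is below. The squared-error loss E_i, the L2 regularizer R(W), and the linear model standing in for f(x, w) are all assumed choices made to keep the gradient simple; they are not prescribed by the slide.

    import numpy as np

    def objective(w, X, T, lam=0.01):
        # Forward: E = (1/N) sum_i E_i(f(x_i, w), t_i) + R(w)
        Y = X @ w                                       # f(x, w): assumed linear model
        E = 0.5 * np.mean(np.sum((Y - T) ** 2, axis=1)) # assumed squared-error E_i
        R = 0.5 * lam * np.sum(w ** 2)                  # assumed L2 regularizer R(W)
        return E + R

    def gradient(w, X, T, lam=0.01):
        # Backward: gradient of E_i + R, analytic for the linear case above
        Y = X @ w
        return X.T @ (Y - T) / X.shape[0] + lam * w

    rng = np.random.default_rng(0)
    X, T = rng.normal(size=(5, 3)), rng.normal(size=(5, 2))
    w = rng.normal(size=(3, 2))
    print(objective(w, X, T))
    print(gradient(w, X, T).shape)

For a full network, the analytic gradient in the backward pass is exactly what the backprop algorithm of the following sections computes via the chain rule.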
