Learning in Multi-Layer Perceptrons - Back-Propagation
Neural Computation: Lecture 7
© John A. Bullinaria, 2015
1. Linear Separability and the Need for More Layers
2. Notation for Multi-Layer Networks
3. Multi-Layer Perceptrons (MLPs)
4. Learning in Multi-Layer Perceptrons
5. Choosing Appropriate Activation and Cost Functions
6. Deriving the Back-Propagation Algorithm
7. Further Practical Considerations for Training MLPs
8. How Many Hidden Layers and Hidden Units?
9. Different Learning Rates for Different Layers?
Linear Separability and the Need for More Layers
We have already shown that it is not possible to find weights which enable Single Layer
Perceptrons to deal with non-linearly separable problems like XOR:
[Figure: the four XOR patterns plotted in the (in1, in2) plane; the decision boundaries that a single layer can realise, e.g. those for OR and AND, are straight lines, and no single straight line separates the two XOR classes.]
However, Multi-Layer Perceptrons (MLPs) are able to cope with non-linearly separable problems. Historically, the problem was that there were no known learning algorithms for training MLPs. Fortunately, training them is now known to be quite straightforward.
Notation for Multi-Layer Networks
Dealing with multi-layer networks is easy if a sensible notation is adopted. We simply
need another label (n) to tell us which layer in the network we are dealing with:
[Figure: unit j in layer n of the network, receiving the weighted activations out_i^(n-1) w_ij^(n) from the units i of layer n-1 and passing its own activation on to layer n+1.]

$$\mathrm{out}_j^{(n)} = f^{(n)}\!\left(\sum_i \mathrm{out}_i^{(n-1)}\, w_{ij}^{(n)}\right)$$

Each unit j in layer n receives activations out_i^(n-1) w_ij^(n) from the previous layer of processing units and sends activations out_j^(n) to the next layer of units.
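As a concrete illustration, a minimal NumPy sketch of this propagation step could look as follows (the function name layer_forward, the sigmoid choice and the example numbers are illustrative assumptions, not part of the lecture notes):

import numpy as np

def layer_forward(out_prev, W, f):
    # Compute out_j^(n) = f^(n)( sum_i out_i^(n-1) * w_ij^(n) ) for all units j.
    # out_prev : activations of layer n-1, shape (n_prev,)
    # W        : weight matrix with W[i, j] = w_ij^(n), shape (n_prev, n_units)
    # f        : activation function f^(n), applied element-wise
    return f(out_prev @ W)

# Example: three layer-(n-1) units feeding two layer-n units with sigmoid activations
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
out_prev = np.array([0.2, 0.9, 0.5])
W = np.array([[ 0.1, -0.4],
              [ 0.3,  0.8],
              [-0.2,  0.5]])
print(layer_forward(out_prev, W, sigmoid))   # the activations out^(n) of the two units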
Multi-Layer Perceptrons (MLPs)
Conventionally, the input layer is layer 0, and when we talk of an N layer network we
mean there are N layers of weights and N non-input layers of processing units. Thus a
two-layer Multi-Layer Perceptron takes the form:
[Figure: a two-layer MLP with ninputs input units (layer 0), nhidden hidden units (layer 1) and noutputs output units (layer 2); weights w_ij^(1) connect the inputs to the hidden units, and weights w_jk^(2) connect the hidden units to the outputs.]

$$\mathrm{out}_i^{(0)} = \mathrm{in}_i$$

$$\mathrm{out}_j^{(1)} = f^{(1)}\!\left(\sum_i \mathrm{out}_i^{(0)}\, w_{ij}^{(1)}\right)$$

$$\mathrm{out}_k^{(2)} = f^{(2)}\!\left(\sum_j \mathrm{out}_j^{(1)}\, w_{jk}^{(2)}\right)$$
It is clear how we can add in further layers, though for most practical purposes two
layers will be sufficient. Note that there is nothing stopping us from having different
activation functions f(n)(x) for different layers, or even different units within a layer.
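To make the two-layer forward pass concrete, here is a small sketch that uses it to solve XOR with hand-picked weights (the step activation, the bias implemented as an extra always-on input, and the particular weight values are illustrative assumptions; the equations above leave biases implicit and do not prescribe these values):

import numpy as np

step = lambda x: (x > 0).astype(float)      # threshold activation used for illustration

# The four XOR input patterns, each with a constant 1 appended as a bias input
X = np.array([[0, 0, 1],
              [0, 1, 1],
              [1, 0, 1],
              [1, 1, 1]], dtype=float)

# Hand-picked weights: hidden unit 0 computes OR, hidden unit 1 computes AND,
# and the output unit computes OR AND NOT AND, i.e. XOR.
W1 = np.array([[ 1.0,  1.0],
               [ 1.0,  1.0],
               [-0.5, -1.5]])                # w^(1), shape (ninputs+1, nhidden)
W2 = np.array([[ 1.0],
               [-2.0],
               [-0.5]])                      # w^(2), shape (nhidden+1, noutputs)

out1 = step(X @ W1)                          # out_j^(1) = f^(1)(sum_i out_i^(0) w_ij^(1))
out1 = np.hstack([out1, np.ones((4, 1))])    # append the bias unit to the hidden layer
out2 = step(out1 @ W2)                       # out_k^(2) = f^(2)(sum_j out_j^(1) w_jk^(2))

print(out2.ravel())                          # [0. 1. 1. 0.] -- the XOR outputs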
The Need For Non-Linearity
We have noted before that if we have a regression problem with non-binary network
outputs, then it is appropriate to have a linear output activation function. So why not
simply use linear activation functions on the hidden layers as well?
With activation functions f(n)(x) at layer n, the outputs of a two-layer MLP are
$$\mathrm{out}_k^{(2)} = f^{(2)}\!\left(\sum_j \mathrm{out}_j^{(1)}\, w_{jk}^{(2)}\right) = f^{(2)}\!\left(\sum_j f^{(1)}\!\left(\sum_i \mathrm{in}_i\, w_{ij}^{(1)}\right) w_{jk}^{(2)}\right)$$
so if the hidden layer activations are linear, i.e. f(1)(x) = x, this simplifies to
$$\mathrm{out}_k^{(2)} = f^{(2)}\!\left(\sum_i \mathrm{in}_i \left(\sum_j w_{ij}^{(1)}\, w_{jk}^{(2)}\right)\right)$$

But this is equivalent to a single layer network with weights w_ik = Σ_j w_ij^(1) w_jk^(2), and we know that such a network cannot deal with non-linearly separable problems.
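A quick numerical check (a sketch with arbitrary layer sizes and random weights) makes this collapse concrete:

import numpy as np

rng = np.random.default_rng(0)
W1 = rng.standard_normal((4, 3))     # w^(1): inputs to hidden units
W2 = rng.standard_normal((3, 2))     # w^(2): hidden units to outputs
x  = rng.standard_normal(4)          # an arbitrary input vector

two_layer    = (x @ W1) @ W2         # output-layer input sums with a linear hidden layer
single_layer = x @ (W1 @ W2)         # the same sums from a single layer with w_ik = sum_j w_ij^(1) w_jk^(2)

print(np.allclose(two_layer, single_layer))   # True -- applying f^(2) to either gives identical outputs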