The MATLAB Notebook v1.5.2
EMGT378
Homework-6
By: Yuping Wang
E10.4 In this exercise we will modify the reference pattern P2 from Problem P10.3:
p1 = [1; 1], t1 = 1;  p2 = [-1; -1], t2 = -1
i. Assume that the patterns occur with equal probability. Find the mean square error and sketch the contour plot.
ii. Find the maximum stable learning rate.
iii. Write a MATLAB M-file to implement the LMS algorithm for this problem. Take 40 steps of the algorithm for a stable learning rate. Use the zero vector as the initial guess. Sketch the trajectory on the contour plot.
iv. Take 40 steps of the algorithm after setting the initial values of both parameters to 1. Sketch the final decision boundary.
v. Compare the final parameters from parts (iii) and (iv). Explain your results.
Answer:
i. The mean square error can be expressed as F(x) = c - 2x'h + x'Rx, where c, h and R are calculated from the two equally probable pattern/target pairs:
c = E[t^2] = 0.5(1)^2 + 0.5(-1)^2 = 1
h = E[t z] = 0.5(1)[1; 1] + 0.5(-1)[-1; -1] = [1; 1]
R = E[z z'] = 0.5[1; 1][1 1] + 0.5[-1; -1][-1 -1] = [1 1; 1 1]
so the mean square error performance index is
F(x) = 1 - 2(x1 + x2) + (x1 + x2)^2
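As a quick cross-check of these values (a standalone Python sketch, since the rest of this notebook is MATLAB), c, h and R can be computed directly from the two pattern/target pairs:

```python
# Cross-check of c = E[t^2], h = E[t z], R = E[z z'] for the two
# equally probable pairs {z1 = [1, 1], t1 = 1} and {z2 = [-1, -1], t2 = -1}.
patterns = [([1, 1], 1), ([-1, -1], -1)]
prob = 0.5  # each pattern occurs with equal probability

c = sum(prob * t**2 for _, t in patterns)
h = [sum(prob * t * z[i] for z, t in patterns) for i in range(2)]
R = [[sum(prob * z[i] * z[j] for z, _ in patterns) for j in range(2)]
     for i in range(2)]

print(c)  # 1.0
print(h)  # [1.0, 1.0]
print(R)  # [[1.0, 1.0], [1.0, 1.0]]
```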
The eigenvalues and eigenvectors of the Hessian matrix of F(x), A = 2R, are:
A=2*[1 1;1 1];
[V,D] = eig (A)
V =
0.7071 0.7071
-0.7071 0.7071
D =
0 0
0 4.0000
Since the eigenvalues are non-negative and one of them is zero, F(x) has a weak minimum. The surface and contour plots of F(x) are shown in Fig.1 and Fig.2.
ii. The maximum stable learning rate will satisfy α < 2/λmax = 2/4 = 0.5.
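This bound can be verified numerically; for a symmetric 2x2 matrix the eigenvalues follow from the trace and determinant of the characteristic polynomial. A small Python check (not part of the original MATLAB solution):

```python
import math

# Hessian of F(x): A = 2R = [[2, 2], [2, 2]].
a, b, c, d = 2.0, 2.0, 2.0, 2.0
tr, det = a + d, a * d - b * c

# Eigenvalues of a 2x2 matrix from lambda^2 - tr*lambda + det = 0.
lam1 = (tr - math.sqrt(tr**2 - 4 * det)) / 2
lam2 = (tr + math.sqrt(tr**2 - 4 * det)) / 2
alpha_max = 2 / lam2  # LMS stability bound: alpha < 2 / lambda_max

print(lam1, lam2, alpha_max)  # 0.0 4.0 0.5
```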
iii. Take α = 0.2. After 40 iterations we get W = [0.5; 0.5] (W1 in the code), and the trajectory is shown in Fig.2 as dotted symbols.
clear
[X,Y] = meshgrid(-3 : .1 : 3);
F = 1 - 2 * (X + Y) + (X + Y).^2;
surf(X,Y,F)
title('Fig.1 Surface plot of Mean Square Error');
figure;
contour(X,Y,F)
title('Fig.2 Trajectory for different initial weights');
hold on;
% Initialize data
P = [1 -1;1 -1];
T = [1 -1];
alfa = 0.2;        % stable learning rate (< 0.5)
W1 = [0;0];        % initial guess for part (iii)
W2 = [1;1];        % initial guess for part (iv)
for k = 1 : 2
    if (k == 1)
        W = W1;
    else
        W = W2;
    end
    plot(W(1), W(2),'r*')
    text(-0.3,-0.3,'W_0 =(0,0)');
    text(1,1.2,'W_0 =(1,1)');
    % Train the network: 20 epochs x 2 patterns = 40 LMS steps
    for step = 1 : 20
        for i = 1 : 2
            a = purelin(W' * P(:,i));        % linear network output
            e = T(i) - a;                    % error
            W = W + 2 * alfa * e * P(:,i);   % LMS update
            if (k == 1)
                plot(W(1), W(2),'k.')
                W1 = W;
            else
                plot(W(1), W(2),'b+')
                W2 = W;
            end
        end
    end
end
W1
W2
hold off;
W1 =
0.5000
0.5000
W2 =
0.5000
0.5000
[Fig.1: surface plot of the mean square error]
[Fig.2: contour plot with the LMS trajectories from both initial weights]
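The 40-step run is easy to reproduce outside MATLAB. A minimal pure-Python sketch of the same LMS loop (purelin is simply the identity, so it is written inline) converges to [0.5, 0.5] from both initial guesses:

```python
def lms(w, patterns, alpha=0.2, epochs=20):
    """Run the LMS algorithm: w <- w + 2*alpha*e*p per presentation."""
    for _ in range(epochs):
        for p, t in patterns:
            a = w[0] * p[0] + w[1] * p[1]  # purelin (linear) output
            e = t - a                      # error
            w = [w[0] + 2 * alpha * e * p[0],
                 w[1] + 2 * alpha * e * p[1]]
    return w

pairs = [([1, 1], 1), ([-1, -1], -1)]
w_a = lms([0.0, 0.0], pairs)  # initial guess (0, 0)
w_b = lms([1.0, 1.0], pairs)  # initial guess (1, 1)
print(w_a, w_b)  # both close to [0.5, 0.5]
```

Both runs end at the minimum-norm point on the solution line, which is why W1 and W2 agree in the output above.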
iv. For initial weights [1; 1], after 40 iterations we get W = [0.5; 0.5] (W2 in the code), and the trajectory is also shown in Fig.2 as "+" symbols. The final decision boundary is given in Fig.3.
P = [1 -1;1 -1];
W = [0.5 0.5];
figure;
plot(P(1,1),P(2,1),'r+');
hold on;
plot(P(1,2),P(2,2),'r+');
%Decision Boundary
x = -2 : .1 : 2;
y =(-W(1)*x )/W(2);
plot(x,y);
axis([-2 2 -2 2]);
title('Fig.3 Decision boundary for E10.4');
hold off;
[Fig.3: decision boundary for E10.4]
v. From the figure above, we can see that the LMS algorithm, unlike the perceptron learning rule, places the decision boundary as far from the patterns as possible.
Also, we see that for the two different initial weights the solution converges to the same value. However, if other arbitrary initial points were selected, we could get different solutions, since the performance index has only a weak minimum, i.e. the minimum is not unique: there is a line along which the mean square error attains its minimum (zero).
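The weak minimum can be made concrete: the performance index from part (i) factors as F(x) = (1 - (x1 + x2))^2, so every point on the line x1 + x2 = 1 gives zero error. A quick Python check:

```python
def F(x1, x2):
    # Mean square error from part (i): 1 - 2(x1 + x2) + (x1 + x2)^2
    return 1 - 2 * (x1 + x2) + (x1 + x2) ** 2

# Several distinct points on the line x1 + x2 = 1 all give zero error.
vals = [F(x1, 1 - x1) for x1 in [-2.0, 0.0, 0.5, 3.0]]
print(vals)  # [0.0, 0.0, 0.0, 0.0]
```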
E10.5 We again use the reference patterns and targets from Problem P10.3, and assume that they occur with equal probability. This time we want to train an ADALINE network with a bias. We now have three parameters to find: x = [w1,1; w1,2; b].
i. Find the mean square error and the maximum stable learning rate.
ii. Write a MATLAB M-file to implement the LMS algorithm for this problem. Take 40 steps of the algorithm for a stable learning rate. Use the zero vector as the initial guess. Sketch the final decision boundary.
iii. Take 40 steps of the algorithm after setting the initial values of all parameters to 1. Sketch the final decision boundary.
iv. Compare the final parameters and the decision boundaries from parts (ii) and (iii). Explain your results.
Answer:
i. In this problem the patterns and targets are p1 = [1; 1], t1 = 1 and p2 = [1; -1], t2 = -1.
When the bias is considered, the input vector becomes z = [p; 1] and the parameter vector is x = [w1,1; w1,2; b], so c, h and R
can be calculated by:
c = E[t^2] = 1
h = E[t z] = 0.5(1)[1; 1; 1] + 0.5(-1)[1; -1; 1] = [0; 1; 0]
R = E[z z'] = 0.5[1; 1; 1][1 1 1] + 0.5[1; -1; 1][1 -1 1] = [1 0 1; 0 1 0; 1 0 1]
So the mean square error performance index is
F(x) = 1 - 2x2 + x1^2 + x2^2 + x3^2 + 2x1x3 = (x1 + x3)^2 + (x2 - 1)^2
The eigenvalues and eigenvectors of the Hessian matrix, A = 2R, are:
A=2*[1 0 1;0 1 0;1 0 1];
[V,D] = eig (A)
V =
0 0.7071 0.7071
1.0000 0 0
0 -0.7071 0.7071
D =
2.0000 0 0
0 0 0
0 0 4.0000
The maximum stable learning rate will satisfy α < 2/λmax = 2/4 = 0.5.
ii. Take α = 0.2 and the zero vector as the initial guess. After 40 iterations we get W ≈ [0; 1] and b ≈ 0. The decision boundary is given in Fig.4.
clear
% Initialize data
P = [1 1;1 -1];
T = [1 -1];
alfa = 0.2;      % stable learning rate (< 0.5)
W = [0;0];
b = 0;
% Train the network: 20 epochs x 2 patterns = 40 LMS steps
for step = 1 : 20
    for i = 1 : 2
        a = purelin(W' * P(:,i) + b);   % linear output with bias
        e = T(i) - a;                   % error
        W = W + 2 * alfa * e * P(:,i);  % LMS weight update
        b = b + 2 * alfa * e;           % LMS bias update
    end
end
W
b
% Display in graph
figure;
plot(P(1,1),P(2,1),'r+');
hold on;
plot(P(1,2),P(2,2),'r+');
% Decision boundary: W(1)*x + W(2)*y + b = 0
x = -2 : .1 : 2;
y = (-W(1,1)*x - b)/W(2,1);
plot(x,y);
axis([-2 2 -2 2]);
title('Fig.4 Decision boundary for x_0=[0 0 0]');
hold off;
W =
-0.0000
1.0000
b =
-3.0058e-015
[Fig.4: decision boundary for x_0 = [0 0 0]]
iii. Take x_0 = [1; 1; 1]. After 40 iterations we again get W ≈ [0; 1] and b ≈ 0. The decision boundary is given in Fig.5.
W =
0.0000
1.0000
b =
6.9295e-015
[Fig.5: decision boundary for x_0 = [1 1 1]]
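As a cross-check, a pure-Python version of the same biased LMS loop (a sketch, not the original M-file) reaches W ≈ [0; 1], b ≈ 0 from both initial points:

```python
def lms_bias(w, b, patterns, alpha=0.2, epochs=20):
    """LMS with bias: w <- w + 2*alpha*e*p, b <- b + 2*alpha*e."""
    for _ in range(epochs):
        for p, t in patterns:
            a = w[0] * p[0] + w[1] * p[1] + b  # purelin output
            e = t - a                          # error
            w = [w[0] + 2 * alpha * e * p[0],
                 w[1] + 2 * alpha * e * p[1]]
            b = b + 2 * alpha * e
    return w, b

pairs = [([1, 1], 1), ([1, -1], -1)]
w0, b0 = lms_bias([0.0, 0.0], 0.0, pairs)  # zero initial guess
w1, b1 = lms_bias([1.0, 1.0], 1.0, pairs)  # all-ones initial guess
print(w0, b0)  # ~[0, 1], ~0
print(w1, b1)  # ~[0, 1], ~0
```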
iv. Since the LMS algorithm tends to place the decision boundary as far from the patterns as possible, the resulting bias for each initial point is close to zero, so the boundary falls almost halfway between the two vectors.
Also, since the Hessian matrix has a zero eigenvalue, the performance index has a weak minimum.
E11.7 For the network shown in the figure below, the initial weights and biases are chosen to be [pic].
The network transfer functions are [pic], [pic],
and the input / target pair is given to be [pic].
Perform one iteration of backpropagation with [pic].
[Figure: two-layer network for E11.7]
Answer:
First find the derivative of the transfer functions:
[pic], [pic]
Propagate the input through the network:
[pic]
[pic]
Find sensitivities:
[pic]
[pic]
Update the weights and biases:
[pic]
[pic]
[pic]
[pic]
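The numerical values above depend on the figure, but the backpropagation steps themselves (forward pass, sensitivities, update) can be sanity-checked on any small logsig/purelin network by comparing a backprop gradient against a finite-difference estimate. The weights and the input/target pair below are made up for illustration; they are not the ones from the figure:

```python
import math

def logsig(n):
    return 1.0 / (1.0 + math.exp(-n))

def loss(w1, b1, w2, b2, p, t):
    """Squared error of a 1-2-1 logsig/purelin network."""
    a1 = [logsig(w1[i] * p + b1[i]) for i in range(2)]
    a2 = w2[0] * a1[0] + w2[1] * a1[1] + b2
    return (t - a2) ** 2

# Made-up weights and an input/target pair (illustrative only).
w1, b1 = [0.2, -0.4], [0.1, 0.3]
w2, b2 = [0.5, -0.2], 0.15
p, t = 1.0, 1.5

# Backprop: sensitivities s2 and s1, then the gradient w.r.t. w1[0].
a1 = [logsig(w1[i] * p + b1[i]) for i in range(2)]
a2 = w2[0] * a1[0] + w2[1] * a1[1] + b2
s2 = -2 * (t - a2)                       # output-layer sensitivity
s1_0 = a1[0] * (1 - a1[0]) * w2[0] * s2  # hidden sensitivity, unit 0
grad_bp = s1_0 * p                       # dF/dw1[0] via backprop

# Central finite-difference estimate of the same derivative.
eps = 1e-6
grad_fd = (loss([w1[0] + eps, w1[1]], b1, w2, b2, p, t)
           - loss([w1[0] - eps, w1[1]], b1, w2, b2, p, t)) / (2 * eps)

print(abs(grad_bp - grad_fd) < 1e-6)  # True
```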
E11.11 Write a MATLAB program to implement the backpropagation algorithm for the 1-2-1 network shown in Fig. 11.4. Choose the initial weights and biases to be random numbers uniformly distributed between -0.5 and 0.5 (using the MATLAB function rand), and train the network to approximate the function
g(p) = 1 + sin(πp/8), for -2 ≤ p ≤ 2.
Try several different values for the learning rate α, and use several different initial conditions. Discuss the convergence properties of the algorithm.
Answer:
Several cases were run using the following code, for learning rates α = 0.2, 0.4, and 0.6. For each α, two different initial conditions are used, and both the final approximation curve and some intermediate results are shown in the figures.
clear
% Initialize weights and biases randomly in [-0.5, 0.5]
W1 = rand(2,1) - 0.5;
W2 = rand(1,2) - 0.5;
b1 = rand(2,1) - 0.5;
b2 = rand - 0.5;
a1 = zeros(2,1);
% Output the initial set
W1_0 = W1
b1_0 = b1
W2_0 = W2
b2_0 = b2
alfa = 0.2;    % learning rate
tol = 0.001;   % tolerance
mse = 1;       % mean square error
iter = 0;
figure;
while (mse > tol)
    mse = 0;
    i = 0;
    iter = iter + 1;
    for P = -2 : .1 : 2
        i = i + 1;
        T = 1 + sin(pi*P/8);           % target function
        a1 = logsig(W1*P + b1);        % hidden layer output
        a2 = purelin(W2*a1 + b2);      % network output
        mse = mse + (T - a2)^2;
        A(i) = a2;
        % Derivative of logsig: a(1 - a)
        dlogsig = [(1 - a1(1))*a1(1) 0; 0 (1 - a1(2))*a1(2)];
        s2 = -2 * (T - a2);            % output-layer sensitivity
        s1 = dlogsig * W2' * s2;       % hidden-layer sensitivity
        W2 = W2 - alfa * s2 * a1';     % steepest descent updates
        W1 = W1 - alfa * s1 * P;
        b2 = b2 - alfa * s2;
        b1 = b1 - alfa * s1;
    end
    P = -2 : .1 : 2;
    if (mod(iter,10) == 0)
        plot(P,A,'g:')                 % intermediate results
    end
    hold on;
end
% Display in graph
P = -2 : .1 : 2;
T = 1 + sin(pi*P/8);
plot(P,T,'r-',P,A,'b+')
title('Fig6.1 learning rate = 0.2, initial set #1');
text(-1.8,1.7,'red ---- original function');
text(-1.8,1.6,'blue ---- approximation');
text(-1.8,1.5,'green ---- intermediate results');
xlabel('P'), ylabel('Target vs. output');
W1
b1
W2
b2
iter
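A condensed pure-Python version of the same training loop (logsig/purelin, seeded initialization so the run is repeatable) confirms that the per-sample updates drive the summed squared error down; this is a sketch of the algorithm, not a reproduction of the exact runs reported in the cases:

```python
import math, random

def logsig(n):
    return 1.0 / (1.0 + math.exp(-n))

random.seed(0)  # repeatable initial weights in [-0.5, 0.5]
w1 = [random.random() - 0.5 for _ in range(2)]
b1 = [random.random() - 0.5 for _ in range(2)]
w2 = [random.random() - 0.5 for _ in range(2)]
b2 = random.random() - 0.5
alpha = 0.2
samples = [i / 10.0 for i in range(-20, 21)]  # p = -2 : 0.1 : 2

def epoch_mse():
    """Summed squared error over one pass through the samples."""
    err = 0.0
    for p in samples:
        t = 1 + math.sin(math.pi * p / 8)
        a1 = [logsig(w1[i] * p + b1[i]) for i in range(2)]
        a2 = w2[0] * a1[0] + w2[1] * a1[1] + b2
        err += (t - a2) ** 2
    return err

mse_before = epoch_mse()
for _ in range(100):  # 100 epochs of per-sample backpropagation
    for p in samples:
        t = 1 + math.sin(math.pi * p / 8)
        a1 = [logsig(w1[i] * p + b1[i]) for i in range(2)]
        a2 = w2[0] * a1[0] + w2[1] * a1[1] + b2
        s2 = -2 * (t - a2)                               # output sensitivity
        s1 = [a1[i] * (1 - a1[i]) * w2[i] * s2 for i in range(2)]
        for i in range(2):
            w2[i] -= alpha * s2 * a1[i]
            w1[i] -= alpha * s1[i] * p
            b1[i] -= alpha * s1[i]
        b2 -= alpha * s2
mse_after = epoch_mse()
print(mse_before, mse_after)  # the error after training is much smaller
```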
Case 1.1: α = 0.2; the solution converged after 318 iterations.
W1_0 =
0.1813
-0.1205
b1_0 =
0.2095
-0.0711
W2_0 =
0.3318 0.0028
b2_0 =
-0.1954
W1 =
0.7448
0.6206
b1 =
-0.0800
-0.1009
W2 =
1.5828 0.7052
b2 =
-0.0993
iter =
318
[Figure: approximation results for Case 1.1]
Case 1.2: α = 0.2; the solution converged after 237 iterations.
W1_0 =
-0.3103
-0.3066
b1_0 =
0.0417
-0.3491
W2_0 =
0.1822 -0.1972
b2_0 =
0.1979
W1 =
0.7909
-0.5887
b1 =
-0.2529
-0.4091
W2 =
1.4986 -0.7943
b2 =
0.6568
iter =
237
[Figure: approximation results for Case 1.2]
Case 2.1: α = 0.4; the solution converged after 13 iterations.
W1_0 =
-0.1216
0.3600
b1_0 =
-0.0034
0.3998
W2_0 =
0.3537 0.0936
b2_0 =
0.3216
W1 =
0.8861
1.2248
b1 =
-0.1264
0.2634
W2 =
0.7618 1.0548
b2 =
0.0852
iter =
13
[Figure: approximation results for Case 2.1]
Case 2.2: α = 0.4; the solution converged after 315 iterations.
W1_0 =
0.1449
0.3180
b1_0 =
-0.2103
-0.1588
W2_0 =
0.1602 -0.1580
b2_0 =
0.0341
W1 =
0.7178
0.7196
b1 =
-0.0857
-0.1080
W2 =
1.2420 1.0103
b2 =
-0.0777
iter =
315
[Figure: approximation results for Case 2.2]
Case 3.1: α = 0.6; the solution converged after 303 iterations.
W1_0 =
0.2271
-0.1907
b1_0 =
-0.1296
0.2027
W2_0 =
0.3385 0.0681
b2_0 =
0.0466
W1 =
1.1146
0.9723
b1 =
0.1758
0.1192
W2 =
1.0237 0.7772
b2 =
0.0754
iter =
303
[Figure: approximation results for Case 3.1]
Case 3.2: α = 0.6; the solution converged after 331 iterations.
W1_0 =
-0.0551
0.1946
b1_0 =
0.4568
0.0226
W2_0 =
0.1213 0.2948
b2_0 =
0.3801
W1 =
1.3190
0.7729
b1 =
0.4491
-0.1514
W2 =
1.1491 0.6853
b2 =
0.0617
iter =
331
[Figure: approximation results for Case 3.2]
Discussion:
From the above cases, we see that:
1) Within the stable range, a higher learning rate tends to make the iteration converge faster, although the convergence speed also depends strongly on the initial conditions (compare Case 2.1 with Case 2.2).
2) When α > 0.6 the solution becomes unstable; e.g. when α = 0.8 is used, the convergence process is very erratic and takes a long time.
3) The initial weights and biases affect both the convergence speed and the final results.