Prior to the session:
At the lab:
Exercise 1: Randomize the weights and biases. Cycle the network on each
of the input patterns in the training set and inspect the activations on the
output units. For each individual, record whether that individual is classified
as a Jet or a Shark by comparing the relative activation levels of the Jets and
Sharks units.
| Name | Initial Classification |
| Robin | |
| Margaret | |
| Bill | |
| Janet | |
| Mike | |
| Alfred | |
| Joan | |
| Gerry | |
| Catherine | |
| Brett | |
| John | |
| Sandra | |
| Joshua | |
| Beth | |
| Bert | |
| Maria | |
Exercise 2: Does the untrained network have a classification bias?
__________________________________________________________________________________________________________________________________________
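The comparison described in Exercise 1 and the tally needed for Exercise 2 can be sketched in a few lines of Python. This is only an illustration: the activation values below are made-up placeholders, not results from the simulator, and the names `classify`, `activations`, and `tally` are hypothetical. Substitute the activations you actually record.

```python
from collections import Counter

def classify(jets_activation, sharks_activation):
    """Return the gang whose output unit is more active (ties go to Shark)."""
    return "Jet" if jets_activation > sharks_activation else "Shark"

# Hypothetical (Jets unit, Sharks unit) activations; NOT real simulator output.
activations = {
    "Robin": (0.62, 0.48),
    "Margaret": (0.58, 0.51),
    "Bill": (0.44, 0.47),
}

# Classify each individual, then count how often each gang wins.
labels = {name: classify(jets, sharks) for name, (jets, sharks) in activations.items()}
tally = Counter(labels.values())
print(labels)
print(tally)
```

A tally that is heavily skewed towards one gang before any training would suggest a classification bias in the randomized network.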
Exercise 3:
Note: Train for 5 epochs (rather than 40) and report the results after 5
epochs.
Use the graph tool to graph the output set error. Change the maximum
value that can be displayed in the graph from 1 to 10. Now, randomize the weights
and biases of the Jets and Sharks network and then train it for 5 epochs. What
is the error on the output set after 5 epochs?
_____________________________________________________________________
Exercise 4:
Note: Q 4 has been replaced with the following:
After the network has been trained for 5 epochs, re-test it on the
patterns in the training set. Report which is the most
difficult pattern to learn, and explain why.
In the following table, write down the actual classification and the output
value of the winning unit for each individual.
| Name | Correct Classification | Actual Classification | Output Value of winning unit |
| Robin | Jet | | |
| Margaret | Shark | | |
| Bill | Jet | | |
| Janet | Shark | | |
| Mike | Jet | | |
| Alfred | Shark | | |
| Joan | Jet | | |
| Gerry | Shark | | |
| Catherine | Jet | | |
| Brett | Shark | | |
| John | Jet | | |
| Sandra | Shark | | |
| Joshua | Jet | | |
| Beth | Shark | | |
| Bert | Jet | | |
| Maria | Shark | | |
_______________________________________________________________________________________________________________________________________________________________________________________________________________
To help you answer the next question, start by ticking the appropriate
characteristics of each individual.
| Name | Gang | Age | | | Education | | | Marital Status | | | Occupation | | |
| | | 20s | 30s | 40s | JH | HS | C | S | M | D | Psh | Bk | Brg |
| Robin | Jets | | | | | | | | | | | | |
| Bill | Jets | | | | | | | | | | | | |
| Mike | Jets | | | | | | | | | | | | |
| Joan | Jets | | | | | | | | | | | | |
| Catherine | Jets | | | | | | | | | | | | |
| John | Jets | | | | | | | | | | | | |
| Joshua | Jets | | | | | | | | | | | | |
| Bert | Jets | | | | | | | | | | | | |
| Margaret | Sharks | | | | | | | | | | | | |
| Janet | Sharks | | | | | | | | | | | | |
| Alfred | Sharks | | | | | | | | | | | | |
| Gerry | Sharks | | | | | | | | | | | | |
| Brett | Sharks | | | | | | | | | | | | |
| Sandra | Sharks | | | | | | | | | | | | |
| Beth | Sharks | | | | | | | | | | | | |
| Maria | Sharks | | | | | | | | | | | | |
Using the patterns that
emerge, explain why some associations may be harder to learn.
________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
Exercise 5: Continue training the network until the total error is less
than 0.04. Test the network on the patterns in the training set to confirm that
it can make the correct classifications. What is the shape of the error curve
(or learning curve) over the course of training?
____________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
Exercise 6: Apply the perceptron learning rule to solve the AND problem for
w1 = -0.5, w2 = 0.5, and threshold θ = 1.5. The Excel table in the lecture
notes records the weights and thresholds after each step of learning. You don’t
need to redo it. Use this table to work out how many steps are required to
solve the AND problem. The Excel spreadsheet that generated the table is at
http://www.itee.uq.edu.au/~cogs2010/CMC_ch7_BPex6.xls
Number of steps =
___________________
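For readers who want to see the perceptron learning rule in action, here is a minimal Python sketch starting from the values given in Exercise 6, treating 1.5 as the unit's initial threshold. Several details are assumptions rather than facts from the lecture notes: the learning rate of 0.5, the strict "greater than" firing rule, the order in which the patterns are presented, and the way the threshold is updated. The function names `step` and `train_and` are hypothetical.

```python
def step(net, theta):
    """Threshold unit: fires only when the net input exceeds theta (assumed strict)."""
    return 1 if net > theta else 0

def train_and(w1=-0.5, w2=0.5, theta=1.5, rate=0.5, max_epochs=100):
    """Perceptron learning on AND; returns final weights, threshold, update count."""
    patterns = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
    updates = 0
    for _ in range(max_epochs):
        errors = 0
        for (x1, x2), target in patterns:
            delta = target - step(w1 * x1 + w2 * x2, theta)
            if delta != 0:
                # Move the weights towards the target and the threshold the
                # opposite way (assumed form of the rule).
                w1 += rate * delta * x1
                w2 += rate * delta * x2
                theta -= rate * delta
                updates += 1
                errors += 1
        if errors == 0:  # a full pass with no mistakes: AND is solved
            break
    return w1, w2, theta, updates

w1, w2, theta, updates = train_and()
print(w1, w2, theta, updates)
```

Because the step count depends on the learning rate and presentation order assumed here, it may not match the table in the lecture notes, which remains the reference for this exercise.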
Exercise 7: Assuming threshold θ = 1, find a set of weights which solves the
3D version of the XOR problem.
Weight 1 = ______________________
Weight 2 = ______________________
Weight 3 = ______________________
Demonstrate that this works:
| Input 1 | Input 2 | Input 3 | W1 | W2 | W3 | Net Input | Output |
| 1 | 1 | 1 | | | | | |
| 1 | 0 | 0 | | | | | |
| 0 | 1 | 0 | | | | | |
| 0 | 0 | 0 | | | | | |
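The Net Input and Output columns of the table can be computed with a small helper like the one sketched below. It assumes a threshold unit that fires when the net input exceeds the threshold of 1; the exact firing rule is an assumption, so check it against the lecture notes. The weights in `candidate` are arbitrary placeholders, not a solution to the exercise, and `threshold_unit` is a hypothetical name.

```python
def threshold_unit(inputs, weights, theta=1.0):
    """Return (net input, output) for a single threshold unit."""
    net = sum(x * w for x, w in zip(inputs, weights))
    return net, (1 if net > theta else 0)  # strict comparison is an assumption

candidate = (0.5, 0.5, 0.5)  # arbitrary placeholder weights, NOT a solution
rows = [(1, 1, 1), (1, 0, 0), (0, 1, 0), (0, 0, 0)]

# Print one table row per input pattern: inputs, weights, net input, output.
for row in rows:
    net, out = threshold_unit(row, candidate)
    print(row, candidate, net, out)
```

Replace `candidate` with your own weights and compare the printed outputs against the outputs you expect for each input pattern.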
Exercises 8 and 9: Omit.
Also omit the derivation of the BackProp learning rule.
Exercise 10: To randomize the network weights and biases, select the
"Randomize Weights and Biases" button. Now train the network for 200
epochs and record the results under the heading "Simulation 1" in the
table below. Record the weight values w1 and w2, the bias values for the hidden
and output units, the final error, and whether the network finds a global or
local minimum.
| Simulation | Minima | Error | w1 | w2 | bias1 | bias2 |
| 1. | | | | | | |
Finally, describe how your network solved the copy task in this
simulation.
______________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
Exercise 11: For the copy task, it turns out that there are two global
minima. Repeat the exercise (possibly more than once) to find an alternative
solution. Record the results for the second solution in your simulation table.
| Simulation | Minima | Error | w1 | w2 | bias1 | bias2 |
| 2. | | | | | | |
Exercise 12: Set the biases on the hidden and output units to zero, and then
select BiasUnfrozen for these units. With the biases on the hidden and output
units frozen at zero, only two weights can change in this network. It is
possible to calculate the range of possible solutions by testing the networks
that result from varying those two weights. Visualising the possible networks
as a landscape shows immediately where the local and global minima are. See the
lecture notes. (The Excel spreadsheet to calculate this graph is at
http://www.itee.uq.edu.au/~cogs2010/CMC_ch7_BPex12.xls ) Set the weights
accordingly and rerun the simulation to find the local and global minima in the
frozen-bias version of the 1:1:1 network. Record your results in the table as
Simulations 3 and 4.
| Simulation | Minima | Error | w1 | w2 | bias1 | bias2 |
| 3. | | | | | | |
| 4. | | | | | | |
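The error landscape for the frozen-bias 1:1:1 network can also be approximated with a short Python sketch instead of the spreadsheet. The sketch rests on assumptions not stated in this worksheet: sigmoid activations on the hidden and output units, a copy task consisting of the two patterns 0 → 0 and 1 → 1, and summed squared error as the measure. The names `copy_error` and `grid` are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def copy_error(w1, w2):
    """Summed squared error of a 1:1:1 network with both biases fixed at zero."""
    total = 0.0
    for x, target in ((0.0, 0.0), (1.0, 1.0)):  # assumed copy-task patterns
        hidden = sigmoid(w1 * x)          # input -> hidden weight w1, bias 0
        output = sigmoid(w2 * hidden)     # hidden -> output weight w2, bias 0
        total += (target - output) ** 2
    return total

# Coarse grid over both weights, from -10 to 10 in steps of 1.
grid = [(copy_error(w1, w2), w1, w2)
        for w1 in range(-10, 11) for w2 in range(-10, 11)]
best_error, best_w1, best_w2 = min(grid)
print(best_error, best_w1, best_w2)
```

Scanning the grid for regions where the error is low but not lowest gives a rough idea of where training might settle in a local rather than a global minimum.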
Exercise 13: Randomize the weights and biases, and
record the output for each of the input patterns.
| Input Unit 1 | Input Unit 2 | Output |
| 0 | 0 | |
| 0 | 1 | |
| 1 | 0 | |
| 1 | 1 | |
Explain why the output and hidden unit activations are all so similar. Hint: In
your explanation, refer to the strengths of the connection weights and how they
affect the dynamics of the sigmoid activation function.
______________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
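As background for the hint, the following sketch evaluates the standard logistic sigmoid at a few net input values. It assumes the simulator uses this form of the sigmoid and that freshly randomized weights are small, so that net inputs stay close to zero. Both points are assumptions to check against the lecture notes.

```python
import math

def sigmoid(x):
    """Standard logistic function, assumed to be the unit activation function."""
    return 1.0 / (1.0 + math.exp(-x))

# Small net inputs (as small weights would produce) versus one large net input.
for net in (-0.2, -0.1, 0.0, 0.1, 0.2, 5.0):
    print(net, round(sigmoid(net), 3))
```

Compare how little the activation changes across the small net inputs with how far it moves for the large one.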
Exercise 14: Create a graph of the total error and train the network for
100 epochs. What is the Total Summed Squared (TSS) error after 100 epochs?
__________________________
Continue training the network in 100 epoch steps until the TSS is less than
0.05. Describe the learning curve of the network over the course of training
and record the total number of epochs. Note that during training your network
may get stuck in a local minimum. If this happens, re-randomize the weights and
start again.
______________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
Exercise 15: Record the output and hidden unit activations of the
trained network for each of the input patterns.
| Input Unit 1 | Input Unit 2 | Hidden 1 | Hidden 2 | Output |
| 0 | 0 | | | |
| 0 | 1 | | | |
| 1 | 0 | | | |
| 1 | 1 | | | |
How has the network internally represented the XOR problem?
_________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
Exercise 16: Try retraining the network with a learning rate of 1.0. Do
you see an appreciable change in the speed of learning?
__________________________________________________________________________________________________________________________________________
What are some possible bad side effects of increasing the learning rate?
_________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
Exercise 17: Now try retraining the network after setting the momentum
to zero (and the learning rate back to 0.25). You will notice that without
momentum, BackProp is much slower to solve the XOR
problem. What feature(s) of the error surface might explain this?
___________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
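To experiment with the learning rate and momentum settings from Exercises 16 and 17 outside the simulator, a minimal backpropagation sketch for a 2:2:1 XOR network is given below. It is not the simulator's implementation. The momentum value of 0.9, the batch (once per epoch) weight update, the initial weight range, and the name `train_xor` are all assumptions made for illustration. Only the learning rate of 0.25 comes from the exercise text.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def train_xor(rate=0.25, momentum=0.9, epochs=1000, seed=0):
    """Train a 2:2:1 sigmoid network on XOR; return the TSS error per epoch."""
    rng = random.Random(seed)
    # w[0:3], w[3:6]: weights and bias of hidden units 1 and 2; w[6:9]: output unit.
    w = [rng.uniform(-1, 1) for _ in range(9)]
    prev = [0.0] * 9  # previous weight changes, reused by the momentum term
    history = []
    for _ in range(epochs):
        grad = [0.0] * 9
        tss = 0.0
        for (x1, x2), t in XOR:
            h1 = sigmoid(w[0] * x1 + w[1] * x2 + w[2])
            h2 = sigmoid(w[3] * x1 + w[4] * x2 + w[5])
            y = sigmoid(w[6] * h1 + w[7] * h2 + w[8])
            tss += (t - y) ** 2
            # Deltas for the gradient of half the squared error.
            d_out = (y - t) * y * (1 - y)
            d_h1 = d_out * w[6] * h1 * (1 - h1)
            d_h2 = d_out * w[7] * h2 * (1 - h2)
            for i, g in enumerate((d_h1 * x1, d_h1 * x2, d_h1,
                                   d_h2 * x1, d_h2 * x2, d_h2,
                                   d_out * h1, d_out * h2, d_out)):
                grad[i] += g
        history.append(tss)
        for i in range(9):
            # Gradient step plus a fraction of the previous change (momentum).
            prev[i] = -rate * grad[i] + momentum * prev[i]
            w[i] += prev[i]
    return history

errors = train_xor()
print(errors[0], errors[-1])
```

Calling `train_xor(momentum=0.0)` or `train_xor(rate=1.0)` and comparing the returned error histories gives a rough analogue of the comparisons asked for above, although the numbers will not match the simulator's.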