COGS2010 Lab Week 6
The BackPropagation Network

 

Prior to the session:

At the lab:

 

 

Exercise 1: Randomize the weights and biases. Cycle the network on each of the input patterns in the training set and inspect the activations on the output units. For each individual, record whether that individual is classified as a Jet or a Shark by comparing the relative activation levels of the Jets and Sharks units.

Name      | Initial Classification |
Robin     |                        |
Margaret  |                        |
Bill      |                        |
Janet     |                        |
Mike      |                        |
Alfred    |                        |
Joan      |                        |
Gerry     |                        |
Catherine |                        |
Brett     |                        |
John      |                        |
Sandra    |                        |
Joshua    |                        |
Beth      |                        |
Bert      |                        |
Maria     |                        |

 

Exercise 2: Does the untrained network have a classification bias?

__________________________________________________________________________________________________________________________________________
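The comparison used in Exercises 1 and 2 can be sketched in a few lines of Python. The activation values below are invented placeholders, not output from the simulator, and the function names `classify` and `tally` are hypothetical; replace the numbers with the activations you read off the Jets and Sharks output units.

```python
from collections import Counter

def classify(jets_activation, sharks_activation):
    """Return the gang whose output unit is more active (ties count as Shark)."""
    return "Jet" if jets_activation > sharks_activation else "Shark"

def tally(activations):
    """Count how many individuals fall into each class."""
    return Counter(classify(j, s) for j, s in activations.values())

# Placeholder (Jets, Sharks) activations for three individuals; made-up numbers.
example = {
    "Robin": (0.52, 0.47),
    "Margaret": (0.55, 0.49),
    "Bill": (0.44, 0.51),
}

counts = tally(example)
print(counts)
```

A tally that is heavily lopsided before any training would suggest that the random weights happen to favour one class.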


 

 

Exercise 3:

Note: Train for 5 epochs (rather than 40) and report the results after 5 epochs.

Use the graph tool to graph the output set error. Change the maximum value that can be displayed in the graph from 1 to 10. Now, randomize the weights and biases of the Jets and Sharks network and then train it for 5 epochs. What is the error on the output set after 5 epochs?

_____________________________________________________________________

Exercise 4:

Note: Q 4 has been replaced with the following:

After the network has been trained for 5 epochs, re-test it on the patterns in the training set. Report which pattern is the most difficult to learn, and explain why.

In the following table, write down the actual classification and the output value of the winning unit for each individual.

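Name      | Correct Classification | Actual Classification | Output Value of winning unit |
Robin     | Jet                    |                       |                              |
Margaret  | Shark                  |                       |                              |
Bill      | Jet                    |                       |                              |
Janet     | Shark                  |                       |                              |
Mike      | Jet                    |                       |                              |
Alfred    | Shark                  |                       |                              |
Joan      | Jet                    |                       |                              |
Gerry     | Shark                  |                       |                              |
Catherine | Jet                    |                       |                              |
Brett     | Shark                  |                       |                              |
John      | Jet                    |                       |                              |
Sandra    | Shark                  |                       |                              |
Joshua    | Jet                    |                       |                              |
Beth      | Shark                  |                       |                              |
Bert      | Jet                    |                       |                              |
Maria     | Shark                  |                       |                              |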

_______________________________________________________________________________________________________________________________________________________________________________________________________________


To help you answer the next question, start by ticking the appropriate characteristics of each individual.

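Name      | Gang   | Age             | Education       | Marital Status  | Occupation      |
          |        | 20s | 30s | 40s | JH  | HS  | C   | S   | M   | D   | Psh | Bk  | Brg |
Robin     | Jets   |     |     |     |     |     |     |     |     |     |     |     |     |
Bill      | Jets   |     |     |     |     |     |     |     |     |     |     |     |     |
Mike      | Jets   |     |     |     |     |     |     |     |     |     |     |     |     |
Joan      | Jets   |     |     |     |     |     |     |     |     |     |     |     |     |
Catherine | Jets   |     |     |     |     |     |     |     |     |     |     |     |     |
John      | Jets   |     |     |     |     |     |     |     |     |     |     |     |     |
Joshua    | Jets   |     |     |     |     |     |     |     |     |     |     |     |     |
Bert      | Jets   |     |     |     |     |     |     |     |     |     |     |     |     |
Margaret  | Sharks |     |     |     |     |     |     |     |     |     |     |     |     |
Janet     | Sharks |     |     |     |     |     |     |     |     |     |     |     |     |
Alfred    | Sharks |     |     |     |     |     |     |     |     |     |     |     |     |
Gerry     | Sharks |     |     |     |     |     |     |     |     |     |     |     |     |
Brett     | Sharks |     |     |     |     |     |     |     |     |     |     |     |     |
Sandra    | Sharks |     |     |     |     |     |     |     |     |     |     |     |     |
Beth      | Sharks |     |     |     |     |     |     |     |     |     |     |     |     |
Maria     | Sharks |     |     |     |     |     |     |     |     |     |     |     |     |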

Using the patterns that emerge, explain why some associations may be harder to learn.

________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________

Exercise 5: Continue training the network until the total error is less than 0.04. Test the network on the patterns in the training set to confirm that it can make the correct classifications. What is the shape of the error curve (or learning curve) over the course of training?

____________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________

 

Exercise 6: Apply the perceptron learning rule to solve the AND problem for w1 = -0.5, w2 = 0.5, and θ = 1.5, where θ is the threshold. The Excel table in the lecture notes records the weights and thresholds after each step of learning, so you don't need to redo it. Use this table to work out how many steps are required to solve the AND problem. The Excel spreadsheet that generated the table is at http://www.itee.uq.edu.au/~cogs2010/CMC_ch7_BPex6.xls

Number of steps = ___________________
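For reference, the perceptron learning rule can be sketched as follows. The learning rate of 0.5, the pattern order, the strict `>` threshold test, and the choice to adapt the threshold along with the weights are all assumptions made for illustration; the lecture-notes table may use different settings, so the step count in that table is the one to report.

```python
def step(net, theta):
    """Threshold unit: fire only when the net input exceeds the threshold."""
    return 1 if net > theta else 0

def train_and(w1, w2, theta, rate=0.5, max_epochs=100):
    """Perceptron learning on AND; returns final parameters and update count."""
    patterns = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
    updates = 0
    for _ in range(max_epochs):
        errors = 0
        for (x1, x2), target in patterns:
            out = step(w1 * x1 + w2 * x2, theta)
            delta = target - out
            if delta != 0:
                w1 += rate * delta * x1
                w2 += rate * delta * x2
                theta -= rate * delta  # threshold moves opposite to the weights
                updates += 1
                errors += 1
        if errors == 0:  # a full pass with no mistakes: the problem is solved
            break
    return w1, w2, theta, updates

w1, w2, theta, updates = train_and(-0.5, 0.5, 1.5)
outputs = [step(w1 * a + w2 * b, theta) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]]
print(outputs)
```

The number of updates depends on the assumed learning rate and pattern order, so it need not match the table in the lecture notes.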

 

Exercise 7: Assuming θ = 1, find a set of weights which solves the 3D version of the XOR problem.

Weight 1 = ______________________

Weight 2 = ______________________

Weight 3 = ______________________
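A candidate weight set can be checked mechanically before the demonstration table is filled in. The sketch below assumes a single threshold unit that outputs 1 when the net input is strictly greater than the threshold of 1, and uses the four input patterns from the table; the function name `check_weights` is hypothetical. The weights in the example call are deliberately not a solution, so the exercise stays open.

```python
THETA = 1.0

# (input 1, input 2, input 3) -> XOR target; input 3 is input 1 AND input 2.
PATTERNS = [((1, 1, 1), 0), ((1, 0, 0), 1), ((0, 1, 0), 1), ((0, 0, 0), 0)]

def check_weights(weights, theta=THETA):
    """Print net input and output per pattern; return True if all match."""
    all_correct = True
    for inputs, target in PATTERNS:
        net = sum(w * x for w, x in zip(weights, inputs))
        out = 1 if net > theta else 0
        print(inputs, "net =", net, "output =", out, "target =", target)
        all_correct = all_correct and out == target
    return all_correct

# Candidate that ignores the third input; it does not solve the problem.
solved = check_weights((1, 1, 0))
print("solves 3D XOR:", solved)
```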

Demonstrate that this works:

Input 1 | Input 2 | Input 3 | W1 | W2 | W3 | Net Input | Output |
1       | 1       | 1       |    |    |    |           |        |
1       | 0       | 0       |    |    |    |           |        |
0       | 1       | 0       |    |    |    |           |        |
0       | 0       | 0       |    |    |    |           |        |

 

 

Exercises 8 and 9: Omit

Also omit the derivation of the BackProp learning rule.

 

 

Exercise 10: To randomize the network weights and biases, select the "Randomize Weights and Biases" button. Now train the network for 200 epochs and record the results under the heading "Simulation 1" in the table below. Record the weight values w1 and w2, the bias values for the hidden and output units, the final error, and whether the network finds a global or local minimum.

Simulation | Minima | Error | w1 | w2 | bias1 | bias2 |
1.         |        |       |    |    |       |       |


Finally, describe how your network solved the copy task in this simulation.

______________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________

 

 

 

Exercise 11: For the copy task, it turns out that there are two global minima. Repeat the exercise (possibly more than once) to find an alternative solution. Record the results for the second solution in your simulation table.

Simulation | Minima | Error | w1 | w2 | bias1 | bias2 |
2.         |        |       |    |    |       |       |

Exercise 12: Set the biases on the hidden and output units to zero, and then select BiasUnfrozen for these units. By freezing the biases on the hidden and output units at zero, there are only two weights that can change in this network. It is possible to calculate the range of possible solutions by testing the networks that result from varying those two weights. Visualising the possible networks as a landscape shows immediately where the local and global minima are; see the lecture notes (the Excel spreadsheet that calculates this graph is at http://www.itee.uq.edu.au/~cogs2010/CMC_ch7_BPex12.xls ). By setting the weights accordingly, rerun the network to find the local and global minima in the frozen-bias version of the 1:1:1 network. Record your results in the table as Simulations 3 and 4.
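The idea of testing the networks that result from varying the two weights can be sketched in Python as an alternative to the spreadsheet. This is an illustration under stated assumptions, not the spreadsheet's own calculation: it assumes the copy task consists of the two patterns 0 -> 0 and 1 -> 1, a logistic sigmoid on both units, summed squared error, and an integer grid of weights from -10 to 10. The name `copy_error` is hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def copy_error(w1, w2, bias1=0.0, bias2=0.0):
    """Summed squared error of the 1:1:1 network on the copy task."""
    total = 0.0
    for x in (0.0, 1.0):  # assumed patterns: 0 -> 0 and 1 -> 1
        hidden = sigmoid(w1 * x + bias1)
        output = sigmoid(w2 * hidden + bias2)
        total += (x - output) ** 2
    return total

# Evaluate the error at every integer weight pair from -10 to 10,
# with both biases held at zero.
grid = range(-10, 11)
landscape = {(w1, w2): copy_error(w1, w2) for w1 in grid for w2 in grid}

best = min(landscape, key=landscape.get)
print("lowest grid error", round(landscape[best], 4), "at (w1, w2) =", best)
```

Plotting `landscape` as a surface, or simply scanning it for regions where the error dips, gives starting weights to try in the simulator.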

 

Simulation | Minima | Error | w1 | w2 | bias1 | bias2 |
3.         |        |       |    |    |       |       |
4.         |        |       |    |    |       |       |

Exercise 13: Randomize the weights and biases, and record the output for each of the input patterns.

 

Input Unit 1 | Input Unit 2 | Output |
0            | 0            |        |
0            | 1            |        |
1            | 0            |        |
1            | 1            |        |

 

Explain why the output and hidden unit activations are all so similar. Hint: In your explanation, refer to the strengths of the connection weights and how they affect the dynamics of the sigmoid activation function.

 

______________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
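To experiment with the hint outside the simulator, the forward pass of a 2:2:1 sigmoid network with freshly randomized weights can be sketched as below. The weight range of plus or minus 0.3 and the seed are assumptions chosen for illustration; the simulator's own randomization range may differ.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def random_weight(limit=0.3):
    """Small random value; the +/-0.3 range is an assumption."""
    return random.uniform(-limit, limit)

random.seed(0)
# 2:2:1 network: each hidden unit has two input weights and a bias.
hidden_units = [([random_weight(), random_weight()], random_weight()) for _ in range(2)]
output_weights = [random_weight(), random_weight()]
output_bias = random_weight()

outputs = []
for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    hidden = [sigmoid(w[0] * x1 + w[1] * x2 + b) for w, b in hidden_units]
    net = sum(w * h for w, h in zip(output_weights, hidden)) + output_bias
    outputs.append(sigmoid(net))
    # Show the pattern, the hidden activations, and the output activation.
    print((x1, x2), [round(h, 3) for h in hidden], round(outputs[-1], 3))
```

Rerunning with a larger `limit` shows how the spread of activations changes with the size of the weights.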

 

Exercise 14: Create a graph of the total error and train the network for 100 epochs. What is the Total Summed Squared (TSS) error after 100 epochs?

__________________________
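As a reminder of what the graph is plotting, the TSS can be computed by hand as sketched below. This assumes the error is the plain sum of squared differences over all patterns and output units; some texts include a factor of one half, so check the lecture notes for the definition used by the simulator. The outputs are made-up placeholder values.

```python
def total_summed_squared_error(targets, outputs):
    """Sum of squared differences over every pattern and output unit."""
    return sum(
        (t - o) ** 2
        for target_row, output_row in zip(targets, outputs)
        for t, o in zip(target_row, output_row)
    )

# XOR targets and placeholder network outputs (made-up numbers).
targets = [[0], [1], [1], [0]]
outputs = [[0.5], [0.5], [0.5], [0.5]]
print(total_summed_squared_error(targets, outputs))
```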

Continue training the network in 100 epoch steps until the TSS is less than 0.05. Describe the learning curve of the network over the course of training and record the total number of epochs. Note that it is possible that during training your network will get stuck in a local minimum. If this happens, re-randomize the weights and start again.

______________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________

Exercise 15: Record the output and hidden unit activations of the trained network for each of the input patterns.

Input Unit 1 | Input Unit 2 | Hidden 1 | Hidden 2 | Output |
0            | 0            |          |          |        |
0            | 1            |          |          |        |
1            | 0            |          |          |        |
1            | 1            |          |          |        |

 

How has the network internally represented the XOR problem?

_________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________


Exercise 16: Try retraining the network with a learning rate of 1.0. Do you see an appreciable change in the speed of learning?

__________________________________________________________________________________________________________________________________________

What are some possible bad side effects of increasing the learning rate?

_________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
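The effect of the step size can be explored on a toy problem. The sketch below runs plain gradient descent on a one-dimensional quadratic error, E(w) = 0.5 * curvature * w^2, which is an assumed stand-in and not the XOR error surface; the curvature of 2.5 and the function name `descend` are arbitrary choices for illustration.

```python
def descend(rate, steps=10, start=1.0, curvature=2.5):
    """Gradient descent on E(w) = 0.5 * curvature * w**2; returns the path of w."""
    w = start
    path = [w]
    for _ in range(steps):
        gradient = curvature * w  # dE/dw for the quadratic error
        w -= rate * gradient
        path.append(w)
    return path

for rate in (0.25, 1.0):
    print(rate, [round(w, 3) for w in descend(rate, steps=5)])
```

Comparing the two printed paths shows how the same surface behaves under a small and a large learning rate.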

Exercise 17: Now try retraining the network after setting the momentum to zero (and the learning rate back to 0.25). You will notice that without momentum, BackProp is much slower to solve the XOR problem. What feature(s) of the error surface might explain this?

___________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
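The mechanics of the momentum term can be sketched as follows. The update used here is the common form delta_w(t) = -rate * gradient + momentum * delta_w(t-1); the momentum value of 0.9 and the small constant gradient are assumptions chosen for illustration, not settings taken from the simulator.

```python
def run(rate, momentum, gradient=-0.01, steps=50):
    """Apply delta_w = -rate * gradient + momentum * previous delta_w."""
    w, delta = 0.0, 0.0
    for _ in range(steps):
        delta = -rate * gradient + momentum * delta
        w += delta
    return w

# Distance travelled in 50 steps under the same constant gradient.
print("no momentum: ", round(run(0.25, 0.0), 4))
print("momentum 0.9:", round(run(0.25, 0.9), 4))
```

Trying other gradient sizes shows where the momentum term makes the largest difference to the distance covered.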