Template code is here.

1. Run backpropagation on the XOR problem (xorbackprop.m), using one hidden layer with 2 units. You'll see that it learns the task correctly sometimes and fails other times. Look at the two hidden features it learns, meaning the output activations a{1} for the four possible test stimuli. Try to find a pattern for what these features look like when the model succeeds versus when it fails. Keep in mind that the model can solve the task only if the correct output can be expressed as a linear combination of these features (via the final layer of weights).

- Loop through the test stimuli.
- Calculate network activation using the current stimulus.
- Get the output activations of the hidden units:
    - Use the augmented stimulus matrix and the first layer of weights to get the input activations of the hidden layer.
    - Use the tanh transfer function to get output activations for the hidden layer (these are the feature values).
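These steps can be sketched in Python/NumPy (a minimal illustration, not the template's code; the weights `W1` here are hand-picked rather than learned, chosen so the two hidden features support XOR):

```python
import numpy as np

# Four test stimuli for XOR, one per row.
stimuli = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)

# Hypothetical first-layer weights (2 hidden units x 3 inputs; the last
# column is the bias weight). Hidden unit 1 acts roughly like OR of the
# two cues, hidden unit 2 roughly like AND.
W1 = np.array([[5.0, 5.0, -2.5],
               [5.0, 5.0, -7.5]])

for s in stimuli:                    # loop through the test stimuli
    s_aug = np.append(s, 1.0)        # augmented stimulus (bias input = 1)
    v_hidden = W1 @ s_aug            # input activations of the hidden layer
    a_hidden = np.tanh(v_hidden)     # output activations (the feature values)
    print(s, np.round(a_hidden, 2))
```

With features like these, XOR can be expressed as a linear combination (roughly OR minus AND) via the final layer of weights.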

2. Simulate the backprop model on some other interesting task. Define a set of stimuli (each a vector of cue values) and corresponding target outputs (either a single number for each stimulus, or a vector). Explore the number of hidden units, learning rate, and number of training trials to see what values lead to successful learning.

If you can't think of an interesting task, here's a suggestion: have the network identify the ten digits from the line segments that make them up, using the stimulus and teaching matrices below.

Plan out how you'd modify the model code to model your new task.

1 1 1 1 0 0 1 1 1 1; ... %top-right vertical line
1 0 1 1 1 1 1 1 1 1; ... %bottom-right vertical line
0 1 1 0 1 1 0 1 1 1; ... %bottom horizontal line
0 1 0 0 0 1 0 1 0 1; ... %bottom-left vertical line
0 0 0 1 1 1 0 1 1 1; ... %top-left vertical line
0 1 1 1 1 1 0 1 1 0]; %middle horizontal line

teachingVals = eye(10); %for each stimulus, the correct response vector has a 1 in the entry for that stimulus and zeros elsewhere

Create a sequence of stimulus IDs, indicating which stimulus will be shown on each trial

stimSeq = randi(nstim,n,1); %random sequence of n stimuli

You'll also need to update the line that defines the numbers of units in the input and output layers, so that these numbers are based on the stimulus and feedback matrices you defined
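For instance, the layer sizes can be read off the matrices' shapes (a Python/NumPy sketch with placeholder data; the variable names mirror the MATLAB fragments above):

```python
import numpy as np

stimVals = np.zeros((7, 10))     # placeholder: one column per stimulus
teachingVals = np.eye(10)        # one column (one-hot target) per stimulus

ninput, nstim = stimVals.shape   # input-layer size, number of stimuli
noutput = teachingVals.shape[0]  # output-layer size
print(ninput, nstim, noutput)
```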

On every trial, use the stimulus ID for this trial, the stimulus matrix, and the feedback matrix to determine the input to the network and the correct output (teaching signal)

T = teachingVals(:,stimSeq(i)); %teaching signal for this trial
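In Python/NumPy, the per-trial selection looks like this (a sketch with placeholder data; column j of each matrix belongs to stimulus j):

```python
import numpy as np

rng = np.random.default_rng(0)
nstim, n = 10, 1000
stimVals = rng.random((7, nstim))          # placeholder stimulus matrix
teachingVals = np.eye(nstim)

stimSeq = rng.integers(0, nstim, size=n)   # random sequence of n stimuli
for i in range(n):
    stim = stimVals[:, stimSeq[i]]         # network input for this trial
    T = teachingVals[:, stimSeq[i]]        # teaching signal for this trial
    # ... forward pass and backprop update would go here ...
```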

Use the input and teaching signal to calculate network activation and to learn by back-propagation

Delta = backprop(stim,v,a,W,T); %get weight updates by backpropagation
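The template's backprop.m isn't reproduced here, but the quantity it computes can be sketched for a two-layer tanh network (a Python/NumPy sketch; the function names, signatures, and squared-error loss are assumptions for illustration):

```python
import numpy as np

def forward(stim, W1, W2):
    """Forward pass; returns input activations v and output activations a per layer."""
    v1 = W1 @ stim
    a1 = np.tanh(v1)
    v2 = W2 @ a1
    a2 = np.tanh(v2)
    return (v1, v2), (a1, a2)

def backprop_step(stim, v, a, W2, T):
    """Gradients of the squared error 0.5*||a2 - T||^2 for both weight matrices."""
    v1, v2 = v
    a1, a2 = a
    delta2 = (a2 - T) * (1.0 - np.tanh(v2) ** 2)          # error signal at the output layer
    delta1 = (W2.T @ delta2) * (1.0 - np.tanh(v1) ** 2)   # error backpropagated to hidden layer
    return np.outer(delta1, stim), np.outer(delta2, a1)   # dW1, dW2

# One gradient-descent update with a hypothetical learning rate of 0.1:
rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(2, 3)), rng.normal(size=(1, 2))
stim, T = np.array([1.0, 0.0, 1.0]), np.array([1.0])      # augmented stimulus, target
v, a = forward(stim, W1, W2)
dW1, dW2 = backprop_step(stim, v, a, W2, T)
W1 -= 0.1 * dW1
W2 -= 0.1 * dW2
```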

Change the code that tests the model's learning, to use a metric appropriate for your new task
**(this goes outside the loop over trials)**

...

for j=1:nstim %loop through test trials; will test model's performance on every stimulus type

perf(i,j) = sum((v{M}-teachingVals(:,j)).^2); %sum squared error: activation at output layer vs. correct response

...

plot(perf) %shows one learning curve (error as a function of trials) for each stimulus; perfect learning is indicated by all curves reaching zero

axis([0 n 0 2]) %fix vertical scaling so that convergence to zero is visible
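The test metric itself can be sketched as follows (Python/NumPy; the outputs here are placeholders standing in for the activations the network would actually produce at test):

```python
import numpy as np

nstim = 10
teachingVals = np.eye(nstim)
# Placeholder output activations, one column per test stimulus; a trained
# network's output-layer activations would go here instead.
outputs = 0.9 * np.eye(nstim)

perf = np.zeros(nstim)
for j in range(nstim):                 # test the model on every stimulus type
    # sum squared error: output-layer activation vs. correct response
    perf[j] = np.sum((outputs[:, j] - teachingVals[:, j]) ** 2)
print(perf)                            # perfect learning would give all zeros
```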

3. Change the model to use rectified linear units, meaning *f*_{act}(*v*) = max{*v*,0}. Test the model on either XOR or your new task, and see whether it performs better or worse than with the tanh activation function.

for i=1:length(v{m})

Change backprop.m to use the derivative of the rectified-linear function in place of the derivative of tanh
The derivative *f*′_{act}(*v*) is 1 when *v* > 0, and 0 when *v* < 0 (hint: this means the derivative can be written using a Boolean expression)
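The hint can be checked directly (a Python/NumPy sketch; in MATLAB the same Boolean trick is simply (v > 0)):

```python
import numpy as np

def relu(v):
    return np.maximum(v, 0.0)            # f_act(v) = max{v, 0}

def relu_deriv(v):
    # (v > 0) is a Boolean array: True (1) where v > 0, False (0) elsewhere,
    # which is exactly the derivative of the rectified-linear function.
    return (v > 0).astype(float)

v = np.array([-2.0, -0.5, 0.5, 3.0])
print(relu(v))        # [0.  0.  0.5 3. ]
print(relu_deriv(v))  # [0. 0. 1. 1.]
```

In backprop.m, this derivative replaces the tanh-derivative term wherever it appears (the exact variable names depend on the template code).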