Template code is here.
1. Run backpropagation on the XOR problem (xorbackprop.m), using one hidden layer with 2 units. You'll see that it learns the task correctly sometimes and fails other times. Look at the two hidden features it learns, meaning the output activations a{1} for the four possible test stimuli. Try to find a pattern for what these features look like when the model succeeds versus when it fails. Keep in mind that the model can solve the task only if the correct output can be expressed as a linear combination of these features (via the final layer of weights).
Loop through the test stimuli
Calculate network activation using the current stimulus
Get the output activations of the hidden units
Use the augmented stimulus matrix and the first layer of weights to get the input activations of the hidden layer
Use the tanh transfer function to get output activations for the hidden layer (these are the feature values)
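These steps can be sketched as follows. This is only a sketch: the variable names testStim and the assumption that the template stores the first layer of weights as W{1}, with a bias row appended to the stimuli, are guesses about the template code, not part of it.

```matlab
testStim = [0 0 1 1; 0 1 0 1];    %four XOR input patterns, one per column (assumed name)
augStim = [testStim; ones(1,4)];  %augmented stimulus matrix: add a bias input of 1 per stimulus
vHidden = W{1} * augStim;         %input activations of the hidden layer
aHidden = tanh(vHidden);          %output activations of the hidden units: the learned features
disp(aHidden)                     %one row per hidden unit, one column per test stimulus
```

Comparing the columns of aHidden across successful and failed runs is one way to look for the pattern the exercise asks about.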
2. Simulate the backprop model on some other interesting task. Define a set of stimuli (each a vector of cue values) and corresponding target outputs (either a single number for each stimulus, or a vector). Explore the number of hidden units, learning rate, and number of training trials to see what values lead to successful learning.
If you can't think of an interesting task, here's a suggestion: teach the network to identify digits (1-9 and 0) from the line segments of a seven-segment numeric display, using the stimulus and teaching matrices below.
Plan out how you'd modify the model code to implement your new task.
stimVals = [0 1 1 0 1 1 1 1 1 1; ... %top horizontal line
            1 1 1 1 0 0 1 1 1 1; ... %top-right vertical line
            1 0 1 1 1 1 1 1 1 1; ... %bottom-right vertical line
            0 1 1 0 1 1 0 1 1 1; ... %bottom horizontal line
            0 1 0 0 0 1 0 1 0 1; ... %bottom-left vertical line
            0 0 0 1 1 1 0 1 1 1; ... %top-left vertical line
            0 1 1 1 1 1 0 1 1 0]; %middle horizontal line; one column per digit (1-9, then 0), one row per segment
teachingVals = eye(10); %each stimulus's correct response vector has a 1 in the entry for that stimulus and 0 elsewhere
Create a sequence of stimulus IDs, indicating which stimulus will be shown on each trial
stimSeq = randi(nstim,n,1); %random sequence of n stimuli
You'll also need to update the line that defines the numbers of units in the input and output layers, so that these numbers are based on the stimulus and feedback matrices you defined
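For example, the layer sizes can be derived from the task definition so that the network adapts automatically if you change the stimuli or targets. The names stimVals, ninput, and noutput are assumptions here; match them to whatever names the template actually uses:

```matlab
ninput  = size(stimVals, 1);      %number of input units = features per stimulus (stimVals is an assumed name)
noutput = size(teachingVals, 1);  %number of output units = entries per target vector
nstim   = size(stimVals, 2);      %number of distinct stimuli
```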
On every trial, use the stimulus ID for this trial, the stimulus matrix, and the feedback matrix to determine the input to the network and the correct output (teaching signal)
T = teachingVals(:,stimSeq(i)); %teaching signal for this trial
Use the input and teaching signal to calculate network activation and to learn by back-propagation
Delta = backprop(stim,v,a,W,T); %get weight updates by backpropagation
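Putting these pieces together, the core of the training loop might look like the sketch below. The forward-pass helper activation, the learning rate epsilon, the number of layers M, and the matrix name stimVals are assumptions about the template, not guaranteed names:

```matlab
for i = 1:n
    stim = stimVals(:, stimSeq(i));     %input vector for this trial (stimVals is an assumed name)
    T = teachingVals(:, stimSeq(i));    %teaching signal for this trial
    [v, a] = activation(stim, W);       %forward pass (assumed helper from the template)
    Delta = backprop(stim, v, a, W, T); %get weight updates by backpropagation
    for m = 1:M                         %apply updates to each layer, scaled by the learning rate
        W{m} = W{m} + epsilon*Delta{m};
    end
end
```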
Change the code that tests the model's learning to use a metric appropriate for your new task
...
for j=1:nstim %loop through test trials; will test model's performance on every stimulus type
perf(i,j) = sum((v{M}-teachingVals(:,j)).^2); %sum squared error: activation at output layer vs. correct response
...
plot(perf) %shows one learning curve (error as a function of trials) for each stimulus; perfect learning is indicated by all curves reaching zero
axis([0 n 0 2]) %fix vertical scaling so that convergence to zero is visible
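For a classification task like the digit example, proportion correct can be a more interpretable metric than summed squared error. A sketch, assuming a forward-pass helper activation and the stimulus matrix stimVals (both assumed names):

```matlab
ncorrect = 0;
for j = 1:nstim
    [v, a] = activation(stimVals(:,j), W); %forward pass for stimulus j (assumed helper)
    [~, choice] = max(v{M});               %network's answer: output unit with highest activation
    [~, target] = max(teachingVals(:,j));  %correct answer for this stimulus
    ncorrect = ncorrect + (choice == target);
end
fprintf('Proportion correct: %.2f\n', ncorrect/nstim);
```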
3. Change the model to use rectified linear units, meaning fact(v) = max(v,0). Test the model on either XOR or your new task, and see whether it performs better or worse than with the tanh activation function.
for i=1:length(v{m})
Change backprop.m to use the derivative of the rectified-linear function in place of the derivative of tanh
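The two changes can be sketched as follows. The loop structure comes from the fragment above; the name fprime and the assumption that backprop.m computes the derivative elementwise from the input activations v{m} are guesses about the template's internals:

```matlab
%In the model code: rectified-linear transfer function in place of tanh
fact = @(v) max(v, 0);

%In backprop.m: the derivative of max(v,0) is 1 for v > 0 and 0 otherwise
for i = 1:length(v{m})
    fprime(i) = double(v{m}(i) > 0); %replaces the tanh derivative, 1 - tanh(v)^2
end
```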