I got very different training efficiency with the following network
net = patternnet(hiddenLayerSize);
and the following one
net = feedforwardnet(hiddenLayerSize, 'trainscg');
net.layers{1}.transferFcn = 'tansig';
net.layers{2}.transferFcn = 'softmax';
net.performFcn = 'crossentropy';
on the same data.
I was thinking networks should be the same.
What thing I forgot?
UPDATE
The code below demonstrates, that network behavior is uniquely depends on network creation function.
Each type of network was ran two times. This excludes random generator issues or something. Data is the same.
hiddenLayerSize = 10;
% pass 1, with patternnet
net = patternnet(hiddenLayerSize);
net.divideParam.trainRatio = 70/100;
net.divideParam.valRatio = 15/100;
net.divideParam.testRatio = 15/100;
[net,tr] = train(net,x,t);
y = net(x);
performance = perform(net,t,y);
fprintf('pass 1, patternnet, performance: %f\n', performance);
fprintf('num_epochs: %d, stop: %s\n', tr.num_epochs, tr.stop);
% pass 2, with feedforwardnet
net = feedforwardnet(hiddenLayerSize, 'trainscg');
net.layers{1}.transferFcn = 'tansig';
net.layers{2}.transferFcn = 'softmax';
net.performFcn = 'crossentropy';
net.divideParam.trainRatio = 70/100;
net.divideParam.valRatio = 15/100;
net.divideParam.testRatio = 15/100;
[net,tr] = train(net,x,t);
y = net(x);
performance = perform(net,t,y);
fprintf('pass 2, feedforwardnet, performance: %f\n', performance);
fprintf('num_epochs: %d, stop: %s\n', tr.num_epochs, tr.stop);
% pass 1, with patternnet
net = patternnet(hiddenLayerSize);
net.divideParam.trainRatio = 70/100;
net.divideParam.valRatio = 15/100;
net.divideParam.testRatio = 15/100;
[net,tr] = train(net,x,t);
y = net(x);
performance = perform(net,t,y);
fprintf('pass 3, patternnet, performance: %f\n', performance);
fprintf('num_epochs: %d, stop: %s\n', tr.num_epochs, tr.stop);
% pass 2, with feedforwardnet
net = feedforwardnet(hiddenLayerSize, 'trainscg');
net.layers{1}.transferFcn = 'tansig';
net.layers{2}.transferFcn = 'softmax';
net.performFcn = 'crossentropy';
net.divideParam.trainRatio = 70/100;
net.divideParam.valRatio = 15/100;
net.divideParam.testRatio = 15/100;
[net,tr] = train(net,x,t);
y = net(x);
performance = perform(net,t,y);
fprintf('pass 4, feedforwardnet, performance: %f\n', performance);
fprintf('num_epochs: %d, stop: %s\n', tr.num_epochs, tr.stop);
Output follows:
pass 1, patternnet, performance: 0.116445
num_epochs: 353, stop: Validation stop.
pass 2, feedforwardnet, performance: 0.693561
num_epochs: 260, stop: Validation stop.
pass 3, patternnet, performance: 0.116445
num_epochs: 353, stop: Validation stop.
pass 4, feedforwardnet, performance: 0.693561
num_epochs: 260, stop: Validation stop.
Looks like those two aren't quite the same:
>> net = patternnet(hiddenLayerSize);
>> net2 = feedforwardnet(hiddenLayerSize,'trainscg');
>> net.outputs{2}.processParams{2}
ans =
ymin: 0
ymax: 1
>> net2.outputs{2}.processParams{2}
ans =
ymin: -1
ymax: 1
The net.outputs{2}.processFcns{2}
is mapminmax
so I gather that one of these is re-scaling it's output to match the output range of your real data better.
For future reference, you can do nasty dirty things like compare the interior data structures by casting to struct. So I did something like
n = struct(net); n2 = struct(net2);
for fn=fieldnames(n)';
if(~isequaln(n.(fn{1}),n2.(fn{1})))
fprintf('fields %s differ\n', fn{1});
end
end
to help pinpoint the differences.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With