Octave: logistic regression: difference between fmincg and fminunc

I often use fminunc for logistic regression problems. I have read on the web that Andrew Ng uses fmincg instead of fminunc, with the same arguments. The results are different, and fmincg is often slightly more accurate, though not by much. (I am comparing the results of fmincg against fminunc on the same data.)

So, my question is: what is the difference between these two functions? What algorithm does each one implement? (Right now, I just use these functions without knowing exactly how they work.)
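For context, this is the kind of call I mean. A minimal sketch, where X, y, initial_theta, and costFunction are placeholders for my own data and cost function:

% Minimal sketch: costFunction must return [J, grad] so fminunc can
% use the analytic gradient (X, y, initial_theta are placeholders).
function [J, grad] = costFunction(theta, X, y)
  m = length(y);
  h = 1 ./ (1 + exp(-X * theta));                    % sigmoid hypothesis
  J = (-y' * log(h) - (1 - y)' * log(1 - h)) / m;    % logistic cost
  grad = (X' * (h - y)) / m;                         % gradient
end

options = optimset('GradObj', 'on', 'MaxIter', 400);
[theta, cost] = fminunc(@(t) costFunction(t, X, y), initial_theta, options);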

Thanks :)

asked Aug 24 '12 by hqt


2 Answers

You will have to look inside the code of fmincg because it is not part of Octave. After some searching, I found that it's a function file provided by the Machine Learning class on Coursera as part of the homework. Read the comments and answers on this question for a discussion about the algorithms.
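If you want to read it for yourself, Octave can print the source of any function file on the load path; assuming the course's fmincg.m is in your current directory or path, this will show its header comments:

% Print the source of fmincg (assumes fmincg.m is on the load path)
type fmincg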

answered Sep 17 '22 by carandraug

In contrast to other answers here suggesting that the primary difference between fmincg and fminunc is accuracy or speed, perhaps the most important difference for some applications is memory efficiency. In programming exercise 4 (i.e., neural network training) of Andrew Ng's Machine Learning class at Coursera, the comment in ex4.m about fmincg is:

%% =================== Part 8: Training NN ===================
% You have now implemented all the code necessary to train a neural
% network. To train your neural network, we will now use "fmincg", which
% is a function which works similarly to "fminunc". Recall that these
% advanced optimizers are able to train our cost functions efficiently as
% long as we provide them with the gradient computations.

Like the original poster, I was also curious about how the results of ex4.m might differ using fminunc instead of fmincg. So I tried to replace the fmincg call

options = optimset('MaxIter', 50);
[nn_params, cost] = fmincg(costFunction, initial_nn_params, options);

with the following call to fminunc

options = optimset('GradObj', 'on', 'MaxIter', 50);
[nn_params, cost, exit_flag] = fminunc(costFunction, initial_nn_params, options);

but got the following error message from a 32-bit build of Octave running on Windows:

error: memory exhausted or requested size too large for range of Octave's index type -- trying to return to prompt

A 32-bit build of MATLAB running on Windows provides a more detailed error message:

Error using find
Out of memory. Type HELP MEMORY for your options.
Error in spones (line 14)
[i,j] = find(S);
Error in color (line 26)
J = spones(J);
Error in sfminbx (line 155)
group = color(Hstr,p);
Error in fminunc (line 408)
[x,FVAL,~,EXITFLAG,OUTPUT,GRAD,HESSIAN] = sfminbx(funfcn,x,l,u, ...
Error in ex4 (line 205)
[nn_params, cost, exit_flag] = fminunc(costFunction, initial_nn_params, options);

The MATLAB memory command on my laptop computer reports:

Maximum possible array: 2046 MB (2.146e+09 bytes) *
Memory available for all arrays: 3402 MB (3.568e+09 bytes) **
Memory used by MATLAB: 373 MB (3.910e+08 bytes)
Physical Memory (RAM): 3561 MB (3.734e+09 bytes)
* Limited by contiguous virtual address space available.
** Limited by virtual address space available.

I had previously thought that Professor Ng chose fmincg to train the ex4.m neural network (which has 400 input features, 401 including the bias input) in order to increase training speed. However, I now believe his reason for using fmincg was to make the training memory-efficient enough to run on 32-bit builds of Octave/MATLAB. A short discussion of the work necessary to get a 64-bit build of Octave running on Windows is here.
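As a side note, the MATLAB stack trace above comes from the trust-region code in sfminbx. If you still want to experiment with fminunc, one thing to try is forcing the older medium-scale quasi-Newton algorithm, which avoids sfminbx entirely. This is only a sketch using the classic optimset flag; the exact option name depends on your MATLAB version, and I have not tested whether it fits in memory for this exercise:

% Force the line-search quasi-Newton path instead of the trust-region
% algorithm (classic optimset flag; version-dependent, untested here)
options = optimset('GradObj', 'on', 'MaxIter', 50, 'LargeScale', 'off');
[nn_params, cost, exit_flag] = fminunc(costFunction, initial_nn_params, options);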

answered Sep 21 '22 by gregS