libsvm's "grid.py" tries to optimize only two parameters of svm-train, "c" and "g". I wanted to extend "grid.py" to optimize over other parameters (for example "r" or "d") by running "grid.py" again and again for different parameters. I have some questions:
1. Is there already a script that can optimize parameters other than "c" and "g"?
2. Which parameters are most crucial, and what are their maximum/minimum ranges? Sometimes changing/optimizing one parameter automatically optimizes another. Is that the case with svm-train's parameters?
One can tune the SVM by changing the parameters \(C, \gamma\) and the kernel function. The tool for tuning parameters in scikit-learn is called GridSearchCV. Its first argument, estimator, is the model object to tune (for example an svm.SVC instance).
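As a minimal sketch of how that looks in practice: the parameter grids below follow the log-scale ranges suggested in the libsvm guide, but the dataset is an illustrative stand-in, not anything from the original question.

```python
# Sketch: tuning C and gamma for an RBF SVM with scikit-learn's GridSearchCV.
# The dataset and the exact grid ranges are illustrative assumptions.
from sklearn import svm, datasets
from sklearn.model_selection import GridSearchCV

X, y = datasets.load_iris(return_X_y=True)

# Log-scale grids are the usual recommendation for C and gamma.
param_grid = {
    "C": [2.0 ** k for k in range(-5, 16, 2)],
    "gamma": [2.0 ** k for k in range(-15, 4, 2)],
    "kernel": ["rbf"],
}

search = GridSearchCV(svm.SVC(), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```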
In R, SVMs can be tuned in much the same fashion as in Python. For the e1071 package (for example via tune.svm()), the kernel parameter can be set to "linear", "polynomial", "radial", "sigmoid", etc., and the gamma value can be tuned through the gamma parameter.
SVM maximizes the geometric margin by learning a suitable decision boundary (also called a decision surface or separating hyperplane).
In SVM, to avoid overfitting, we choose a soft margin instead of a hard one, i.e. we intentionally let some data points enter our margin (but we still penalize them) so that our classifier doesn't overfit the training sample. This is where the cost parameter \(C\) comes in; together with the kernel parameter \(\gamma\), it controls overfitting in SVM.
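For reference, this soft-margin idea corresponds to the standard primal problem (the usual textbook/libsvm formulation; the notation below is not from the question itself):

\[
\min_{w,\,b,\,\xi}\ \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{n}\xi_i
\quad\text{s.t.}\quad y_i\bigl(w^\top \phi(x_i) + b\bigr) \ge 1 - \xi_i,\ \ \xi_i \ge 0,
\]

where the slack variables \(\xi_i\) measure margin violations. A larger \(C\) penalizes violations more heavily, narrowing the margin to fit the training data more tightly.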
As far as I know there is no script that does this; however, I don't see why grid.py couldn't easily be extended to do so. That said, I don't think it's worth the effort.
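If you did want to try it, here is a rough sketch of what such an extension might look like: a brute-force grid over c, g and d that shells out to svm-train's built-in cross-validation mode (-v). The data file name and the parameter ranges are illustrative assumptions, and this skips grid.py's niceties like parallelism.

```python
# Hypothetical sketch: grid search over c, g and d by calling svm-train
# with 5-fold cross validation (-v 5) and parsing its accuracy output.
# "train.scale" is a placeholder file name; ranges are illustrative.
import itertools
import re
import subprocess

best = (0.0, None)
for log2c, log2g, d in itertools.product(range(-5, 16, 2),
                                         range(-15, 4, 2),
                                         range(2, 6)):
    cmd = ["svm-train", "-t", "1",          # -t 1 selects the polynomial kernel
           "-c", str(2.0 ** log2c),
           "-g", str(2.0 ** log2g),
           "-d", str(d),
           "-v", "5", "train.scale"]
    out = subprocess.run(cmd, capture_output=True, text=True).stdout
    m = re.search(r"Cross Validation Accuracy = ([\d.]+)%", out)
    if m and float(m.group(1)) > best[0]:
        best = (float(m.group(1)), (log2c, log2g, d))

print("best accuracy %.2f%% at (log2c, log2g, d) = %s" % best)
```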
First of all, you need to choose your kernel. This is a parameter in itself. Each kernel has a different set of parameters, and will perform differently, so in order to compare kernels you will have to optimize each kernel's parameters.
C, the cost parameter, is an overall parameter that applies to the SVM itself; the other parameters are all inputs to the kernel function. C controls the tradeoff between a wide margin with more misclassified training points (but a model that may generalize better to future data) and a narrow margin that fits the training points better but may be overfit to the training data.
Generally, the two most widely used kernels are linear (which requires no parameters) and the RBF kernel.
The RBF kernel takes the gamma parameter. This must be optimized; its value will significantly affect performance.
If you are using the polynomial kernel, d (the degree) is the main parameter, and you would optimize that. It doesn't make sense to change the other parameters from their defaults unless you have some mathematical reason why doing so would better fit your data. In my experience the polynomial kernel can give good results, but a minuscule increase, if any, over the RBF kernel, at a huge computational cost.
It is similar with the sigmoid kernel: gamma is your main parameter; optimize that, and leave coef0 at its default unless you have a good understanding of why changing it would better fit your data.
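For reference, these are libsvm's built-in kernel functions and the svm-train flags that set their parameters (this mapping comes from the libsvm documentation, not from the question):

- linear (-t 0): \(K(u, v) = u^\top v\)
- polynomial (-t 1): \(K(u, v) = (\gamma\, u^\top v + r)^d\), set by -g, -r, -d
- RBF (-t 2): \(K(u, v) = \exp(-\gamma \|u - v\|^2)\), set by -g
- sigmoid (-t 3): \(K(u, v) = \tanh(\gamma\, u^\top v + r)\), set by -g, -r

So "-r" is the coef0 mentioned above and "-d" is the polynomial degree.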
So the reason grid.py does not optimize other parameters is that in most cases it's simply unnecessary and generally won't result in an improvement in performance.

As for your second question: no, this is not a case where optimizing one parameter automatically optimizes another. The optimal values of these parameters are specific to your dataset, and changing the value of the kernel parameters will affect the optimal value of C. This is why a grid search is recommended. Adding these extra parameters to your search will significantly increase the time it takes and is unlikely to give you an increase in classifier performance.