I can't seem to pass the parameters correctly when training a Random Forest classifier in OpenCV from Python.
I wrote an implementation in C++ that works correctly, but I do not get the same results in Python.
I found some sample code here: http://fossies.org/linux/misc/opencv-2.4.7.tar.gz:a/opencv-2.4.7/samples/python2/letter_recog.py
which seems to indicate that you should pass in the parameters in a dict. Here is the code I am using:
rtree_params = dict(max_depth=11,
                    min_sample_count=5,
                    use_surrogates=False,
                    max_categories=15,
                    calc_var_importance=False,
                    n_active_vars=0,
                    max_num_of_trees_in_the_forest=1000,
                    termcrit_type=cv2.TERM_CRITERIA_MAX_ITER)
classifier = cv2.RTrees()
classifier.train(train_data, cv2.CV_ROW_SAMPLE, label_data, params=rtree_params)
I can tell that the classifier is getting trained correctly, but it is not nearly as accurate as the one I trained with the same parameters in C++. I'm fairly certain that the parameters are getting acknowledged, because I get different results when I tweak the values.
I did notice that when I write the classifier out to a file, it only has one tree. I'm pretty sure this is the problem. I looked at the OpenCV implementation:
http://www.code.opencv.org/svn/gsoc2012/denoising/trunk/opencv-2.4.2/modules/ml/src/rtrees.cpp
Given my parameters, it should output a forest with 1000 trees. I tried setting the max_num_of_trees_in_the_forest argument to all sorts of crazy values, and it didn't change OpenCV's behaviour.
Thoughts?
Not sure if this will help much, but I believe:
n_active_vars=0
should be
nactive_vars=0
Also, you may wish to try experimenting with the term_crit parameter. For example, try adding:
term_crit=(cv2.TERM_CRITERIA_MAX_ITER,1000,1)
to your dictionary.
I believe this sets the termination criterion so that training stops once 1000 trees have been added to the forest.
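Putting both suggestions together, a minimal sketch of the corrected dictionary might look like the following. The helper names build_rtree_params and train_forest are made up for illustration, and the other keys are simply carried over from the question. I have not verified which of them the 2.4 Python binding actually honours. cv2.RTrees only exists in the OpenCV 2.4.x Python API, so the training function refuses to run on other versions. The fallback value 1 for TERM_CRITERIA_MAX_ITER is only there so the dictionary can be inspected without OpenCV installed.

```python
# cv2.TERM_CRITERIA_MAX_ITER is 1 in OpenCV; the fallback keeps this
# sketch importable on machines without cv2.
try:
    import cv2
    MAX_ITER = cv2.TERM_CRITERIA_MAX_ITER
except ImportError:
    cv2 = None
    MAX_ITER = 1


def build_rtree_params(n_trees=1000):
    """Hypothetical helper: parameter dict for cv2.RTrees.train (OpenCV 2.4)."""
    return dict(
        max_depth=11,
        min_sample_count=5,
        use_surrogates=False,
        max_categories=15,
        calc_var_importance=False,
        nactive_vars=0,  # no underscore between "n" and "active"
        max_num_of_trees_in_the_forest=n_trees,
        termcrit_type=MAX_ITER,
        # (type, max_iter, epsilon): stop after n_trees iterations (trees)
        term_crit=(MAX_ITER, n_trees, 1),
    )


def train_forest(train_data, label_data, n_trees=1000):
    """Train a random forest; assumes the OpenCV 2.4 cv2.RTrees API."""
    if cv2 is None or not hasattr(cv2, "RTrees"):
        raise RuntimeError("this sketch needs the OpenCV 2.4 cv2.RTrees API")
    classifier = cv2.RTrees()
    classifier.train(train_data, cv2.CV_ROW_SAMPLE, label_data,
                     params=build_rtree_params(n_trees))
    return classifier
```

After training, saving the classifier to a file again and checking the number of trees is a quick way to see whether term_crit was picked up.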