Question

I can't seem to correctly pass in the parameters to train a Random Forest classifier in OpenCV from Python.

I wrote an implementation in C++ which worked correctly, but I do not get the same results in Python.

I found some sample code here: http://fossies.org/linux/misc/opencv-2.4.7.tar.gz:a/opencv-2.4.7/samples/python2/letter_recog.py

which seems to indicate that you should pass in the parameters in a dict. Here is the code I am using:

rtree_params = dict(max_depth=11, min_sample_count=5, use_surrogates=False, max_categories=15, calc_var_importance=False, n_active_vars=0, max_num_of_trees_in_the_forest=1000, termcrit_type=cv2.TERM_CRITERIA_MAX_ITER)
classifier = cv2.RTrees()
classifier.train(train_data, cv2.CV_ROW_SAMPLE, label_data, params=rtree_params)

I can tell that the classifier is getting trained correctly, but it is not nearly as accurate as the one I trained with the same parameters in C++. I'm fairly certain that the parameters are getting acknowledged, because I get different results when I tweak the values.

I did notice that when I output the classifier to a file, it only has one tree. I'm pretty sure this is the problem. I looked at the OpenCV implementation:

http://www.code.opencv.org/svn/gsoc2012/denoising/trunk/opencv-2.4.2/modules/ml/src/rtrees.cpp

Given my parameters, it should output a forest with 1000 trees. I tried setting the max_num_of_trees_in_the_forest argument to all sorts of crazy values, and it didn't change OpenCV's behaviour.

Thoughts?

Solution

Not sure if this will help much, but I believe:

n_active_vars=0

should be

nactive_vars=0

Also, you may wish to try experimenting with the term_crit parameter. For example, try adding:

term_crit=(cv2.TERM_CRITERIA_MAX_ITER,1000,1)

into your dictionary.

I believe this sets the termination criterion to (type, maximum count, epsilon), so that training stops once 1000 trees have been added to the forest.
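Putting both suggestions together, the dictionary might look like the sketch below. The helper name build_rtree_params is hypothetical, and the fallback value 1 for cv2.TERM_CRITERIA_MAX_ITER is there only so the snippet runs without OpenCV installed. Whether the 2.4 Python bindings actually honour every key is an assumption worth checking by inspecting the saved model.

```python
# Sketch only: key names follow the question and the answer above.
try:
    import cv2
    MAX_ITER = cv2.TERM_CRITERIA_MAX_ITER
except ImportError:
    # Fallback so the snippet runs without OpenCV;
    # 1 is the numeric value of cv2.TERM_CRITERIA_MAX_ITER.
    MAX_ITER = 1


def build_rtree_params(n_trees=1000, epsilon=1):
    """Hypothetical helper: the question's params with the two suggested changes."""
    return dict(
        max_depth=11,
        min_sample_count=5,
        use_surrogates=False,
        max_categories=15,
        calc_var_importance=False,
        nactive_vars=0,  # renamed from n_active_vars
        max_num_of_trees_in_the_forest=n_trees,
        termcrit_type=MAX_ITER,
        # Added key: (type, maximum number of trees, epsilon)
        term_crit=(MAX_ITER, n_trees, epsilon),
    )


rtree_params = build_rtree_params()
print(rtree_params["term_crit"])
```

With OpenCV 2.4 installed, this dictionary would be passed exactly as in the question: classifier.train(train_data, cv2.CV_ROW_SAMPLE, label_data, params=rtree_params).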

Licensed under: CC-BY-SA with attribution