Trivial random forest with OpenCV doesn't work and isn't the same as sklearn

https://stackoverflow.com/questions/21836459

12-10-2022

Question

I'm trying to get the simplest possible random forest example to work. The training data consists of two points: {0,0} with label 0 and {1,1} with label 1. The sample to predict is {2,2}. OpenCV returns 0 rather than the expected 1. Here is the OpenCV code in C++ (main.cpp):

#include <iostream>
#include <opencv2/core/core.hpp>
#include <opencv2/ml/ml.hpp>

using namespace std;
using namespace cv;

int main(int argc, char const *argv[]) {
  cout << " hi \n";
  float trainingData[2][2] = { {0.0, 0.0}, {1.0, 1.0}};
  Mat training_data(2, 2, CV_32FC1, trainingData);
  float trainingClass[2] = {0.0,1.0};
  Mat training_class(2, 1, CV_32FC1, trainingClass);
  CvRTrees rtree;
  rtree.train(training_data, CV_ROW_SAMPLE, training_class);
  float sampleData[2] = {2.0, 2.0};
  Mat sample_data(2, 1, CV_32FC1, sampleData);
  cout << rtree.predict(sample_data) << "  <-- predict\n";
  return 0;
}

CMake file (CMakeLists.txt):

cmake_minimum_required(VERSION 2.8)
project( main )
find_package( OpenCV REQUIRED )
add_executable( main main.cpp )
target_link_libraries( main ${OpenCV_LIBS} )

running:

> cmake .;make;./main
 hi 
0  <-- predict

For comparison, here is the equivalent scikit-learn code in Python (rfc.py):

from sklearn.ensemble import RandomForestClassifier
X = [[0, 0], [1, 1]]
Y = [0, 1]
clf = RandomForestClassifier(n_estimators=10)
clf = clf.fit(X, Y)
print(clf.predict([[2., 2.]]))

running:

> python rfc.py 
[1]

Solution

The number of training points is too small. If I train on 3 points instead of 2, everything works.

Setting min_sample_count (the minimum number of samples a node needs before it can be split) to 2 also works.
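The effect of such a threshold can be reproduced on the scikit-learn side. This is a minimal sketch, not a description of OpenCV's internals: it assumes that scikit-learn's min_samples_split plays a role comparable to OpenCV's min_sample_count, and it sets bootstrap=False and random_state=0 (both choices made here, not taken from the question) so that every tree sees both points and the result is repeatable.

```python
from sklearn.ensemble import RandomForestClassifier

# Same toy data as in the question.
X = [[0, 0], [1, 1]]
Y = [0, 1]

# Threshold above the number of training points: the root node holds
# 2 samples, which is fewer than min_samples_split=3, so it is never split.
no_split = RandomForestClassifier(
    n_estimators=10, min_samples_split=3, bootstrap=False, random_state=0
).fit(X, Y)

# Threshold of 2 (scikit-learn's default): the root node may be split.
can_split = RandomForestClassifier(
    n_estimators=10, min_samples_split=2, bootstrap=False, random_state=0
).fit(X, Y)

pred_no_split = no_split.predict([[2.0, 2.0]])
pred_can_split = can_split.predict([[2.0, 2.0]])
print(pred_no_split, pred_can_split)
```

With the higher threshold every tree is a single leaf that rates both classes equally, so the tie resolves to the first class and the forest predicts 0, the same symptom as in the OpenCV run. With the threshold at 2 the trees separate the two points and {2,2} is predicted as 1.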

Licensed under: CC-BY-SA with attribution
