Question

After creating a random forest, I use it to predict against an out-of-sample test data set. However, consecutive calls to predict produce different results:

pred <- predict(rf, test)
pred1 <- predict(rf, test)
which(pred != pred1)
[1]  327  436  492  555  560  738 1264 1336 1339 1521 1772 1775 1780 1820 1826
[16] 2018 2019 2022 2023 2031 2099 2104 2238 2267 2621 3021 3029 3376 3467

Any ideas on what is making this nondeterministic?

Solution

With an even number of trees, the class votes for an observation can tie, and predict breaks ties at random, so repeated calls can disagree on exactly those rows. From the randomForest doc:

NOTE2: Any ties are broken at random, so if this is undesirable, avoid it by using odd number ntree in randomForest().

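So if consistent results are desired, use an odd number of trees. This rules out ties only for two-class problems; with three or more classes the votes can still tie when ntree is odd (for example, 2-2-1 with five trees).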
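
A minimal sketch of how this could be checked, not the asker's actual setup: the two-class subset of iris, the names d, rf_even, rf_odd and tied_rows, and the ntree values are all assumptions made for illustration. It lists the rows whose raw vote counts tie in an even-sized forest and compares repeated predictions from an odd-sized one. The last part assumes the tie-breaking draws from R's random number generator, in which case reseeding before each call should also make the even-sized forest repeatable.

```r
library(randomForest)

# Assumption: a two-class subset of iris stands in for the asker's data.
d <- droplevels(iris[iris$Species != "setosa", ])

set.seed(42)
rf_even <- randomForest(Species ~ ., data = d, ntree = 10)
rf_odd  <- randomForest(Species ~ ., data = d, ntree = 11)

# Raw vote counts per class; equal counts in a row mean a tie,
# and only those rows can change between calls to predict().
votes <- predict(rf_even, d, type = "vote", norm.votes = FALSE)
tied_rows <- which(votes[, 1] == votes[, 2])

# With two classes and an odd ntree no tie is possible,
# so repeated calls should agree.
odd_stable <- identical(predict(rf_odd, d), predict(rf_odd, d))

# Assumption: tie-breaking uses R's RNG, so fixing the seed before
# each call should reproduce the even-ntree predictions too.
set.seed(1); p1 <- predict(rf_even, d)
set.seed(1); p2 <- predict(rf_even, d)
even_stable_with_seed <- identical(p1, p2)
```

If tied_rows is empty for a given forest, there is nothing for the random tie-break to act on and the predictions will match regardless of ntree.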

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow