randomForest in R: Is there a possibility of calculating casewise confidence intervals?

StackOverflow https://stackoverflow.com/questions/14709711

Question

The R package randomForest reports mean squared errors for each tree in the forest. However, I need a measure of confidence for each case in the data. Since randomForest calculates the casewise predictions by averaging the predictions of the individual trees, I guess it should also be possible to calculate a casewise standard error and thus a confidence interval. Can this be done using the returned randomForest object (and if so, how?), or do I have to dig into the source code?


Solution

No need to dig into the source code. You only need to read the documentation. ?predict.randomForest states that one of its arguments is called predict.all:

predict.all Should the predictions of all trees be kept?

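Setting it to TRUE keeps the prediction of every tree for every case: predict() then returns a list whose individual component is a matrix with one row per case and one column per tree. From those per-tree predictions you can calculate a standard error for each case.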
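As a minimal sketch of that idea, the following fits a regression forest and summarises the per-tree predictions for each case. The mtcars data, the seed, and the variable names (tree_preds, case_sd, and so on) are illustrative assumptions, not part of the original answer. The spread across trees is only a rough indicator of uncertainty, not a calibrated confidence interval.

```r
library(randomForest)

set.seed(42)
# Built-in data set used purely for illustration; mpg is the response.
rf <- randomForest(mpg ~ ., data = mtcars, ntree = 500)

# predict.all = TRUE returns a list with two components:
#   $aggregate  - the usual forest prediction for each case
#   $individual - matrix with one row per case and one column per tree
pred <- predict(rf, newdata = mtcars, predict.all = TRUE)
tree_preds <- pred$individual

# Casewise mean and standard deviation across the trees
case_mean <- rowMeans(tree_preds)
case_sd <- apply(tree_preds, 1, sd)

# Naive casewise interval: empirical 2.5% and 97.5% quantiles of the tree predictions
case_interval <- t(apply(tree_preds, 1, quantile, probs = c(0.025, 0.975)))

result <- data.frame(
  prediction = pred$aggregate,
  sd = case_sd,
  lower = case_interval[, 1],
  upper = case_interval[, 2]
)
head(result)
```

Note that predicting on the training data, as done here for brevity, uses trees that saw each case during fitting, so the spread will tend to look tighter than it would on genuinely new data.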

I have recently been made aware of this paper by Stefan Wager, Trevor Hastie and Brad Efron ("Confidence Intervals for Random Forests: The Jackknife and the Infinitesimal Jackknife"), which investigates more rigorously the idea of standard errors for the predictions generated by random forests (and other bagged predictors).

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow