Assessing significance / confidence of a cross-validated performance measure
16-10-2019
Problem
I have a prediction model $P$ and I use some performance measure $I$ to assess $P$'s accuracy. The distribution of $I$ is unknown (it is a custom metric, loosely similar to precision).
My validation procedure is as follows:
- Randomly split the data to $k$ stratified folds
- Fit $k$ models
- Evaluate each model with $I$ (yielding $k$ cross-validated values of $I$)
- Report the final model performance as the average of the $k$ cross-validated values of $I$.
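The procedure above can be sketched as follows. This is a minimal illustration, not the asker's actual code: the model, data, and the metric `my_metric` (standing in for the custom metric $I$, implemented here as plain precision) are all hypothetical placeholders.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold

def my_metric(y_true, y_pred):
    # Placeholder for the custom precision-like metric I.
    predicted_positive = np.sum(y_pred == 1)
    if predicted_positive == 0:
        return 0.0
    return np.sum((y_pred == 1) & (y_true == 1)) / predicted_positive

# Synthetic binary-classification data (hypothetical).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + rng.normal(size=200) > 0).astype(int)

# Randomly split into k stratified folds, fit k models,
# and collect k cross-validated values of I.
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = []
for train_idx, test_idx in skf.split(X, y):
    model = LogisticRegression().fit(X[train_idx], y[train_idx])
    scores.append(my_metric(y[test_idx], model.predict(X[test_idx])))

# Final performance estimate: the mean of the k values of I.
print(np.mean(scores))
```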
I would like to perform some significance testing, i.e., attach a confidence statement to the estimated model performance: to what extent can I be "sure" of this value of $I$?
Solution
You may use bootstrapping to estimate a confidence interval for the prediction error. Some Stanford online course slides offer further guidance, though I haven't tried this myself.
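One simple variant is to bootstrap the $k$ per-fold scores themselves: resample them with replacement many times and take percentiles of the resampled means as an approximate interval. A sketch, assuming `scores` holds the $k$ cross-validated values of $I$ (the numbers below are made up); note the fold scores are not independent, so this interval is only a rough approximation:

```python
import numpy as np

def bootstrap_ci(scores, n_boot=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean of per-fold scores."""
    rng = np.random.default_rng(seed)
    scores = np.asarray(scores, dtype=float)
    # Draw n_boot resamples (with replacement) and average each one.
    boot_means = rng.choice(
        scores, size=(n_boot, len(scores)), replace=True
    ).mean(axis=1)
    lo, hi = np.percentile(boot_means, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return lo, hi

# Hypothetical per-fold values of I from 5-fold cross-validation.
scores = [0.71, 0.68, 0.74, 0.70, 0.69]
lo, hi = bootstrap_ci(scores)
print(f"95% CI for mean I: [{lo:.3f}, {hi:.3f}]")
```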
Besides that, it should be no problem to compare the performance estimate obtained with cross-validation (mean and standard deviation) against a reference point (e.g., AUC = 0.5) or against the results of a benchmark model such as logistic regression or a nearest-neighbor classifier, using a simple statistical test at a given confidence level.
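As an illustration of the reference-point comparison, one could run a one-sample t-test of the per-fold scores against a fixed value (here 0.5, analogous to chance-level AUC). The scores below are made up, and since the folds share training data the test is only approximate:

```python
import numpy as np
from scipy import stats

# Hypothetical per-fold values of I from 5-fold cross-validation.
scores = np.array([0.71, 0.68, 0.74, 0.70, 0.69])

# One-sample t-test: is the mean of I significantly different from 0.5?
t_stat, p_value = stats.ttest_1samp(scores, popmean=0.5)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```

For comparing against another model, `scipy.stats.ttest_rel` on paired per-fold scores of the two models would play the analogous role.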
A similar question is discussed here.