10-fold cross-validation in Weka

https://stackoverflow.com/questions/18238417

24-06-2022

Question

I am a bit confused about the difference between the 10-fold cross-validation available in Weka and traditional 10-fold cross-validation. I understand the concept of K-fold cross-validation, but from what I have read, 10-fold cross-validation in Weka is a little different.

In Weka, a model is FIRST built on ALL the data. Only then is 10-fold cross-validation carried out. In traditional 10-fold cross-validation no model is built beforehand; instead, 10 models are built, one in each iteration (please correct me if I'm wrong!). But if this is the case, what on earth does Weka do during 10-fold cross-validation? Does it build a new model for each of the ten iterations, or does it reuse the previously built model? Thanks!

Solution

As far as I know, cross-validation in Weka (like the other evaluation methods) is only used to estimate the generalisation error. That is, the (implicit) assumption is that you want to use the learned model on data that you didn't give to Weka (also called a "validation set"). Hence the model that you get is trained on the entire data set.

During the cross-validation, it trains and evaluates a number of different models (10 in your case), each on a different split of the data, to estimate how well the learned model generalises. You don't actually see these models; they are only used internally and then discarded. The model that is shown, the one trained on all the data, isn't itself evaluated.
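
To make the procedure concrete, here is a minimal sketch in plain Python. It is not Weka's code and does not use Weka's API: the "learner" is a toy majority-class predictor, and the names `train_majority` and `cross_validate` are made up for this illustration. The folds are also not stratified, which is a simplification. The point is only to show that ten throwaway models produce the error estimate, while the model you keep is trained separately on all the data.

```python
import random
from collections import Counter


def train_majority(labels):
    """Toy 'learner' (hypothetical): the model is the most frequent training label."""
    return Counter(labels).most_common(1)[0][0]


def cross_validate(labels, k=10, seed=1):
    """Return (estimated_accuracy, number_of_fold_models, final_model)."""
    indices = list(range(len(labels)))
    random.Random(seed).shuffle(indices)
    # Split the shuffled indices into k folds of (nearly) equal size.
    folds = [indices[i::k] for i in range(k)]

    correct = 0
    models_built = 0
    for held_out in folds:
        held_out_set = set(held_out)
        train = [labels[i] for i in indices if i not in held_out_set]
        # A fresh model is trained for every fold, on the other k-1 folds only.
        fold_model = train_majority(train)
        models_built += 1
        # It is scored on the held-out fold and then thrown away.
        correct += sum(1 for i in held_out if labels[i] == fold_model)

    accuracy = correct / len(labels)
    # The model that is reported to the user is trained on ALL the data.
    # It is never scored itself; 'accuracy' is the estimate of how well it generalises.
    final_model = train_majority(labels)
    return accuracy, models_built, final_model


labels = ["yes"] * 70 + ["no"] * 30
accuracy, models_built, final_model = cross_validate(labels)
print(models_built, final_model, accuracy)
```

With this toy data, ten fold models are built and discarded, and an eleventh model trained on all 100 instances is the one you would actually keep. Whether that full model is built before or after the folds makes no difference to the result.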

Licensed under: CC-BY-SA with attribution
