Question

I am trying to interpret this chart.

I am not sure how to interpret this. For example, the LGBM validation error boxplot is wide and similar to the training boxplot, which would suggest there is no overfitting problem. But when I look at other charts from the same LGBM run, I can see that LGBM really is overfitting, so I don't know how to interpret this correctly.

[Chart: boxplots of training and validation F1 scores for the LightGBM, RandomForest, and bagging models]

Beyond that, I don't know how to interpret it further than this:

LightGBM may be the best option because it is faster and still gives enough accuracy, and compared with the other two, bagging overfits less because the gap between its training and validation errors is smaller.

Any ideas?

Thanks


Solution

Your chart seems to show that the LightGBM models are very inconsistent in terms of F1 score. The other two model types tend to have lower validation accuracy than training accuracy, suggesting overfitting is occurring to some extent (but this is ubiquitous in machine learning, so it's not a deal breaker by any means). The best median validation performance is from RandomForest; however, some of its outlier runs underperformed the models using bagging. A possible approach would be to use an ensemble of RandomForest models.
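To make that comparison concrete, here is a minimal sketch of how a chart like yours could be produced and the train-validation gap quantified for each model. It assumes a scikit-learn-style workflow with `lightgbm` installed; the synthetic dataset, hyperparameters, and repeated 5-fold CV setup are placeholders, not your actual configuration.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import RepeatedStratifiedKFold, cross_validate
from lightgbm import LGBMClassifier

# Synthetic stand-in for the real dataset; replace X, y with your own data.
X, y = make_classification(n_samples=2000, n_features=20, n_informative=10,
                           random_state=42)

models = {
    "LightGBM": LGBMClassifier(n_estimators=200, random_state=42),
    "RandomForest": RandomForestClassifier(n_estimators=200, random_state=42),
    "Bagging": BaggingClassifier(DecisionTreeClassifier(), n_estimators=200,
                                 random_state=42),
}

# Repeated CV gives a distribution of scores per model, which is what the
# boxplots summarize.
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=5, random_state=42)

results = {}
for name, model in models.items():
    scores = cross_validate(model, X, y, scoring="f1", cv=cv,
                            return_train_score=True, n_jobs=-1)
    results[name] = scores
    gap = scores["train_score"].mean() - scores["test_score"].mean()
    print(f"{name}: train F1 = {scores['train_score'].mean():.3f}, "
          f"val F1 = {scores['test_score'].mean():.3f}, gap = {gap:.3f}")

# Side-by-side boxplots of train vs. validation F1 for each model.
data, labels = [], []
for name, scores in results.items():
    data += [scores["train_score"], scores["test_score"]]
    labels += [f"{name}\ntrain", f"{name}\nval"]

fig, ax = plt.subplots(figsize=(9, 4))
ax.boxplot(data)
ax.set_xticks(range(1, len(labels) + 1))
ax.set_xticklabels(labels)
ax.set_ylabel("F1 score")
ax.set_title("Train vs. validation F1 across repeated CV folds")
plt.tight_layout()
plt.show()
```

The printed gap between mean train and validation F1 is a simple way to compare how much each model overfits, and the width of each validation boxplot shows how consistent it is across folds. For the suggested ensemble of RandomForest models, one option (an assumption, not the only way) is `BaggingClassifier(RandomForestClassifier(n_estimators=100), n_estimators=10)`, which averages several independently seeded forests; whether it actually helps depends on your data.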

Licensed under: CC-BY-SA with attribution