Question

I am trying to build a binary classifier. I have tried deep neural networks with various structures and parameters, and I was not able to get anything better than

Train set accuracy : 0.70102
Test set accuracy : 0.70001

Then I tried classical machine learning algorithms such as KNN, decision trees, etc., and I found that scikit-learn's RandomForestClassifier with n_estimators=100 gave me

Train set accuracy : 1.0
Test set accuracy : 0.924068

I tried adjusting other parameters such as max_depth and criterion, but the decrease in training set accuracy also caused the test set accuracy to drop, for example:

Train set accuracy : 0.82002
Test set accuracy : 0.75222
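
For context, here is roughly the kind of comparison I am running, sketched with synthetic data from scikit-learn's make_classification standing in for my real dataset:

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split

    # Synthetic stand-in for my real binary-classification data
    X, y = make_classification(n_samples=10000, n_features=20, random_state=42)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42)

    # The random forest mentioned above
    clf = RandomForestClassifier(n_estimators=100, random_state=42)
    clf.fit(X_train, y_train)
    print("Train set accuracy :", accuracy_score(y_train, clf.predict(X_train)))
    print("Test set accuracy  :", accuracy_score(y_test, clf.predict(X_test)))

    # Constraining the trees (e.g. max_depth) lowers train accuracy,
    # but in my case it dragged test accuracy down as well
    clf_shallow = RandomForestClassifier(n_estimators=100, max_depth=5,
                                         random_state=42)
    clf_shallow.fit(X_train, y_train)
    print("Shallow test accuracy:", accuracy_score(y_test, clf_shallow.predict(X_test)))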

My question is: is this

Train set accuracy : 1.0
Test set accuracy : 0.924068

acceptable? Even though the model is overfitting, the test set accuracy is better.


Solution

If you properly isolate your test set such that it doesn't affect training, you should only look at the test set accuracy. Here are some of my remarks:

  • Having your model be really good on the train set is not a bad thing in itself. On the contrary, if the test accuracy is identical, you want to pick the model with the better train accuracy.
  • You want to look at the test accuracy. That is your primary concern. So pick the model that provides the best performance on the test set (see the sketch after this list).
    • Overfitting is not when your train accuracy is really high (or even 100%). It is when your train accuracy is high and your test accuracy is low.
    • It is not abnormal that your train accuracy is higher than your test accuracy. After all, your model has an advantage on the train set, since it has already been given the correct answers.
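
To make that selection step concrete, here is a minimal sketch of choosing between candidate models by held-out accuracy (the models and synthetic data here are placeholders, not the asker's actual setup):

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=5000, n_features=20, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=0)

    candidates = {
        "knn": KNeighborsClassifier(),
        "tree": DecisionTreeClassifier(random_state=0),
        "forest": RandomForestClassifier(n_estimators=100, random_state=0),
    }

    # Rank models by held-out accuracy, not by train accuracy
    scores = {name: model.fit(X_train, y_train).score(X_test, y_test)
              for name, model in candidates.items()}
    print(scores, "-> pick:", max(scores, key=scores.get))

(In practice you would often reserve a separate validation set for this choice and keep the test set for one final, untouched estimate.)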

At the end of the day, training a machine learning model is like studying for a test. You (the model) use learning resources such as books, past exams, flash cards, etc. (the train set) to perform well on an exam (the test set). Knowing your learning resources perfectly doesn't mean you are overfitting; you would be overfitting if that were all you knew and you couldn't perform well on the exam at all.

OTHER TIPS

The purpose of a model is always to minimize loss, not to increase accuracy directly. So the parameters of any model, trained with any optimizer such as Adam (a common choice), will move toward the parameter values where the loss is smallest, in other words the "minimum deviation".
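
To illustrate the loss-versus-accuracy distinction, here is a small sketch (the probability values are made up): two sets of predictions can have identical accuracy while one has a much lower loss, and it is the loss that the optimizer actually pushes down:

    import numpy as np
    from sklearn.metrics import accuracy_score, log_loss

    y_true = np.array([1, 0, 1, 1])

    # Confident vs. hesitant probabilities for the positive class;
    # thresholding at 0.5 gives identical labels, hence identical accuracy
    p_confident = np.array([0.95, 0.05, 0.90, 0.85])
    p_hesitant = np.array([0.55, 0.45, 0.60, 0.55])

    for name, p in [("confident", p_confident), ("hesitant", p_hesitant)]:
        labels = (p >= 0.5).astype(int)
        print(name,
              "accuracy:", accuracy_score(y_true, labels),
              "log loss:", round(log_loss(y_true, p), 4))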

Models can overfit when:

  • the dataset is small (see the learning-curve sketch after this list)
  • the train-to-test split ratio is imbalanced
  • the model has improper gates or overly rigid neurons (i.e. neurons that give high weight to the most recent input and stay locked without considering other inputs)
  • in DNNs, the intermediate weights don't have a forget/reset factor (usually 0.2)
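
The small-data point is easy to see with a learning curve: with little data the train score is high while the cross-validated score lags, and the gap narrows as more data is added. A minimal sketch using scikit-learn's learning_curve on synthetic data:

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import learning_curve

    X, y = make_classification(n_samples=5000, n_features=20, random_state=0)

    sizes, train_scores, val_scores = learning_curve(
        RandomForestClassifier(n_estimators=100, random_state=0),
        X, y, train_sizes=np.linspace(0.1, 1.0, 5), cv=5, scoring="accuracy")

    # The train/validation gap is a direct read-out of overfitting
    for n, tr, va in zip(sizes, train_scores.mean(axis=1), val_scores.mean(axis=1)):
        print(f"n={n:5d}  train={tr:.3f}  val={va:.3f}  gap={tr - va:.3f}")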

But in your case the accuracy is not extreme (above 0.99 or so), so it is safe to say that your model is performing well and is not overfitting. Good models do not overfit; they converge steadily to a value such as 0.924, as in your case.
