Question

My task is the following:

To take drug combinations as input and output the renal-failure-related symptoms associated with those combinations.

Both the drug combinations and renal-failure related symptoms are represented as one-hot encoded (for example, someone getting symptom 1 and symptom 3 out of a total of 4 symptoms is represented as [1,0,1,0]).
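As a concrete sketch of that multi-hot encoding in Python (the helper name here is hypothetical, just for illustration):

```python
def encode_labels(active_indices, num_labels):
    """Multi-hot encode a set of active label indices (0-based)."""
    vec = [0] * num_labels
    for i in active_indices:
        vec[i] = 1
    return vec

# Symptoms 1 and 3 out of 4 (1-based, as in the example above)
# correspond to 0-based indices 0 and 2:
print(encode_labels([0, 2], 4))  # -> [1, 0, 1, 0]
```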

So far, I have run the data through the following models, which produced the graphs below. The left-hand graph depicts the training and validation loss of the models over epochs, and the right-hand graph depicts the training and validation accuracy over epochs.

Loss and Accuracy profile of model performance over epochs

From the left-hand graph it seems clear that the Baseline model is overfitting, as the training loss is decreasing while the validation loss is increasing over epochs. However, the right-hand graph suggests that validation accuracy continues to improve, despite the increase in validation loss.

After adding dropout and L2 regularisation (Baseline Reg), the validation loss no longer rises as much, which seems to resolve the overfitting issue, but the accuracy is very erratic and is on average worse than the Baseline model's.

My question is the following: is my intuition correct that both models' results show overfitting, and should I continue my efforts to reduce this effect?

Thanks in advance!

Model Architectures are below:

Model: "baseline"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_6 (Dense)              (884, 800)                603200    
_________________________________________________________________
dense_7 (Dense)              (884, 800)                640800    
_________________________________________________________________
dense_8 (Dense)              (884, 4)                  3204      
=================================================================
Total params: 1,247,204
Trainable params: 1,247,204
Non-trainable params: 0
_________________________________________________________________

Model: "baseline_reg"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_9 (Dense)              (884, 800)                603200    
_________________________________________________________________
dropout (Dropout)            (884, 800)                0         
_________________________________________________________________
dense_10 (Dense)             (884, 800)                640800    
_________________________________________________________________
dense_11 (Dense)             (884, 4)                  3204      
=================================================================
Total params: 1,247,204
Trainable params: 1,247,204
Non-trainable params: 0
_________________________________________________________________
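As a sanity check on the summaries above, a Keras `Dense` layer has `inputs × units + units` parameters (weights plus biases). Inverting the first layer's count suggests the drug-combination input vector has 753 features; this is an inference from the param counts, not something stated in the post:

```python
def dense_params(n_in, n_units):
    # Fully connected layer: one weight per (input, unit) pair, plus one bias per unit.
    return n_in * n_units + n_units

# Invert the first layer's param count (603,200) to recover the input dimension.
input_dim = (603200 - 800) // 800

assert dense_params(input_dim, 800) == 603200  # dense_6 / dense_9
assert dense_params(800, 800) == 640800        # dense_7 / dense_10
assert dense_params(800, 4) == 3204            # dense_8 / dense_11

# Dropout adds no parameters, so both models share the same total.
total = 603200 + 640800 + 3204
print(input_dim, total)  # -> 753 1247204
```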


Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange