Basic DNN with highly imbalanced dataset — network predicts same labels [closed]
31-10-2019
Question
I will try to explain my issue at a high level, and I hope to get a better understanding of the ML behind it. I am working with aggregated features extracted from audio files, so each feature vector has shape (1xN). The output is a single sentiment label: Positive, Neutral, or Negative. I mapped these to 2, 1, and 0 respectively (the labels are discrete by design, but maybe I could make them continuous?)
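As a small illustration of the mapping I described (the variable and label names here are my own, not from my actual code):

```python
# Map the sentiment labels to integer class ids as described above.
LABEL_TO_ID = {"Negative": 0, "Neutral": 1, "Positive": 2}

labels = ["Neutral", "Positive", "Neutral", "Negative"]
targets = [LABEL_TO_ID[label] for label in labels]  # → [1, 2, 1, 0]
```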
The dataset I am using is 90% neutral, 6% negative, and 4% positive, and I split it into train/dev/test sets. I wrote a basic DNN in PyTorch and have been training it with CrossEntropyLoss and SGD (with Nesterov momentum). The issue I am running into is that the network, after seeing only ~10% of the data, starts to predict only neutral labels. The output logits converge to something like
tensor([[-0.9255],
        [ 1.9352],
        [-1.1473]])
no matter what 1xN feature vectors you feed in. I would appreciate guidance on how to address this issue. For reference, the architecture is
DNNModel(
  (in_layer): Linear(in_features=89, out_features=1024, bias=True)
  (fcs): Sequential(
    (0): Linear(in_features=1024, out_features=512, bias=True)
    (1): Linear(in_features=512, out_features=256, bias=True)
    (2): Linear(in_features=256, out_features=128, bias=True)
  )
  (out_layer): Sequential(
    (0): SequenceWise(
      Linear(in_features=128, out_features=3, bias=True)
    )
  )
)
def forward(self, x):
    x = F.relu(self.in_layer(x))
    for fc in self.fcs:
        x = F.relu(fc(x))
    x = self.out_layer(x)
    return x
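A quick sanity check on the converged logits shown earlier (assuming the three rows correspond to labels 0/1/2, i.e. negative/neutral/positive per my mapping):

```python
import math

# Converged logits from above, in label order 0 (negative), 1 (neutral), 2 (positive).
logits = [-0.9255, 1.9352, -1.1473]

# Softmax over the logits.
exps = [math.exp(z) for z in logits]
probs = [e / sum(exps) for e in exps]
# probs ≈ [0.05, 0.91, 0.04] -- roughly the 6% / 90% / 4% class priors,
# i.e. the network has collapsed to predicting the label distribution.
```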
I am not sure the network architecture actually makes sense -- could the issue be the ReLUs between the hidden layers, or the biases? Or something else?
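For completeness, here is a minimal sketch of the training setup I described (the small stand-in model and the inverse-frequency class weights are illustrative assumptions, not my actual code):

```python
import torch
import torch.nn as nn

# Stand-in for the DNN above; 89 input features as in the printed architecture.
model = nn.Sequential(
    nn.Linear(89, 128),
    nn.ReLU(),
    nn.Linear(128, 3),
)

# Class frequencies from the question: negative 6%, neutral 90%, positive 4%.
# Inverse-frequency class weights are one common mitigation for imbalance
# (an assumption here, not something the question's code does).
freqs = torch.tensor([0.06, 0.90, 0.04])
weights = (1.0 / freqs) / (1.0 / freqs).sum()

criterion = nn.CrossEntropyLoss(weight=weights)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9, nesterov=True)

# One illustrative training step on random data.
x = torch.randn(8, 89)
y = torch.randint(0, 3, (8,))
optimizer.zero_grad()
loss = criterion(model(x), y)
loss.backward()
optimizer.step()
```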
Reposted from Stack Overflow here, since this forum is more appropriate: link