Question

I am new to PyTorch. I was trying out the following network architecture to train a multi-class classifier. I used softmax at the output layer and cross-entropy as the loss function. However, the output doesn’t look like probabilities. For example, one of the outputs looks like this: [2.0032e-10, 1.798e-8, … 1.0000e+0, … 2.112e-4]. My question is: how can one of them be 1 when their sum has to equal 1?

```python
import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):

    def __init__(self):
        super(Net, self).__init__()
        # 13 features in input layer
        self.fc1 = nn.Linear(13, 512)
        self.fc2 = nn.Linear(512, 512)
        self.fc3 = nn.Linear(512, 512)
        self.fc4 = nn.Linear(512, 512)
        # 40 classes in output layer
        self.fc5 = nn.Linear(512, 40)
        self.bn1 = nn.BatchNorm1d(512)
        self.bn2 = nn.BatchNorm1d(512)
        self.bn3 = nn.BatchNorm1d(512)
        self.bn4 = nn.BatchNorm1d(512)

    def forward(self, x):
        x = self.bn1(F.relu(self.fc1(x)))
        x = self.bn2(F.relu(self.fc2(x)))
        x = self.bn3(F.relu(self.fc3(x)))
        x = self.bn4(F.relu(self.fc4(x)))
        # Softmax over the class dimension (the layer this question is about)
        x = nn.Softmax(dim=1)(self.fc5(x))
        return x
```

If softmax is removed, the outputs are no longer between 0 and 1 and include negative values, e.g. [-12.098, 2.0988, -12.121…, 0.87, 0.21]. But I need probabilities for each of the classes. How can I get them, with or without softmax?

Solution

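Solved. I removed softmax from the network architecture, because PyTorch's cross-entropy loss (nn.CrossEntropyLoss) already applies log-softmax internally and expects raw, unnormalized scores (logits). To get probabilities, I now pass the predicted values through softmax separately, outside the network. Thank you.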
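For anyone landing here later, this is roughly what the fix looks like in code. It is only a minimal sketch, not the exact network from the question: `LogitNet`, its smaller layer sizes, the random batch, and the SGD settings are all made up for illustration. The point is that the model returns logits, `nn.CrossEntropyLoss` consumes them directly during training, and softmax is applied only when probabilities are needed.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LogitNet(nn.Module):
    """Hypothetical, smaller stand-in for the question's Net; returns raw logits."""

    def __init__(self, n_features=13, n_hidden=64, n_classes=40):
        super().__init__()
        self.fc1 = nn.Linear(n_features, n_hidden)
        self.bn1 = nn.BatchNorm1d(n_hidden)
        self.fc2 = nn.Linear(n_hidden, n_classes)

    def forward(self, x):
        x = self.bn1(F.relu(self.fc1(x)))
        return self.fc2(x)  # no softmax here: values may be negative or > 1


torch.manual_seed(0)
model = LogitNet()
criterion = nn.CrossEntropyLoss()  # applies log-softmax to the logits itself
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Made-up batch: 8 samples, 13 features, integer class labels in [0, 40)
inputs = torch.randn(8, 13)
targets = torch.randint(0, 40, (8,))

# Training step: the loss receives the raw logits
model.train()
optimizer.zero_grad()
loss = criterion(model(inputs), targets)
loss.backward()
optimizer.step()

# Inference: apply softmax outside the network to turn logits into probabilities
model.eval()
with torch.no_grad():
    probs = F.softmax(model(inputs), dim=1)

row_sums = probs.sum(dim=1)  # one sum per sample
```

Each row of `probs` is non-negative and sums to 1 up to floating-point error, so it can be read as per-class probabilities, while the values passed to the loss stay unnormalized.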

Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange