How do we implement a custom loss that backpropagates with PyTorch?
11-12-2020
Question
In a neural network written in PyTorch, we have defined this custom loss, which should replicate the behavior of the cross-entropy loss:
    def my_loss(output, target):
        global classes
        v = torch.empty(batchSize)
        xi = torch.empty(batchSize)
        for j in range(0, batchSize):
            v[j] = 0
            for k in range(0, len(classes)):
                v[j] += math.exp(output[j][k])
        for j in range(0, batchSize):
            xi[j] = -math.log( math.exp( output[j][target[j]] ) / v[j] )
        loss = torch.mean(xi)
        print(loss)
        loss.requires_grad = True
        return loss
but it doesn't converge to acceptable accuracies.
Solution
You should only use PyTorch's implementations of math functions; otherwise, torch does not know how to differentiate through them. Replace math.exp with torch.exp and math.log with torch.log.
Also, use vectorised operations instead of loops wherever you can, because this will be much faster.
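To make that concrete, here is one possible vectorised rewrite of the loss above using only torch functions; torch.logsumexp replaces the explicit exp/sum loop, so autograd can differentiate through every step:

```python
import torch

def my_loss(output, target):
    # log p(class) = logit - logsumexp(logits), computed per row
    log_probs = output - torch.logsumexp(output, dim=1, keepdim=True)
    # pick the log-probability of the correct class for each sample
    nll = -log_probs[torch.arange(output.shape[0]), target]
    # average over the batch; no requires_grad hack needed, the graph
    # is built automatically because only torch ops were used
    return nll.mean()
```

Note that there is no need to set loss.requires_grad manually: because the result is computed entirely from torch operations on the network output, it is already part of the autograd graph.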
Finally, as far as I can see, you are merely reimplementing the cross-entropy loss in PyTorch. Is there a reason you don't use one of the built-in implementations, such as torch.nn.CrossEntropyLoss or torch.nn.functional.cross_entropy? (see here or here)
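For reference, a minimal sketch of using the built-in loss; the batch size and class count below are made up for illustration:

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()  # expects raw logits and class indices

output = torch.randn(8, 5, requires_grad=True)  # batch of 8, 5 classes
target = torch.randint(0, 5, (8,))              # integer class labels

loss = criterion(output, target)
loss.backward()  # gradients flow back into output automatically
```

nn.CrossEntropyLoss combines log-softmax and negative log-likelihood internally, so it should be passed raw logits, not probabilities.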
[EDIT]: If, after removing the math operations and implementing a vectorised version of the loss, it still does not converge, here are a few pointers on how to debug it:
- Check that the loss is correct by computing its value manually and comparing it with what the function outputs
- Compute the gradient manually and check that it matches the values autograd stores in the .grad attribute of the input tensor after running loss.backward() (more info here)
- Monitor the loss and the gradients over a few iterations to check that everything goes right during training
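The first two checks above can be sketched like this, using the standard cross-entropy gradient (softmax minus one-hot, divided by the batch size); the two-sample, three-class batch is made up for illustration:

```python
import torch
import torch.nn.functional as F

output = torch.tensor([[2.0, 1.0, 0.1],
                       [0.5, 2.5, 0.3]], requires_grad=True)
target = torch.tensor([0, 1])

loss = F.cross_entropy(output, target)

# Check 1: recompute the loss by hand as the mean of
# -log softmax at the correct class, and compare
manual = -(F.log_softmax(output, dim=1)[torch.arange(2), target]).mean()
print(loss.item(), manual.item())  # should agree

# Check 2: compare autograd's gradient with the analytic formula
# d loss / d logits = (softmax - one_hot) / batch_size
loss.backward()
expected_grad = (F.softmax(output, dim=1)
                 - F.one_hot(target, num_classes=3).float()) / 2
print(torch.allclose(output.grad, expected_grad, atol=1e-6))
```

If either check fails, the bug is in the loss itself; if both pass but training still diverges, the problem is more likely the learning rate, data, or model.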