How do we implement a custom loss that backpropagates with PyTorch?
11-12-2020
Question
In a neural network written in PyTorch, we have defined this custom loss, which should replicate the behavior of the cross-entropy loss:
    def my_loss(output, target):
        global classes
        v = torch.empty(batchSize)
        xi = torch.empty(batchSize)
        for j in range(0, batchSize):
            v[j] = 0
            for k in range(0, len(classes)):
                v[j] += math.exp(output[j][k])
        for j in range(0, batchSize):
            xi[j] = -math.log( math.exp( output[j][target[j]] ) / v[j] )
        loss = torch.mean(xi)
        print(loss)
        loss.requires_grad = True
        return loss
but it doesn't converge to acceptable accuracies.
Solution
You should only use PyTorch's implementations of math functions; otherwise, torch does not know how to differentiate through them. Replace math.exp with torch.exp and math.log with torch.log.
Also, use vectorised operations instead of loops wherever you can, because this will be much faster.
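To make that concrete, here is one possible vectorised rewrite of the loss above using only torch functions; torch.logsumexp replaces the explicit exp/sum loop, so autograd can differentiate through every step:

```python
import torch

def my_loss(output, target):
    # log p(class) = logit - logsumexp(logits), computed per row
    log_probs = output - torch.logsumexp(output, dim=1, keepdim=True)
    # pick the log-probability of the correct class for each sample
    nll = -log_probs[torch.arange(output.shape[0]), target]
    # average over the batch; no requires_grad hack needed, the graph
    # is built automatically because only torch ops were used
    return nll.mean()
```

Note that there is no need to set loss.requires_grad manually: because the result is computed entirely from torch operations on the network output, it is already part of the autograd graph.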
Finally, as far as I can see, you are merely reimplementing the cross-entropy loss in PyTorch. Is there a reason you don't use one of the built-in implementations, such as torch.nn.CrossEntropyLoss or torch.nn.functional.cross_entropy? (see here or here)
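For reference, a minimal sketch of using the built-in loss; the batch size and class count below are made up for illustration:

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()  # expects raw logits and class indices

output = torch.randn(8, 5, requires_grad=True)  # batch of 8, 5 classes
target = torch.randint(0, 5, (8,))              # integer class labels

loss = criterion(output, target)
loss.backward()  # gradients flow back into output automatically
```

nn.CrossEntropyLoss combines log-softmax and negative log-likelihood internally, so it should be passed raw logits, not probabilities.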
[EDIT]: If, after removing the math operations and implementing a vectorised version of the loss, it still does not converge, here are a few pointers on how to debug it:
- Check that the loss is correct by computing its value manually and comparing it with what the function outputs
- Compute the gradient manually and check that it matches the values autograd stores in the .grad attribute of the input tensor after running loss.backward() (more info here)
- Monitor the loss and the gradients over a few iterations to check that everything goes right during training
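The first two checks above can be sketched like this, using the standard cross-entropy gradient (softmax minus one-hot, divided by the batch size); the two-sample, three-class batch is made up for illustration:

```python
import torch
import torch.nn.functional as F

output = torch.tensor([[2.0, 1.0, 0.1],
                       [0.5, 2.5, 0.3]], requires_grad=True)
target = torch.tensor([0, 1])

loss = F.cross_entropy(output, target)

# Check 1: recompute the loss by hand as the mean of
# -log softmax at the correct class, and compare
manual = -(F.log_softmax(output, dim=1)[torch.arange(2), target]).mean()
print(loss.item(), manual.item())  # should agree

# Check 2: compare autograd's gradient with the analytic formula
# d loss / d logits = (softmax - one_hot) / batch_size
loss.backward()
expected_grad = (F.softmax(output, dim=1)
                 - F.one_hot(target, num_classes=3).float()) / 2
print(torch.allclose(output.grad, expected_grad, atol=1e-6))
```

If either check fails, the bug is in the loss itself; if both pass but training still diverges, the problem is more likely the learning rate, data, or model.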