What would be a good loss function to penalize big differences and reward small ones, but not in a linear way?

datascience.stackexchange https://datascience.stackexchange.com/questions/67692

  •  08-12-2020

Question

I have an image showing the differences between two other images. Concentrations of black pixels mark regions where the two images are similar, whereas white pixels highlight differences. On the left: the difference from the generation process. On the right: the ground truth.

Thus I want a function to provide rewards when the pixels are the same (or relatively similar - a term to weight this "similarity threshold" would be nice) and penalize when the differences are bigger (penalizing more as the differences grow).

A differentiable function is much appreciated.

So in the context of machine learning and this loss function being a way to help train a generator, what kind of function do you recommend or can come up with?

Remember, the idea is to reward similarities and penalize differences (such that "really different" equates to a bigger loss than "slightly off" or "different").

Thanks in advance to you all!

Solution

Square loss (MSE or SSE) does this. Let $y_i$ be an actual value and $\hat{y}_i$ be its estimated value (prediction).

$$SSE = \sum (y_i -\hat{y}_i)^2$$

$$MSE=\dfrac{SSE}{n}$$

Apart from floating-point effects, both are minimized at the same parameter values of your neural network, because dividing by the constant $n$ rescales the loss without moving its minimum.
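
As a minimal sketch of the two formulas above, assuming the pixel values arrive as flat Python lists (the names `sse`, `mse`, `actual` and `predicted` are illustrative and not taken from any library, and the numbers are made up):

```python
def sse(actual, predicted):
    """Sum of squared errors between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    return sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted))


def mse(actual, predicted):
    """Mean squared error: SSE divided by the number of values n."""
    return sse(actual, predicted) / len(actual)


# Hypothetical pixel intensities, invented for illustration only.
actual = [0.0, 1.0, 2.0, 3.0]
predicted = [0.0, 2.0, 2.0, 1.0]

print(sse(actual, predicted))  # errors 0, -1, 0, 2 -> 0 + 1 + 0 + 4 = 5.0
print(mse(actual, predicted))  # 5.0 / 4 = 1.25
```

When training a generator you would normally use your framework's built-in version (for example `torch.nn.MSELoss` in PyTorch) so that gradients are computed for you; the plain-Python version here is only meant to show the arithmetic.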

The squaring is critical. If a prediction is off by 1 unit, it incurs one unit of loss. If the prediction is off by 2, instead of incurring 2 units of loss, there are 4 units of loss—much worse than being off by 1. If the prediction is off by 3, wow—9 units of loss!
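
To make that arithmetic concrete, here is a small sketch that prints the squared loss next to the plain absolute error for a few error sizes, along with the derivative of the squared loss with respect to the error. The helper names are illustrative assumptions, not part of the original answer.

```python
def squared_loss(error):
    """Loss for a single prediction that is off by `error` units."""
    return error ** 2


def squared_loss_gradient(error):
    """Derivative of error**2 with respect to the error: 2 * error."""
    return 2 * error


# Columns: error, absolute error, squared loss, gradient of squared loss.
for error in (0.5, 1, 2, 3):
    print(error, abs(error), squared_loss(error), squared_loss_gradient(error))
# 0.5 0.5 0.25 1.0
# 1 1 1 2
# 2 2 4 4
# 3 3 9 6
```

Errors below 1 are shrunk by squaring (0.5 becomes 0.25) while errors above 1 are amplified, and the gradient grows in proportion to the error, so large differences pull on the parameters harder than small ones. The function is differentiable everywhere, including at zero error.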

(If you look at statistics literature or some Cross Validated posts, you may see $n-p$ in the denominator of MSE, where $p$ is the number of parameters in the regression equation. This does not change the optimal value but does have some advantages in linear regression, chiefly that it is an unbiased estimate of the error variance under common assumptions for linear regression that you are unlikely to make in a neural network problem.)

Licensed under: CC-BY-SA with attribution