Question

I have a question about applying the MSE loss in a neural network.
Loss function: $\text{MSE} = \frac{1}{2} \sum_{i=1}^{n} (Y_i - \hat{Y_i})^2$
I am wondering what the $\sum_{i=1}^{n}$ stands for.

  1. Do I sum over the loss of all training examples for each output node in my Neural Network?
  2. Or do I use a single training example and sum over all Neural network output nodes?
  3. Or do I do both, and sum over all training examples and over all output nodes?

I want to use the MSE loss later for updating the weights in my neural network. What would I do for that?


Solution

I think it depends on what you're doing.

If you want the MSE of just one training example, then you do number 2: sum over all output nodes of that single example.
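As a minimal sketch of option 2 (the values and NumPy usage are my own illustration, not from the question), the loss of a single example with three output nodes would be:

```python
import numpy as np

# Hypothetical targets and predictions for ONE training example
# with three output nodes (all values are made up for illustration).
y = np.array([1.0, 0.0, 0.5])      # targets Y_i
y_hat = np.array([0.8, 0.1, 0.4])  # network outputs Y_hat_i

# Option 2: sum the squared errors over the output nodes of this example,
# with the 1/2 factor from the formula above.
loss = 0.5 * np.sum((y - y_hat) ** 2)
print(loss)  # roughly 0.03
```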

If you want to see whether training is stable or converging, you can compute the MSE of each example independently and then average those values, which gives you a sense of the overall behaviour.
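To illustrate that averaging (again a sketch with made-up numbers), the per-example losses from option 2 can be averaged over a whole batch:

```python
import numpy as np

# Hypothetical batch: 4 training examples, 3 output nodes each
# (values are made up for illustration).
Y = np.array([[1.0, 0.0, 0.5],
              [0.0, 1.0, 0.0],
              [0.5, 0.5, 0.5],
              [1.0, 1.0, 0.0]])
Y_hat = Y + 0.1  # pretend every prediction is off by 0.1

# Per-example loss: sum the squared errors over output nodes (axis 1).
per_example = 0.5 * np.sum((Y - Y_hat) ** 2, axis=1)

# Average over the batch to track overall training behaviour.
mean_loss = per_example.mean()
```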
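The question also asks how the loss feeds into weight updates. As a hedged sketch for a single linear layer (the layer, learning rate, and values are my assumptions, not from the question): the $\frac{1}{2}$ in front of the sum cancels the 2 that appears when differentiating, so the gradient of the loss with respect to the outputs is simply $\hat{Y} - Y$, which is then propagated back to the weights:

```python
import numpy as np

# Minimal sketch of one gradient-descent step on a single linear layer,
# using the 1/2-scaled MSE of one example. All values are illustrative.
x = np.array([0.5, -0.2])        # input features
W = np.array([[0.1, 0.3],        # 3 output nodes x 2 inputs
              [0.2, -0.1],
              [0.0, 0.4]])
y = np.array([1.0, 0.0, 0.5])    # targets

y_hat = W @ x                    # forward pass
# With L = 1/2 * sum_i (y_i - y_hat_i)^2, we get dL/dy_hat = (y_hat - y):
# the 1/2 cancels the 2 produced by differentiation.
grad_W = np.outer(y_hat - y, x)  # chain rule: dL/dW
W -= 0.1 * grad_W                # gradient-descent update, learning rate 0.1
```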

Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange