Cost greater than 1, is there an error?
16-10-2019
Problem
I'm computing cost in the following way:
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=y, labels=y_)
cost = tf.reduce_mean(cross_entropy)
For the first cost I get 0.693147, which is what I would expect for binary classification when the parameters/weights are initialized to 0 (it is just ln 2; see the quick check below).
I am using one-hot labels.
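For reference, here is a minimal numpy check (my own sketch, separate from the training code) of why 0.693147 is the expected starting value: with all weights at 0 the logits are equal, the softmax outputs 0.5 per class, and the cross entropy of a one-hot label against [0.5, 0.5] is -ln(0.5) = ln 2.
import numpy as np
# With zero weights every logit is equal, so the softmax output is uniform.
uniform_pred = np.array([0.5, 0.5])
one_hot_label = np.array([1.0, 0.0])
# Softmax cross entropy with a one-hot label: -sum(label * log(pred)) = -log(0.5)
print(-np.sum(one_hot_label * np.log(uniform_pred)))  # 0.6931471805599453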
However, after completing one training epoch with stochastic gradient descent, I get a cost greater than 1.
Is this to be expected?
Solution
The following piece of code does essentially what TF's softmax_cross_entropy_with_logits function does (crossentropy on softmaxed y_ and y):
import numpy as np

def softmax(x):
    # Numerically stable softmax: shift by the max before exponentiating.
    e_x = np.exp(x - np.max(x))
    return e_x / e_x.sum(axis=0)

def crossentropy(true, pred):
    # Logarithmic loss, with predictions clipped away from exactly 0 and 1.
    epsilon = 1e-15
    pred = np.clip(pred, epsilon, 1 - epsilon)
    ll = -np.sum(true * np.log(pred) +
                 (1 - true) * np.log(1 - pred)) / len(true)
    return ll
==
true = [1., 0.]
pred = [5.0, 0.5]
true = softmax(true)
pred = softmax(pred)
print(true)
print(pred)
print(crossentropy(true, pred))
==
[ 0.73105858 0.26894142]
[ 0.98901306 0.01098694]
1.22128414101
As you can see, there is no reason why the crossentropy on a binary classification problem cannot be > 1, and it is not hard to come up with such an example.
** The crossentropy above is calculated as in https://www.kaggle.com/wiki/LogarithmicLoss, the softmax as in https://en.wikipedia.org/wiki/Softmax_function
UPD: there is a great explanation of what it means when logloss is > 1 at SO: https://stackoverflow.com/a/35015188/1166478
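For extra intuition (a sketch of my own, along the lines of the linked SO answer): with a one-hot label the per-example softmax cross entropy is simply -log(p_true), the negative log of the probability assigned to the correct class, so it exceeds 1 as soon as that probability drops below 1/e ≈ 0.368.
import numpy as np
# Per-example cross entropy with a one-hot label is -log(p_true).
for p_true in [0.5, 1 / np.e, 0.2]:
    print(round(p_true, 3), -np.log(p_true))
# 0.5   -> 0.693...  (the starting cost with zero weights)
# 0.368 -> ~1.0      (the boundary: p_true = 1/e)
# 0.2   -> 1.609...  (> 1, yet still a perfectly valid cost)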