Other than the point raised by nouiz, P should not be declared as a T.vector, because it will be the result of a computation on your vector of values.
Also, to compute something like entropy, you do not need to use Scan. Scan introduces a computation overhead, so it should only be used when there is no other way of computing what you want, or to reduce memory usage. You can take an approach like this:
import theano
import theano.tensor as T

values = T.vector('values')
nb_values = values.shape[0]
# For every element in 'values', obtain the total number of times
# its value occurs in 'values'.
# NOTE : I've done the broadcasting a bit more explicitly than
# needed, for clarity.
freqs = T.eq(values[:,None], values[None, :]).sum(0).astype("float32")
# Compute a vector containing, for every value in 'values', the
# probability of that value in the vector 'values'.
# NOTE : these probabilities do *not* sum to 1 because they do not
# correspond to the probability of every element in the vector 'values'
# but to the probability of every value in 'values'. For instance, if
# 'values' is [1, 1, 0] then 'probs' will be [2/3, 2/3, 1/3] because the
# value 1 has probability 2/3 and the value 0 has probability 1/3 in
# 'values'.
probs = freqs / nb_values
entropy = -T.sum(T.log2(probs) / nb_values)
fct = theano.function([values], entropy)
# Will output 0.918296...
print(fct([0, 1, 1]))
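If you want to sanity-check the result without Theano, here is a minimal plain-Python sketch of the same quantity. The function name `entropy_bits` is just something I made up for illustration, and it assumes Python 3 (true division). It sums over the distinct values instead of over every element. The two forms agree because a value that occurs c times out of n contributes log2(p) / n exactly c times in the element-wise sum above, which adds up to p * log2(p) with p = c / n.

```python
import math
from collections import Counter


def entropy_bits(values):
    """Shannon entropy (base 2) of the empirical distribution of `values`.

    Hypothetical helper for cross-checking; not part of Theano.
    """
    n = len(values)
    counts = Counter(values)  # maps each distinct value to its count
    # Sum over the *distinct* values: each contributes -p * log2(p).
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


print(entropy_bits([0, 1, 1]))
```

For [0, 1, 1] this should agree with the Theano function above up to floating-point precision.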