Collision entropy definition
29-09-2020
Question
The collision entropy is defined as the Rényi entropy for the case $\alpha = 2$. It is given by
$$\mathrm{H}_{2}(X)=-\log \sum_{i=1}^{n} p_{i}^{2} \tag{1}$$
Take two independent random variables $X$ and $X'$ which follow the same probability distribution. The probability of a collision, i.e. $\Pr[X = X']$, is then simply $P_{\rm coll} = \sum_{i=1}^{n} p_{i}^{2}$. I would then expect that we say the collision entropy is just $H(P_{\rm coll})$ i.e.
$$-\left(\sum_{i=1}^{n} p_{i}^{2}\right)\log\left(\sum_{i=1}^{n} p_{i}^{2}\right) - \left(1 -\sum_{i=1}^{n} p_{i}^{2}\right)\log\left(1-\sum_{i=1}^{n} p_{i}^{2}\right)$$
This is in analogy with the binary entropy but with the probability replaced with the probability of a collision.
What is the motivation behind choosing $(1)$ to be the definition of collision entropy?
Solution
When $X$ is distributed uniformly over a domain of size $n$ (so each $p_i = 1/n$ and $\sum_{i} p_i^2 = 1/n$), the Shannon entropy, collision entropy and min-entropy are all equal to $\log n$. In this sense, all of these parameters measure the amount of uncertainty in $X$.
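A quick numeric sketch of this agreement (the function names here are my own, not standard library calls): for the uniform distribution, all three entropy measures evaluate to $\log_2 n$ bits.

```python
import math

def shannon(p):
    # Shannon entropy: H(X) = -sum_i p_i log2 p_i
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def collision_entropy(p):
    # Renyi entropy of order 2: H_2(X) = -log2 sum_i p_i^2
    return -math.log2(sum(pi ** 2 for pi in p))

def min_entropy(p):
    # Min-entropy: H_inf(X) = -log2 max_i p_i
    return -math.log2(max(p))

n = 8
uniform = [1 / n] * n
# All three coincide at log2(8) = 3 bits for the uniform distribution.
print(shannon(uniform), collision_entropy(uniform), min_entropy(uniform))
```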
In contrast, your proposed definition is a binary entropy, so it is always between $0$ and $1$, and it tends to zero as $X$ becomes more unpredictable. This is the opposite of the usual behaviour of entropy measures, where zero stands for perfectly predictable.
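To illustrate the divergence, here is a small sketch (again with ad-hoc helper names) comparing the two quantities on uniform distributions of growing size: $H_2$ grows like $\log_2 n$, while the binary-entropy proposal shrinks toward $0$.

```python
import math

def collision_entropy(p):
    # Renyi entropy of order 2: H_2(X) = -log2 sum_i p_i^2
    return -math.log2(sum(pi ** 2 for pi in p))

def proposed_quantity(p):
    # Binary entropy of the collision probability q = sum_i p_i^2,
    # i.e. the quantity suggested in the question (in bits).
    q = sum(pi ** 2 for pi in p)
    return -(q * math.log2(q) + (1 - q) * math.log2(1 - q))

for n in (2, 16, 1024):
    uniform = [1 / n] * n
    # collision_entropy grows with n; proposed_quantity tends to 0
    print(n, collision_entropy(uniform), proposed_quantity(uniform))
```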