It is obvious that the first element (indeed, the first n
elements) has to be chosen with the probability of 1
(the "fill it" stage). In order to remain in the final sample, this first element needs then to survive m-n
possible replacements - that's what brings the probability of it being in the sample down to n/m
. On the other hand, the last element only participates in one replacement; it should thus be added to the sample with the probability of n/m
from the start.
For simplicity, suppose that you only need to pick one element, out of m
, using this replacement strategy (note that you don't know up front what m
is: you just iterate until you suddenly reach the end). You take the first element, and keep it with a probability of 1
(for all you know, this is the only element). Then you discover the second element, and you throw a coin and either keep it or discard it with the probability of 1/2
. At this point, each of the first two elements has 1/2
probability to be the chosen one.
Now you see the third element, and you keep it with probability of 1/3
. Each of the first two elements had 1/2
probability of participating in this encounter, and 2/3
of surviving it - for a total of 1/2 * 2/3 == 1/3
probability of still being around. So, again, each of the first three elements has 1/3
probability of being chosen at this point.
The inductive step of proving that, after t
th element is inspected, each of the first t
elements has 1/t
probability of being chosen is left as an exercise for the reader. So is the extension of the proof to a sample of size n > 1
.