Apokalyptik's answer is the closest to what you want. Readers are streams, so you can't just hop to a random place per se.
Naively picking a fixed probability of keeping each line as you read it can lead to problems: you may reach the end of the file without having kept enough lines, or you may keep lines too quickly and not get a representative sample. Either outcome is much more likely than guessing the right probability, since you don't know beforehand how many lines are in the file (unless you first iterate over it once to count them).
What you really need is reservoir sampling.
Basically, read the file line by line, and for each line decide whether to hold it, like so. You hold the first line you read with probability 1/1. When you read the second line, you replace what you're holding with it with probability 1/2. When you read the third line, you replace what you're holding with probability 1/3, so the chance that a line you were already holding survives is 1/2 * 2/3 = 1/3. Thus every line has a 1/N chance of being the one you end up holding, where N is the number of lines you've read so far. Here's a more detailed look at the algorithm (don't try to implement it just from what I've told you in this paragraph alone).
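To make the idea concrete, here is a minimal sketch in Python. The language is my choice for illustration, not something taken from the question. It generalizes the one-line case described above to holding k lines (commonly called Algorithm R); with k=1 it reduces to the 1/N rule. The names reservoir_sample, k, and rng are made up for this example.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(
    items: Iterable[T], k: int = 1, rng: Optional[random.Random] = None
) -> List[T]:
    """Return up to k items chosen uniformly at random from a stream.

    Makes a single pass and never needs to know the stream's length.
    (Hypothetical helper for illustration; not from any library.)
    """
    rng = rng or random.Random()
    reservoir: List[T] = []
    for n, item in enumerate(items, start=1):
        if n <= k:
            # The first k items always go straight into the reservoir.
            reservoir.append(item)
        else:
            # Keep the n-th item with probability k/n by drawing a slot
            # uniformly from [0, n) and replacing only if it falls in [0, k).
            j = rng.randrange(n)
            if j < k:
                reservoir[j] = item
    return reservoir


# Example use (the file name is an assumption):
#
#     with open("input.txt") as f:
#         sample = reservoir_sample(f, k=5)
```

If the stream has fewer than k lines, this version simply returns all of them, which may or may not be what you want.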