Question

Please show me how I could effectively parse names and store them into memory from text like this:

SMITH          1.006  1.006      1
JOHNSON        0.810  1.816      2
WILLIAMS       0.699  2.515      3
JONES          0.621  3.136      4
BROWN          0.621  3.757      5
DAVIS          0.480  4.237      6
MILLER         0.424  4.660      7
...

This text file contains more than 80K lines. I only need the names, so that I can choose one at random. The source file can be found here: dist.all.last


Solution

The lines are whitespace-separated, so simply loop over the file and use .split():

with open('dist.all.last') as inputfile:
    names = [line.split()[0] for line in inputfile if line.strip()]
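As a minimal sketch of the "store in memory, then choose" approach: once the names are in a list, `random.choice` can draw from it as often as needed. The helper name `parse_names` and the inline sample lines are assumptions made for illustration; with the real file, the open file object would be passed in instead.

```python
import random


def parse_names(lines):
    """Return the first whitespace-separated field of every non-blank line."""
    return [line.split()[0] for line in lines if line.strip()]


# Inline sample standing in for the real file (assumption for illustration).
sample = [
    "SMITH          1.006  1.006      1\n",
    "JOHNSON        0.810  1.816      2\n",
    "\n",
    "WILLIAMS       0.699  2.515      3\n",
]

names = parse_names(sample)
print(names)                 # ['SMITH', 'JOHNSON', 'WILLIAMS']
print(random.choice(names))  # one of the names above, picked at random
```

With the actual file this would be called as `parse_names(inputfile)` inside the `with open(...)` block. A list of about 80K short strings is small, so keeping it in memory is reasonable when many random picks are needed.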

If you only need to pick one name at random, without loading all the names into memory, you could use:

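import random

with open('dist.all.last') as inputfile:
    name = None
    count = 0  # number of non-blank lines seen so far
    for line in inputfile:
        if not line.strip():
            continue  # skip blank lines so they cannot skew the selection
        # replace the current pick with probability 1 / (count + 1)
        if random.randint(0, count) == 0:
            name = line.split()[0]
        count += 1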

which makes a selection without keeping more than one line at a time in memory.
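The single-pass selection above is a form of reservoir sampling. If several distinct lines are needed rather than one, the same idea extends to a reservoir of size k. The following is a sketch under stated assumptions, not part of the original answer: the function name `sample_names` and the inline sample data are hypothetical.

```python
import random


def sample_names(lines, k):
    """Pick up to k names from an iterable of lines in a single pass.

    Only the k currently selected names are kept in memory.
    """
    reservoir = []
    seen = 0  # number of non-blank lines processed so far
    for line in lines:
        if not line.strip():
            continue  # blank lines are not counted
        name = line.split()[0]
        seen += 1
        if len(reservoir) < k:
            # Fill the reservoir with the first k names.
            reservoir.append(name)
        else:
            # Keep the new name with probability k / seen,
            # overwriting a randomly chosen slot.
            j = random.randrange(seen)
            if j < k:
                reservoir[j] = name
    return reservoir


# Inline sample standing in for the real file (assumption for illustration).
sample = [
    "SMITH          1.006  1.006      1\n",
    "JOHNSON        0.810  1.816      2\n",
    "WILLIAMS       0.699  2.515      3\n",
    "JONES          0.621  3.136      4\n",
]

picked = sample_names(sample, 2)
print(picked)  # two different names from the sample
```

When k is at least the number of non-blank lines, the function simply returns all names in file order.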

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow