isolating parameters from a file/list

https://stackoverflow.com/questions/13215968

30-07-2021
Question

I am trying to write a function that opens a file with a list of names, cities, and numbers, and takes an average of the numbers by city.

I have so far something like:

numbers = 0
count = 0
n = 0
while n < len(file):
    for item in file:
        if item.split(' ')[-2] == city:
            count += 1
            numbers += float(item.split(' ')[-1])
            n += 1
        else:
            n += 1
    return numbers / count

Where [-2] is the position of the city, [-1] is the position of the number. Assuming the file is already open.

My code runs through the whole file, and returns only whatever is on the last line. So if the last line in the file has someone from London, and I'm trying to average for London, it will just give me that one number; if I'm trying to average for some other city, it will return nothing.

Why does it loop through the whole file without updating my counts, and how do I fix it?

Edit:

Edited the code, and the file looks like:

NAME1     COUNTRY     CITY     NUMBER

on each line.

Solution

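First, item.split(' ') splits on every single space, so a line whose columns are separated by several spaces produces something like this:

['foo', '', '', '', '', 'spam', '', 'foo', '', '666']

With padding like that, [-2] is an empty string rather than the city, so the comparison never matches. Use item.split() with no argument instead: it splits on runs of whitespace and also discards the trailing newline.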
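To make the difference concrete, here is a small sketch comparing the two calls on one line. The sample line and its values are made up for illustration; only the column layout follows the question.

```python
# Hypothetical sample line in the NAME COUNTRY CITY NUMBER layout,
# with several spaces between columns and a trailing newline.
line = "NAME1     UK     London     42.5\n"

by_space = line.split(' ')     # splits on every single space
by_whitespace = line.split()   # splits on runs of whitespace

print(by_space[-2:])        # ['', '42.5\n']
print(by_whitespace[-2:])   # ['London', '42.5']
```

With split(' ') the second-to-last element is an empty string, so a test like item.split(' ')[-2] == city can only succeed when exactly one space separates the last two columns.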

Second, a file object is iterable, so you can loop over its lines directly, without a separate counter or an outer while loop:

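numbers = 0.0
count = 0
with open('city.dat') as f:  # the file is closed when the block ends
    for line in f:
        data = line.split()
        # data[-2] is the city, data[-1] is the number; skip short lines
        if len(data) >= 2 and data[-2] == 'CITYNAME':
            count += 1
            numbers += float(data[-1])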

Third, make sure that 'CITYNAME' actually occurs in your file. If no line matches, count stays 0 and numbers / count raises ZeroDivisionError.
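Putting the three points together, a minimal sketch of the whole function might look like the following. The name average_for_city, the None return for a missing city, and the sample lines are assumptions made for illustration; the sketch also assumes city names contain no spaces, since split() would otherwise break them into several fields.

```python
def average_for_city(lines, city):
    """Return the mean of the last column over rows whose second-to-last
    column equals `city`, or None if no row matches."""
    total = 0.0
    count = 0
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        if fields[-2] == city:
            total += float(fields[-1])
            count += 1
    # The result is computed only after every line has been seen.
    if count == 0:
        return None
    return total / count


# Hypothetical data in the NAME COUNTRY CITY NUMBER layout.
sample = [
    "NAME1     UK     London     10\n",
    "NAME2     FR     Paris      7\n",
    "\n",
    "NAME3     UK     London     20\n",
]
print(average_for_city(sample, "London"))  # 15.0
print(average_for_city(sample, "Berlin"))  # None
```

Because the function accepts any iterable of lines, an open file can be passed in the same way, for example average_for_city(f, "London") inside a with open('city.dat') as f: block.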

To avoid re-reading and re-splitting the file for every query, it is better to parse it once and keep the rows in memory:

# A list can be reused; in Python 3, map() returns a one-shot iterator.
data = [line.split() for line in open('city.dat')]

and filter it if needed:

filtered_cities = [row for row in data if len(row) >= 2 and row[-2] == 'CITYNAME']
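Once the rows are in memory, the averages for all cities can also be computed in a single pass instead of filtering once per city. This is a sketch of one possible approach, not part of the original answer; the function name averages_by_city and the sample rows are hypothetical.

```python
from collections import defaultdict


def averages_by_city(rows):
    """Map each city (second-to-last field) to the mean of its numbers
    (last field). `rows` is an iterable of already-split lines."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        if len(row) < 2:
            continue  # skip blank or malformed rows
        city = row[-2]
        sums[city] += float(row[-1])
        counts[city] += 1
    return {city: sums[city] / counts[city] for city in sums}


# Hypothetical lines in the NAME COUNTRY CITY NUMBER layout.
sample_lines = [
    "NAME1     UK     London     10",
    "NAME2     FR     Paris      7",
    "NAME3     UK     London     20",
]
rows = [line.split() for line in sample_lines]
print(averages_by_city(rows))  # {'London': 15.0, 'Paris': 7.0}
```

A city that never appears is simply absent from the resulting dictionary, so a lookup with .get(city) returns None rather than dividing by zero.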

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow