Trying to find the average of multiple values in a dictionary

https://stackoverflow.com/questions/23451046

15-07-2023

Question

New to Python here. I'm trying to find the average of the totaled-up values for each key in a dictionary.

I've managed to grab the total of all values for each key, but I'm not certain how to find the average of these new values.

import os

f = open("iris.data", "r")
count = 0
d = {}
# You want the dictionary to have d = {"Iris-setosa": [list of values]}
# Populate dictionary
# Code would work regardless of file
# Reference movie names file

for line in f:
    if line.isspace():
        continue #If a particular line is made of just spaces, ignore it
    strip = line.strip("\n") #Strips out that particular string
    data = strip.split(",")
    name = data[4]

    if name in d:
        for i in range(len(data)-1):
            d[name][i] += float(data[i])
        d[name][4] += 1 #increment count
        count += 1
    else:
        d[name] = [float(i) for i in data[0:4]]
        d[name].append(1) #keep count
        count = 1
print(d)

f.close

Solution

You already store a count with each list of values; simply loop over the dictionary's items (key-value pairs) and divide by the last element of each value list:

for key, values in d.items():
    avg = [v / values[-1] for v in values[:-1]]
    print(key, *avg)

This uses the last element of each values list as the row count, and uses a list comprehension to produce an average value for each of your columns.
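As a minimal sketch of that idea, the snippet below applies the same division to a small hand-made dictionary. The numbers and the helper name `column_averages` are made up for illustration; only the layout (four column totals followed by the row count) is taken from the question.

```python
# Hypothetical totals in the same shape the question builds:
# four summed measurements followed by the row count.
d = {
    "Iris-setosa": [10.0, 7.0, 3.0, 0.5, 2],
    "Iris-virginica": [19.5, 9.0, 16.5, 6.0, 3],
}


def column_averages(totals):
    """Return {name: [average per column]} from {name: [sums..., count]}."""
    averages = {}
    for key, values in totals.items():
        count = values[-1]  # the last element is the row count
        averages[key] = [v / count for v in values[:-1]]
    return averages


averages = column_averages(d)
for key, avg in averages.items():
    print(key, *avg)
```

Returning a new dictionary instead of only printing keeps the totals in `d` untouched, so the averages can be reused later.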

Some other remarks:

You evidently have a CSV file; consider using the csv module instead.
You never called the f.close() method; you merely referenced it. Writing f.close without parentheses does nothing.

However, you should consider using the file object as a context manager instead. The with statement ensures that the file is closed for you when the block is exited:
```
with open("iris.data", "r") as f:
    for line in f:
        line = line.rstrip('\n')
        if not line:
            continue
```
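The two remarks can be combined. The following is a rough sketch, not the only way to do it: it reads rows with `csv.reader` inside a `with` block and keeps the running sums and the row counts in separate dictionaries. The function name `class_averages` and the sample rows are assumptions made for illustration, and an in-memory `io.StringIO` stands in for the real file so that the example runs on its own.

```python
import csv
import io


def class_averages(lines):
    """Average the numeric columns for each class label (last column)."""
    totals = {}  # label -> running sums of the numeric columns
    counts = {}  # label -> number of rows seen for that label
    for row in csv.reader(lines):
        if not row:  # csv.reader yields an empty list for blank lines
            continue
        *numbers, label = row
        numbers = [float(x) for x in numbers]
        if label in totals:
            totals[label] = [t + x for t, x in zip(totals[label], numbers)]
            counts[label] += 1
        else:
            totals[label] = numbers
            counts[label] = 1
    return {
        label: [t / counts[label] for t in sums]
        for label, sums in totals.items()
    }


# Made-up rows standing in for the real file. With a file on disk:
#     with open("iris.data", newline="") as f:
#         result = class_averages(f)
sample = io.StringIO(
    "5.0,3.0,1.0,0.25,Iris-setosa\n"
    "6.0,4.0,2.0,0.75,Iris-setosa\n"
    "\n"
    "7.0,3.0,5.0,2.0,Iris-virginica\n"
)
with sample as f:
    result = class_averages(f)
print(result)
```

Keeping the count in its own dictionary avoids mixing it into the list of sums, so no slicing is needed when dividing.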

Other tips

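If the dictionary values were plain numbers, you could sum them and divide by the number of values:

values = d.values()
average_value = sum(values) / len(values)

Here, however, the values are lists, so sum(values) raises a TypeError. Instead, you can compute an average for each list, leaving out the last element because it holds the row count:

for key, v in d.items():
    measurements = v[:-1]  # drop the trailing row count
    average_value = sum(measurements) / len(measurements)
    print(key, average_value)

Note that this gives a single number per key (the mean of the four column totals) rather than a per-column average.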

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow