Question

I'm trying to plot multiple lines on one graph using matplotlib and a for loop, but the code stops working after the first iteration. Here's the code:

import csv
import matplotlib.pyplot as plt
r = csv.reader(open('CrimeStatebyState.csv', 'rb'))
line1 = r.next()

def crime_rate(*state):
    for s in state:
        orig_dict = {}
        for n in range (1960,2006):
            orig_dict[n] = []
        for line in r:
            if line[0] == s:
                orig_dict[int(line[3])].append(int(line[4]))
        for y in orig_dict:
            orig_dict[y] = sum(orig_dict[y])
        plt.plot(orig_dict.keys(), orig_dict.values(),'r')
        print orig_dict.values()
        print s

crime_rate("Alabama", "California", "New York")

Here's what it returns:

[39920, 38105, 41112, 44636, 53550, 55131, 61838, 65527, 71285, 75090, 85399, 86919, 84047, 91389, 107314, 125497, 139573, 136995, 147389, 159950, 190511, 191834, 182701, 162361, 155691, 158513, 173807, 181751, 188261, 190573, 198604, 219400, 217889, 204274, 206859, 206188, 205962, 211188, 200065, 192819, 202159, 192835, 200331, 201572, 201664, 197071]
Alabama
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
California
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
New York
**[[[Graph of Alabama's values]]]**

Why am I getting zeroes after the loop runs once? Is this why the other two graphs aren't showing up? Is there an issue with the sum function, the "for line in r" loop, or using *state?

Sorry if that's not enough information! Thanks to those kind/knowledgeable enough for helping.


Solution

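Your csv reader is exhausted after the first state has been processed. A csv.reader is an iterator over the open file, so once the first "for line in r:" loop has consumed every row, the same loop for the next state has no lines left to look at. Each year's list then stays empty, and the sum of an empty list is 0, so the other two states are plotted as flat lines at zero. Neither sum nor *state is at fault. You can confirm this by putting a print statement straight inside the loop to see what it has to process, e.g.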

for line in r:
    print "test" # Test print
    if line[0] == s:
        orig_dict[int(line[3])].append(int(line[4]))
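The same behaviour can be reproduced without the real file. Below is a minimal sketch in Python 3 syntax (the code in this thread is Python 2); the in-memory sample and its column names are made up for illustration and are not taken from CrimeStatebyState.csv.

```python
import csv
import io

# Hypothetical stand-in for the CSV file: a header plus two data rows.
sample = io.StringIO(
    "State,Type,Crime,Year,Count\n"
    "Alabama,a,b,1960,10\n"
    "Alabama,a,b,1961,20\n"
)
reader = csv.reader(sample)
header = next(reader)  # skip the header row

first_pass = [row for row in reader]   # consumes every remaining row
second_pass = [row for row in reader]  # the reader is already at the end

print(len(first_pass))   # 2
print(len(second_pass))  # 0
print(sum([]))           # 0, which is where the zeroes come from
```

The second loop does not raise an error; it simply runs zero times, which is why the problem is easy to miss.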

If you re-define your csv reader within each state loop you should get your data correctly processed:

import csv
import matplotlib.pyplot as plt


def crime_rate(*state):
    for s in state:
        # Re-open the file for every state so the reader starts from the top again
        r = csv.reader(open('CrimeStatebyState.csv', 'rb'))
        line1 = r.next()  # skip the header row
        orig_dict = {}
        for n in range (1960,2006):
            orig_dict[n] = []
        for line in r:
            if line[0] == s:
                orig_dict[int(line[3])].append(int(line[4]))
        for y in orig_dict:
            orig_dict[y] = sum(orig_dict[y])
        plt.plot(orig_dict.keys(), orig_dict.values(),'r')
        print orig_dict.values()
        print s

crime_rate("Alabama", "California", "New York")
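Re-opening the file once per state works, but the file can also be read a single time with totals kept for all requested states at once. Here is a rough sketch of that alternative in Python 3 syntax; yearly_totals and plot_totals are hypothetical helper names, the sample rows are made up, and the column positions (0 for state, 3 for year, 4 for count) are assumed from the code in the question.

```python
import csv
import io


def yearly_totals(rows, states, first_year=1960, last_year=2005):
    """Sum the counts per state and year in a single pass over rows."""
    # Start every year at 0 so years without data still appear in the result.
    totals = {s: {y: 0 for y in range(first_year, last_year + 1)} for s in states}
    for row in rows:
        state, year, count = row[0], int(row[3]), int(row[4])
        if state in totals and first_year <= year <= last_year:
            totals[state][year] += count
    return totals


def plot_totals(totals):
    """Draw one labelled line per state (not called in this sketch)."""
    import matplotlib.pyplot as plt  # imported here so the summing code runs without it
    for state, by_year in totals.items():
        years = sorted(by_year)
        plt.plot(years, [by_year[y] for y in years], label=state)
    plt.legend()
    plt.show()


# Made-up rows standing in for the real file.
sample = io.StringIO(
    "State,Type,Crime,Year,Count\n"
    "Alabama,a,b,1960,10\n"
    "Alabama,a,c,1960,5\n"
    "California,a,b,1960,7\n"
    "New York,a,b,1961,3\n"
)
reader = csv.reader(sample)
next(reader)  # skip the header row
totals = yearly_totals(reader, ["Alabama", "California", "New York"])
print(totals["Alabama"][1960])  # 15
```

With the real data, the reader would wrap the file opened with open('CrimeStatebyState.csv', newline='') instead of the sample, and plot_totals(totals) would then draw all three states on one graph, each in its own colour with a legend entry.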

Other tips

Others have already explained the source of your error. May I suggest you use pandas for this task:

import pandas as pd

states = ["Alabama", "California", "New York"]
data = pd.read_csv('CrimeStatebyState.csv')               # import data
df = data[(1996 <= data.Year) & (data.Year <= 2005)]      # filter by year
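# one row per year, one column per state; aggfunc='sum' gives per-year totals
# as in the question (the default would be the mean)
pd.pivot_table(df, index='Year', columns='State', values='Count', aggfunc='sum')[states].plot()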


Licensed under: CC-BY-SA with attribution