Question

So I have a few CSV files in the following format:

person,age,nationality,language
Jack,18,Canadian,English
Rahul,25,Indian,Hindi
Mark,50,American,English
Kyou, 21, Japanese, English

I need to import that, and return that data as a dictionary, with the keys as the column headings in the first row, and all the data in each column as values for that specific key. For example:

dict = {
    'person': ['Jack', 'Rahul', 'Mark', 'Kyou'],
    'age': [18, 25, 50, 21],
    'nationality': ['Canadian', 'Indian', 'American', 'Japanese'],
    'language': ['English', 'Hindi', 'English', 'English']
}

Any idea how I would begin this code and make it so that the code would work for any number of columns given in a .csv file?


Solution 3

Here is a fairly straightforward solution that uses Python's csv module (docs: http://docs.python.org/2/library/csv.html). Just replace 'csv_data.csv' with the name of your CSV file.

import csv

with open('csv_data.csv') as csv_data:
    reader = csv.reader(csv_data)

    # eliminate blank rows if they exist
    rows = [row for row in reader if row]
    headings = rows[0] # get headings

    person_info = {}
    for row in rows[1:]:
        # append the dataitem to the end of the dictionary entry
        # set the default value of [] if this key has not been seen
        for col_header, data_column in zip(headings, row):
            person_info.setdefault(col_header, []).append(data_column)

    print(person_info)
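
The values collected above are all strings, and the last sample row keeps the space that follows each comma (' 21', ' Japanese'). To get closer to the output requested in the question, here is a minimal sketch. The function name columns_from_csv is hypothetical, and the rule "digits only means int" is an assumption about the data, not something csv does for you. skipinitialspace is a real csv.reader option that ignores spaces right after the delimiter.

```python
import csv
import io


def columns_from_csv(lines):
    """Return {heading: [values, ...]} from an iterable of CSV lines."""
    # skipinitialspace=True drops the space after each comma (" 21" -> "21")
    reader = csv.reader(lines, skipinitialspace=True)
    rows = [row for row in reader if row]  # ignore blank rows
    if not rows:
        return {}
    headings = rows[0]
    columns = {heading: [] for heading in headings}
    for row in rows[1:]:
        for heading, value in zip(headings, row):
            # Assumption: a value made only of digits is meant to be an int
            columns[heading].append(int(value) if value.isdigit() else value)
    return columns


# io.StringIO stands in for an open file so the sketch runs on its own
sample = io.StringIO(
    "person,age,nationality,language\n"
    "Jack,18,Canadian,English\n"
    "Kyou, 21, Japanese, English\n"
)
result = columns_from_csv(sample)
print(result)
```

With a real file you would pass the file object instead, for example columns_from_csv(open('csv_data.csv', newline='')) inside a with block.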

Other suggestions

I'd go for something like:

import csv

with open('input') as fin:
    csvin = csv.reader(fin)
    header = next(csvin, [])
    print(dict(zip(header, zip(*csvin))))

# {'person': ('Jack', 'Rahul', 'Mark', 'Kyou'), 'age': ('18', '25', '50', ' 21'), 'language': ('English', 'Hindi', 'English', ' English'), 'nationality': ('Canadian', 'Indian', 'American', ' Japanese')}

Adapt accordingly.
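
One possible adaptation, sketched under assumptions: the question asks for lists rather than tuples, and the sample data has stray spaces after some commas. The snippet below keeps the same transpose idea and only adds a strip and a list per column. The in-memory sample and the name columns are made up for illustration.

```python
import csv
import io

# Stand-in for the open file; real code would use open('input', newline='')
fin = io.StringIO(
    "person,age\n"
    "Jack,18\n"
    "Kyou, 21\n"
)
csvin = csv.reader(fin)
header = next(csvin, [])

# zip(*csvin) transposes the remaining rows into columns;
# each cell is stripped and each column is stored as a list
columns = {name: [cell.strip() for cell in column]
           for name, column in zip(header, zip(*csvin))}
print(columns)
```

The values are still strings here; converting 'age' to int would be a separate step.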

Using the csv module, I would do it this way:

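import csv

# Python 3: open in text mode with newline='' (Python 2 used 'rb')
with open('somefile.csv', newline='') as input_file:
    reader = csv.DictReader(input_file)
    results = {}
    # each row is a dict mapping column heading -> value
    for linedict in reader:
        for key, value in linedict.items():
            results.setdefault(key, []).append(value)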

You could combine zip with slicing in a dict comprehension, once you've read the data into a list of lists (called rows here) with the csv module.

{col[0] : col[1:] for col in zip(*rows)}
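
For context, here is a hedged, self-contained sketch of how rows might be built and what the comprehension does with a row that is missing a field. The sample data is invented for illustration. Plain zip stops at the shortest row, so a short row silently drops trailing columns; itertools.zip_longest is shown as one alternative (it is a swapped-in technique, not part of the answer above) that pads with None instead.

```python
import csv
import io
from itertools import zip_longest

# Invented sample: the second data row is missing its 'language' field
fin = io.StringIO("person,age,language\nJack,18,English\nRahul,25\n")
rows = [row for row in csv.reader(fin) if row]  # list of lists, blank rows dropped

# Plain zip stops at the shortest row, so the 'language' column disappears
truncated = {col[0]: list(col[1:]) for col in zip(*rows)}

# zip_longest pads short rows with None, keeping every column
padded = {col[0]: list(col[1:]) for col in zip_longest(*rows)}

print(truncated)
print(padded)
```

If every row is guaranteed to have the same number of fields, the two versions give the same result and plain zip is enough.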
Licensed under: CC-BY-SA with attribution