Question

I have two CSV files. EMPLOYEES lists every employee at a company, with 10 columns of information about each one. SOCIAL lists the employees who filled out a survey, with 8 columns of information. Every employee in the survey is also in the master list. Both files share a unique identifier (the Extension).

I want to say: "If an employee is in SOCIAL, add columns 4, 5, and 6 of their survey row to their row in EMPLOYEES." In other words, if an employee filled out the survey, the additional information should be appended to the master list.

Currently, my program pulls out all information from EMPLOYEES for employees who have taken the survey. But I don't know how to add the additional columns to the EMPLOYEES CSV. I have spent much of the day reading StackOverflow about DictReader and dictionaries and am still confused.

Thank you in advance for your guidance.

Sample EMPLOYEE:

Name  Extension   Job
Bill  1111        plumber
Alice 2222        fisherman
Carl  3333        rodeo clown

Sample SURVEY:

Extension   Favorite Color    Book
 2222          blue          A Secret Garden
 3333          green         To Kill a Mockingbird

Sample OUTPUT

Name  Extension   Job           Favorite Color     Favorite Book
Bill  1111        plumber
Alice 2222        fisherman         blue             A Secret Garden
Carl  3333        rodeo clown       green            To Kill a Mockingbird


import csv

with open('employees.csv', "rU") as npr_employees:
   employees = csv.DictReader(npr_employees)
   all_employees = {}
   total_employees = {}
   for employee in employees:
       all_employees[employee['Extension']] = employee

with open('social.csv', "rU") as social_employees:
   social_employee = csv.DictReader(social_employees) 
   for row in social_employee:
       print all_employees.get(row['Extension'], None)

Solution

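In Python 2, you can merge two dictionaries using:

dict(d1.items() + d2.items())

If both dictionaries contain the same key, the value from d2 wins. In Python 3, items() returns view objects that cannot be added with +, so use {**d1, **d2} or d1.update(d2) instead.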

Using a dict, all_employees, with the key as 'Extension' works perfectly to link a "social employee" row with its corresponding "employee" row.
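
The lookup-and-merge step for a single survey row can be sketched as follows in Python 3. The sample records are copied from the question; the `{**a, **b}` form is a swapped-in Python 3 equivalent of the Python 2 expression in this answer, not the code the answer itself uses.

```python
# Sample records, shaped like the rows csv.DictReader yields.
all_employees = {
    '1111': {'Name': 'Bill', 'Extension': '1111', 'Job': 'plumber'},
    '2222': {'Name': 'Alice', 'Extension': '2222', 'Job': 'fisherman'},
}
social_employee = {'Extension': '2222', 'Favorite Color': 'blue',
                   'Book': 'A Secret Garden'}

# Use the shared Extension value to find the matching employee record.
ext = social_employee['Extension']

# Build a new merged dict; on duplicate keys the right-hand dict wins.
all_employees[ext] = {**all_employees[ext], **social_employee}

print(all_employees[ext])
```

Employees without a survey row, such as Bill here, are left untouched.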

Then you need to go through all the updated employee records and output their fields in a consistent order. Since dictionaries are unordered in Python 2 (insertion order is only guaranteed from Python 3.7 onward), we keep a list of the headers, output_headers, in the order we first see them.
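
The header bookkeeping on its own can be sketched like this. `merge_headers` is a hypothetical helper name introduced for illustration, not part of the csv module; the header lists are the ones from the sample files.

```python
def merge_headers(*header_lists):
    """Return the union of several header lists, keeping first-seen order."""
    merged = []
    for headers in header_lists:
        for field in headers:
            # Skip fields that an earlier file already contributed.
            if field not in merged:
                merged.append(field)
    return merged


output_headers = merge_headers(
    ['Name', 'Extension', 'Job'],              # from employees.csv
    ['Extension', 'Favorite Color', 'Book'],   # from social.csv
)
print(output_headers)
```

The shared Extension column appears only once, in the position where the first file introduced it.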

import csv

# Store all the info about the employees
all_employees = {}
output_headers = []

# First, get all employee record info
with open('employees.csv', 'rU') as npr_employees:
    employees = csv.DictReader(npr_employees)
    for employee in employees:
        ext = employee['Extension']
        all_employees[ext] = employee
    # Add headers from "all employees"
    output_headers.extend(employees.fieldnames)

# Then, get all info from social, and update employee info
with open('social.csv', 'rU') as social_employees:
    social_employees = csv.DictReader(social_employees) 
    for social_employee in social_employees:
        ext = social_employee['Extension']

        # Combine the two dictionaries.
        all_employees[ext] = dict(
                all_employees[ext].items() + social_employee.items()
        )

    # Add headers from "social employees", but don't add duplicate fields
    output_headers.extend(
            [field for field in social_employees.fieldnames
            if field not in output_headers]
    )

# Finally, output the records ordered by extension
with open('output.csv', 'wb') as f:
    writer = csv.writer(f)
    writer.writerow(output_headers)

    # Write the new employee rows.  If a field doesn't exist, 
    # write an empty string.
    for employee in sorted(all_employees.values(),
                           key=lambda e: e['Extension']):
        writer.writerow(
                [employee.get(field, '') for field in output_headers]
        )

outputs:

Name,Extension,Job,Favorite Color,Book
Bill,1111,plumber,,
Alice,2222,fisherman,blue,A Secret Garden
Carl,3333,rodeo clown,green,To Kill a Mockingbird
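
For readers on Python 3, the same approach might look like the sketch below. It is an adaptation, not the original answer's code: it assumes Python 3, uses `csv.DictWriter` with `restval=''` to fill in missing fields, reads the sample data from in-memory strings instead of files, and `merge_csv` is a hypothetical function name.

```python
import csv
import io

# Sample data from the question, inlined so the sketch is self-contained.
EMPLOYEES_CSV = ("Name,Extension,Job\n"
                 "Bill,1111,plumber\n"
                 "Alice,2222,fisherman\n"
                 "Carl,3333,rodeo clown\n")
SOCIAL_CSV = ("Extension,Favorite Color,Book\n"
              "2222,blue,A Secret Garden\n"
              "3333,green,To Kill a Mockingbird\n")


def merge_csv(employees_file, social_file, output_file, key='Extension'):
    """Append survey columns to the matching employee rows."""
    employees = csv.DictReader(employees_file)
    all_employees = {row[key]: row for row in employees}
    headers = list(employees.fieldnames)

    social = csv.DictReader(social_file)
    for row in social:
        # Ignore survey rows that have no matching employee.
        if row[key] in all_employees:
            all_employees[row[key]].update(row)
    headers += [f for f in social.fieldnames if f not in headers]

    # restval='' writes an empty string for fields a row does not have.
    writer = csv.DictWriter(output_file, fieldnames=headers,
                            restval='', lineterminator='\n')
    writer.writeheader()
    for ext in sorted(all_employees):
        writer.writerow(all_employees[ext])


out = io.StringIO()
merge_csv(io.StringIO(EMPLOYEES_CSV), io.StringIO(SOCIAL_CSV), out)
print(out.getvalue())
```

With real files, open each one with `open(path, newline='')` for reading and `open(path, 'w', newline='')` for writing, as the csv module documentation recommends, and pass the file objects to `merge_csv`.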

Let me know if you have any questions!

OTHER TIPS

You could try:

for row in social_employee:
    employee = all_employees.get(row['Extension'], None)
    if employee is not None:
        # Copy the survey fields onto the matching employee record
        employee['additionalinfo1'] = row['additionalinfo1']
        employee['additionalinfo2'] = row['additionalinfo2']
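
If only some survey columns should be copied, as in the question, the same loop can be driven by a list of field names. This is a minimal Python 3 sketch: `WANTED`, `survey_rows`, and the `Pet` column are made up for illustration.

```python
# Hypothetical list of survey columns to copy; adjust to your real headers.
WANTED = ['Favorite Color', 'Book']

all_employees = {
    '1111': {'Name': 'Bill', 'Extension': '1111', 'Job': 'plumber'},
    '2222': {'Name': 'Alice', 'Extension': '2222', 'Job': 'fisherman'},
}
# Stand-in for the rows of csv.DictReader(social_file); 'Pet' is an
# invented extra column that should NOT be copied.
survey_rows = [
    {'Extension': '2222', 'Favorite Color': 'blue',
     'Book': 'A Secret Garden', 'Pet': 'cat'},
]

for row in survey_rows:
    employee = all_employees.get(row['Extension'])
    if employee is not None:
        # Copy only the selected survey fields onto the employee record.
        for field in WANTED:
            employee[field] = row[field]

print(all_employees['2222'])
```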
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow