You can merge two dictionaries in Python using:
dict(d1.items() + d2.items())
Using a dict, all_employees
, with the key as 'Extension' works perfectly to link a "social employee" row with its corresponding "employee" row.
Then you need to go through all the updated employee info and output their fields in a consistent order. Since dictionaries are inherently orderless, we keep a list of the headers, output_headers
as we see them.
import csv
# Store all the info about the employees
all_employees = {}
output_headers = []
# First, get all employee record info
with open('employees.csv', 'rU') as npr_employees:
employees = csv.DictReader(npr_employees)
for employee in employees:
ext = employee['Extension']
all_employees[ext] = employee
# Add headers from "all employees"
output_headers.extend(employees.fieldnames)
# Then, get all info from social, and update employee info
with open('social.csv', 'rU') as social_employees:
social_employees = csv.DictReader(social_employees)
for social_employee in social_employees:
ext = social_employee['Extension']
# Combine the two dictionaries.
all_employees[ext] = dict(
all_employees[ext].items() + social_employee.items()
)
# Add headers from "social employees", but don't add duplicate fields
output_headers.extend(
[field for field in social_employees.fieldnames
if field not in output_headers]
)
# Finally, output the records ordered by extension
with open('output.csv', 'wb') as f:
writer = csv.writer(f)
writer.writerow(output_headers)
# Write the new employee rows. If a field doesn't exist,
# write an empty string.
for employee in sorted(all_employees.values()):
writer.writerow(
[employee.get(field, '') for field in output_headers]
)
outputs:
Name,Extension,Job,Favorite Color,Book
Bill,1111,plumber,,
Alice,2222,fisherman,blue,A Secret Garden
Carl,3333,rodeo clown,green,To Kill a Mockingbird
Let me know if you have any questions!