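import re
import pandas as pd

# Use county as the row index so only the value columns remain.
df = df.set_index('county')
# Split each column name, e.g. 'housingunits2010', into a (label, year) tuple.
df = df.rename(columns=lambda x: re.search(r'([a-zA-Z_]+)(\d{4})', x).groups())
df.columns = pd.MultiIndex.from_tuples(df.columns, names=['label', 'year'])
# Move both column levels into the row index, giving a long Series.
s = df.unstack()
s.name = 'count'
print(s)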
gives
label          year  county
housingunits   2010  8001      120
                     8002      100
               2012  8001      200
                     8002      200
occupiedunits  2010  8001       50
                     8002       75
               2012  8001      100
                     8002      125
Name: count, dtype: int64
If you want that in a DataFrame, call reset_index():
print(s.reset_index())
yields
           label  year  county  count
0   housingunits  2010    8001    120
1   housingunits  2010    8002    100
2   housingunits  2012    8001    200
3   housingunits  2012    8002    200
4  occupiedunits  2010    8001     50
5  occupiedunits  2010    8002     75
6  occupiedunits  2012    8001    100
7  occupiedunits  2012    8002    125
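For anyone who wants to run this end to end, here is a minimal self-contained sketch. The input frame is an assumption: its column names (housingunits2010 and so on) and values are reconstructed from the output above, not taken from the original data. It also builds the MultiIndex directly from a list comprehension instead of going through rename, which should give the same result.

```python
import re
import pandas as pd

# Hypothetical wide input, reconstructed from the output shown above.
df = pd.DataFrame({
    'county': [8001, 8002],
    'housingunits2010': [120, 100],
    'housingunits2012': [200, 200],
    'occupiedunits2010': [50, 75],
    'occupiedunits2012': [100, 125],
})

df = df.set_index('county')

# Split each column name into a (label, year) tuple and use the tuples
# as a two-level column index.
df.columns = pd.MultiIndex.from_tuples(
    [re.search(r'([a-zA-Z_]+)(\d{4})', col).groups() for col in df.columns],
    names=['label', 'year'],
)

# unstack() moves the column levels into the row index, producing a Series
# indexed by (label, year, county).
s = df.unstack()
s.name = 'count'

# reset_index() turns the index levels back into ordinary columns.
long_df = s.reset_index()
print(long_df)
```

Note that year comes out as a string here, because it was captured by the regex; convert it with long_df['year'].astype(int) if you need numbers.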