Your question wasn't clear about exactly which dates you were missing; I'm just assuming that you want to fill NaN for any date for which you do have an observation elsewhere. My solution will have to be amended if this assumption is faulty.
Side note: it may be nice to include a line to create the DataFrame:
In [54]: import pandas as pd

In [55]: df = pd.DataFrame({'A': ['loc_a'] * 12 + ['loc_b'],
   ....:                    'B': ['group_a'] * 7 + ['group_b'] * 3 + ['group_c'] * 2 + ['group_a'],
   ....:                    'Date': ["2013-06-11",
   ....:                             "2013-07-02",
   ....:                             "2013-07-09",
   ....:                             "2013-07-30",
   ....:                             "2013-08-06",
   ....:                             "2013-09-03",
   ....:                             "2013-10-01",
   ....:                             "2013-07-09",
   ....:                             "2013-08-06",
   ....:                             "2013-09-03",
   ....:                             "2013-07-09",
   ....:                             "2013-09-03",
   ....:                             "2013-10-01"],
   ....:                    'Value': [22, 35, 14, 9, 4, 40, 18, 4, 2, 5, 1, 2, 3]})

In [56]: df.Date = pd.to_datetime(df.Date)
In [57]: df = df.set_index(['A', 'B', 'Date'])

In [58]: print(df)
                          Value
A     B       Date
loc_a group_a 2013-06-11     22
              2013-07-02     35
              2013-07-09     14
              2013-07-30      9
              2013-08-06      4
              2013-09-03     40
              2013-10-01     18
      group_b 2013-07-09      4
              2013-08-06      2
              2013-09-03      5
      group_c 2013-07-09      1
              2013-09-03      2
loc_b group_a 2013-10-01      3
To get the unobserved values filled, we'll use the unstack and stack methods. Unstacking will create the NaNs we're interested in, and then we'll stack them up to work with.
In [71]: df.unstack(['A', 'B'])
Out[71]:
              Value
A             loc_a                      loc_b
B           group_a  group_b  group_c  group_a
Date
2013-06-11       22      NaN      NaN      NaN
2013-07-02       35      NaN      NaN      NaN
2013-07-09       14        4        1      NaN
2013-07-30        9      NaN      NaN      NaN
2013-08-06        4        2      NaN      NaN
2013-09-03       40        5        2      NaN
2013-10-01       18      NaN      NaN        3
In [59]: df.unstack(['A', 'B']).fillna(0).stack(['A', 'B'])
Out[59]:
                          Value
Date       A     B
2013-06-11 loc_a group_a     22
                 group_b      0
                 group_c      0
           loc_b group_a      0
2013-07-02 loc_a group_a     35
                 group_b      0
                 group_c      0
           loc_b group_a      0
2013-07-09 loc_a group_a     14
                 group_b      4
                 group_c      1
           loc_b group_a      0
2013-07-30 loc_a group_a      9
                 group_b      0
                 group_c      0
           loc_b group_a      0
2013-08-06 loc_a group_a      4
                 group_b      2
                 group_c      0
           loc_b group_a      0
2013-09-03 loc_a group_a     40
                 group_b      5
                 group_c      2
           loc_b group_a      0
2013-10-01 loc_a group_a     18
                 group_b      0
                 group_c      0
           loc_b group_a      3
Reorder the index levels as necessary.
I had to slip that fillna(0) in the middle there so that the NaNs weren't dropped. stack does have a dropna argument, and I would have thought that setting it to False would keep the all-NaN rows around. A bug maybe?
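
If you want the result back in the original A, B, Date order, here is a rough sketch of the whole chain in one go. It uses a small stand-in frame (a few rows taken from the data above) so it runs on its own. The name `filled` and the `astype(int)` step are my additions, not something the question asked for: unstacking introduces NaN, which upcasts the column to float, so the cast just restores integers after the fill.

```python
import pandas as pd

# Small stand-in for the frame above (a subset of its rows).
df = pd.DataFrame({
    'A': ['loc_a', 'loc_a', 'loc_a', 'loc_b'],
    'B': ['group_a', 'group_a', 'group_b', 'group_a'],
    'Date': pd.to_datetime(['2013-07-09', '2013-10-01',
                            '2013-07-09', '2013-10-01']),
    'Value': [14, 18, 4, 3],
}).set_index(['A', 'B', 'Date'])

filled = (
    df.unstack(['A', 'B'])    # Date in rows, (A, B) pairs in columns; gaps become NaN
      .fillna(0)              # fill before stacking so the gaps are not dropped
      .stack(['A', 'B'])      # back to long form, indexed by Date, A, B
      .astype(int)            # assumption: integer values are wanted after the fill
      .reorder_levels(['A', 'B', 'Date'])
      .sort_index()           # group the rows by A, then B, then Date
)
print(filled)
```

Only the (A, B) pairs that actually occur in the data get rows; a pair that never appears, such as loc_b / group_b here, is not invented by this approach.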