Here's a less succinct version, but one that is more idiomatic pandas:
First, pandas.melt your data. It's easier to work with two DataFrames that are each just a collection of columns, some of them shared, than it is to try to do MultiIndex acrobatics.
In [127]: dfm = pd.melt(df, var_name=['items', 'labels'], id_vars=['index'], value_name='indicator')
In [128]: dfm
Out[128]:
index items labels indicator
0 2014-04-02 Item0 A 0
1 2014-04-03 Item0 A 0
2 2014-04-04 Item0 A 1
3 2014-04-02 Item0 D 1
4 2014-04-03 Item0 D 1
5 2014-04-04 Item0 D 1
6 2014-04-02 Item1 A 0
7 2014-04-03 Item1 A 0
8 2014-04-04 Item1 A 0
9 2014-04-02 Item1 C 1
10 2014-04-03 Item1 C 1
11 2014-04-04 Item1 C 1
[12 rows x 4 columns]
In [129]: df2m = pd.melt(df2, var_name='labels', id_vars=['index'], value_name='value')
In [130]: df2m
Out[130]:
index labels value
0 2014-04-02 A 3
1 2014-04-03 A 1
2 2014-04-04 A -1
3 2014-04-02 B 4
4 2014-04-03 B 3
5 2014-04-04 B -5
6 2014-04-02 C 2
7 2014-04-03 C -2
8 2014-04-04 C 0
9 2014-04-02 D -3
10 2014-04-03 D 1
11 2014-04-04 D -2
[12 rows x 3 columns]
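If you want to reproduce the two melted frames yourself, here is a minimal sketch. The input frames `df` and `df2` are reconstructed from the output above, so treat their exact construction as an assumption. It also keeps the dates in the index and passes `ignore_index=False` (available in pandas 1.1 and later) instead of using `id_vars`, which is a variation on the calls shown above rather than the same code.

```python
import pandas as pd

# Inputs rebuilt from the melted output shown above (an assumption: the
# original frames are not shown in this answer).
dates = pd.to_datetime(["2014-04-02", "2014-04-03", "2014-04-04"]).rename("index")
columns = pd.MultiIndex.from_tuples(
    [("Item0", "A"), ("Item0", "D"), ("Item1", "A"), ("Item1", "C")],
    names=["items", "labels"],
)
df = pd.DataFrame(
    [[0, 1, 0, 1], [0, 1, 0, 1], [1, 1, 0, 1]], index=dates, columns=columns
)
df2 = pd.DataFrame(
    {"A": [3, 1, -1], "B": [4, 3, -5], "C": [2, -2, 0], "D": [-3, 1, -2]},
    index=dates,
)

# Melt with the dates still in the index, then move them into an "index" column.
dfm = df.melt(
    var_name=["items", "labels"], value_name="indicator", ignore_index=False
).reset_index()
df2m = df2.melt(
    var_name="labels", value_name="value", ignore_index=False
).reset_index()

print(dfm)
print(df2m)
```

Both results are plain long-format frames with "index" and "labels" as ordinary columns, which is all the merge step needs.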
Now you have two frames with some common columns ("labels" and "index") that you can then use in a pandas.merge:
In [140]: merged = pd.merge(dfm, df2m, on=['labels', 'index'], how='outer')
In [141]: merged
Out[141]:
index items labels indicator value
0 2014-04-02 Item0 A 0 3
1 2014-04-02 Item1 A 0 3
2 2014-04-03 Item0 A 0 1
3 2014-04-03 Item1 A 0 1
4 2014-04-04 Item0 A 1 -1
5 2014-04-04 Item1 A 0 -1
6 2014-04-02 Item0 D 1 -3
7 2014-04-03 Item0 D 1 1
8 2014-04-04 Item0 D 1 -2
9 2014-04-02 Item1 C 1 2
10 2014-04-03 Item1 C 1 -2
11 2014-04-04 Item1 C 1 0
12 2014-04-02 NaN B NaN 4
13 2014-04-03 NaN B NaN 3
14 2014-04-04 NaN B NaN -5
[15 rows x 5 columns]
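The NaN rows for label B come from how='outer': B exists in the values frame but never appears in the indicator frame. A small sketch of that behaviour, using a hand-trimmed subset of the data above (one date, one item); the names `left`, `right`, `outer` and `inner` are just for illustration:

```python
import pandas as pd

# One date and one item only, taken from the melted frames above.
left = pd.DataFrame(
    {
        "index": ["2014-04-02", "2014-04-02"],
        "items": ["Item0", "Item0"],
        "labels": ["A", "D"],
        "indicator": [0, 1],
    }
)
right = pd.DataFrame(
    {
        "index": ["2014-04-02"] * 3,
        "labels": ["A", "B", "D"],
        "value": [3, 4, -3],
    }
)

# Outer keeps the unmatched B row and fills the left-hand columns with NaN.
outer = pd.merge(left, right, on=["labels", "index"], how="outer")
# Inner keeps only the (labels, index) pairs present in both frames.
inner = pd.merge(left, right, on=["labels", "index"], how="inner")

print(outer)
print(inner)
```

If you never need the unmatched rows, how='inner' would probably let you skip dropping the NaNs afterwards, but the outer merge makes it easier to see what failed to match.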
Since indicator is really just a boolean indexer, drop its NaNs and convert it to bool dtype:
In [147]: merged.dropna(subset=['indicator'], inplace=True)
In [148]: merged['indicator'] = merged['indicator'].astype(bool)
In [149]: merged
Out[149]:
index items labels indicator value
0 2014-04-02 Item0 A False 3
1 2014-04-02 Item1 A False 3
2 2014-04-03 Item0 A False 1
3 2014-04-03 Item1 A False 1
4 2014-04-04 Item0 A True -1
5 2014-04-04 Item1 A False -1
6 2014-04-02 Item0 D True -3
7 2014-04-03 Item0 D True 1
8 2014-04-04 Item0 D True -2
9 2014-04-02 Item1 C True 2
10 2014-04-03 Item1 C True -2
11 2014-04-04 Item1 C True 0
[12 rows x 5 columns]
Now slice with indicator and use pivot_table to get your desired result:
In [150]: merged.loc[merged.indicator].pivot_table(values='value', index='index', columns=['items'], aggfunc='sum')
Out[150]:
items Item0 Item1
index
2014-04-02 -3 2
2014-04-03 1 -2
2014-04-04 -3 0
[3 rows x 2 columns]
This may seem like a lot, but that might be because I'm writing out each step. It amounts to about five lines of code.
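For reference, here is roughly what the whole thing looks like condensed into one function. This is a sketch under a few assumptions: the function name `masked_sums` is made up, the input frames are reconstructed from the output shown above, and the melt calls use `ignore_index=False` (pandas 1.1 or later) rather than `id_vars`.

```python
import pandas as pd


def masked_sums(df, df2):
    """Sum df2's values per item wherever df's indicator is set (hypothetical helper)."""
    # Long format: one row per (date, item, label) and one row per (date, label).
    dfm = (
        df.rename_axis("index")
        .melt(var_name=["items", "labels"], value_name="indicator", ignore_index=False)
        .reset_index()
    )
    df2m = (
        df2.rename_axis("index")
        .melt(var_name="labels", value_name="value", ignore_index=False)
        .reset_index()
    )
    merged = pd.merge(dfm, df2m, on=["labels", "index"], how="outer")
    # Rows with no indicator came only from df2; drop them, then use the rest as a mask.
    merged = merged.dropna(subset=["indicator"])
    mask = merged["indicator"].astype(bool)
    return merged.loc[mask].pivot_table(
        values="value", index="index", columns="items", aggfunc="sum"
    )


# Inputs rebuilt from the output shown in this answer (an assumption).
dates = pd.to_datetime(["2014-04-02", "2014-04-03", "2014-04-04"])
columns = pd.MultiIndex.from_tuples(
    [("Item0", "A"), ("Item0", "D"), ("Item1", "A"), ("Item1", "C")],
    names=["items", "labels"],
)
df = pd.DataFrame(
    [[0, 1, 0, 1], [0, 1, 0, 1], [1, 1, 0, 1]], index=dates, columns=columns
)
df2 = pd.DataFrame(
    {"A": [3, 1, -1], "B": [4, 3, -5], "C": [2, -2, 0], "D": [-3, 1, -2]},
    index=dates,
)

result = masked_sums(df, df2)
print(result)
```

On this data it should give the same table as Out[150] above.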