This is probably not much faster either, but it is a cleaner, pandas-based approach:
df.where(df.apply(lambda x: x.index < pd.Timestamp(x.name[2])))
The apply returns a DataFrame of True/False values: the < comparison is evaluated for each column, with x.name[2] selecting the third level of that column's name. The where then keeps the values where the mask is True and replaces the rest with NaN.
Full example:
In [1]: import pandas as pd
In [2]: from io import StringIO  # on Python 2: from StringIO import StringIO
In [3]: s = """,ACTION,ACTION
...: ,111,222
...: ,1/7/2010,1/5/2010
...: DATE,,
...: 1/1/2010, 10, 5
...: 1/2/2010, 10, 5
...: 1/3/2010, 10, 5
...: 1/4/2010, 15, 5
...: 1/5/2010, 10, 5
...: 1/6/2010, 10, 5
...: 1/7/2010, 10, 5
...: 1/8/2010, 10, 5"""
In [4]: df = pd.read_csv(StringIO(s), header=[0,1,2], index_col=0, parse_dates=True)
In [5]: df.where(df.apply(lambda x: x.index < pd.Timestamp(x.name[2])))
Out[5]:
              ACTION
                 111       222
            1/7/2010  1/5/2010
DATE
2010-01-01        10         5
2010-01-02        10         5
2010-01-03        10         5
2010-01-04        15         5
2010-01-05        10       NaN
2010-01-06        10       NaN
2010-01-07       NaN       NaN
2010-01-08       NaN       NaN
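If the per-column apply turns out to be the slow part, the same mask can probably be built in one step with NumPy broadcasting. The following is only a sketch and has not been benchmarked: the frame is rebuilt by hand so the snippet is self-contained, the names cutoffs, mask and masked are illustrative, and it assumes the third column level always holds month/day/year strings.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame by hand; level 2 of the columns holds each cutoff date.
index = pd.date_range("2010-01-01", periods=8, name="DATE")
columns = pd.MultiIndex.from_tuples(
    [("ACTION", "111", "1/7/2010"), ("ACTION", "222", "1/5/2010")]
)
data = np.column_stack([[10, 10, 10, 15, 10, 10, 10, 10], [5] * 8])
df = pd.DataFrame(data, index=index, columns=columns)

# One cutoff timestamp per column (assumed format: month/day/year).
cutoffs = pd.to_datetime(df.columns.get_level_values(2), format="%m/%d/%Y")

# Compare the (n_rows, 1) dates against the (1, n_cols) cutoffs in a single step.
mask = df.index.values[:, None] < cutoffs.values[None, :]

# Keep values dated strictly before their column's cutoff; the rest become NaN.
masked = df.where(mask)
print(masked)
```

The result should match the apply version above; whether it is noticeably faster depends on the number of columns, since apply calls the lambda once per column.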