Python Pandas - replace values with NaN in multiple columns based on multiple dates?

Question 1

This is probably not much faster, but it is a cleaner, pandas-based approach:

df.where(df.apply(lambda x: x.index < pd.Timestamp(x.name[2])))

The apply returns a DataFrame of True/False values: for each column, the < expression compares the index against the date in x.name[2], the third level of that column's name. The where then keeps the values at True positions and replaces those at False positions with NaN.

Full example:

In [1]: import pandas as pd

In [2]: from io import StringIO  # on Python 2: from StringIO import StringIO

In [3]: s = """,ACTION,ACTION
   ...: ,111,222
   ...: ,1/7/2010,1/5/2010
   ...: DATE,,
   ...: 1/1/2010,    10,                          5
   ...: 1/2/2010,    10,                          5
   ...: 1/3/2010,    10,                          5
   ...: 1/4/2010,    15,                          5
   ...: 1/5/2010,    10,                          5
   ...: 1/6/2010,    10,                          5
   ...: 1/7/2010,    10,                          5
   ...: 1/8/2010,    10,                          5"""

In [4]: df = pd.read_csv(StringIO(s), header=[0,1,2], index_col=0, parse_dates=True)

In [5]: df.where(df.apply(lambda x: x.index < pd.Timestamp(x.name[2])))
Out[5]:
              ACTION
                 111       222
            1/7/2010  1/5/2010
DATE
2010-01-01        10         5
2010-01-02        10         5
2010-01-03        10         5
2010-01-04        15         5
2010-01-05        10       NaN
2010-01-06        10       NaN
2010-01-07       NaN       NaN
2010-01-08       NaN       NaN
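
The same mask can also be built without apply by broadcasting the index against the parsed cutoff dates. The following is only a sketch under stated assumptions: the frame is constructed by hand to mirror the example (the names index, columns, values, cutoffs and keep are illustrative, not from the original), and the third column level is assumed to hold month/day/year strings.

```python
import numpy as np
import pandas as pd

# Hypothetical data mirroring the example above; the third column
# level is assumed to hold the cutoff date as a m/d/Y string.
index = pd.date_range("2010-01-01", periods=8, freq="D", name="DATE")
columns = pd.MultiIndex.from_tuples(
    [("ACTION", "111", "1/7/2010"), ("ACTION", "222", "1/5/2010")]
)
values = np.column_stack(([10, 10, 10, 15, 10, 10, 10, 10], [5] * 8))
df = pd.DataFrame(values, index=index, columns=columns)

# Parse the cutoff dates once, one per column.
cutoffs = pd.to_datetime(df.columns.get_level_values(2), format="%m/%d/%Y")

# Broadcast (n_rows, 1) against (1, n_cols): True where the row date
# is strictly before that column's cutoff date.
keep = df.index.to_numpy()[:, None] < cutoffs.to_numpy()[None, :]

# Values at False positions become NaN.
masked = df.where(keep)
print(masked)
```

Parsing the cutoffs once, rather than once per column inside a lambda, keeps the comparison a single vectorized operation.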

Question 2

There may well be a better way to do this, but three lines do the job:

In [194]:

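import numpy as np
import pandas as pd
# A is True where DATE + 1 day (12*60*12*10**10 ns) is later than the cutoff date in the column name
A=(np.array(pd.to_datetime(df['DATE']))[...,np.newaxis]+12*60*12*10**10)>\
   np.array([np.datetime64(pd.to_datetime(item[-1])) for item in df.columns.tolist()[1:]])
# Prepend an all-False column so that the DATE column itself is never masked
B=np.hstack((np.ones(len(df)).reshape((-1,1))!=1, A))
print(df.where(~B))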

#       DATE  (ACTION, 111, 1/7/2010)  (ACTION, 222, 1/5/2010)
#0  1/1/2010                       10                        5
#1  1/2/2010                       10                        5
#2  1/3/2010                       10                        5
#3  1/4/2010                       15                        5
#4  1/5/2010                       10                      NaN
#5  1/6/2010                       10                      NaN
#6  1/7/2010                      NaN                      NaN
#7  1/8/2010                      NaN                      NaN

#[8 rows x 3 columns]

I assume your DATE column is stored as strings and that the last item of each tuple in your column names is also a string. If both are the case, you need the conversions in the first line; otherwise you can skip some of them.

Edit: It runs quite slowly: 100 loops, best of 3: 4.55 ms per loop.
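
For readers who find the nanosecond constant hard to follow, here is a minimal sketch of the same idea with the string-to-date conversions written out through pandas. It assumes the layout described above (a string DATE column followed by tuple-named columns whose last item is a m/d/Y string) and dates without a time-of-day part, so that "DATE + 1 day > cutoff" is the same as "DATE >= cutoff". The sample frame and the names dates, cutoffs, hide and hide_all are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical frame in the assumed layout: DATE strings plus two
# columns named by tuples whose last item is the cutoff date.
df = pd.DataFrame({
    "DATE": ["1/%d/2010" % day for day in range(1, 9)],
    ("ACTION", "111", "1/7/2010"): [10, 10, 10, 15, 10, 10, 10, 10],
    ("ACTION", "222", "1/5/2010"): [5] * 8,
})

# Convert the DATE strings and the cutoff strings to datetime64 arrays.
dates = pd.to_datetime(df["DATE"], format="%m/%d/%Y").to_numpy()
value_cols = list(df.columns)[1:]
cutoffs = pd.to_datetime(
    [col[-1] for col in value_cols], format="%m/%d/%Y"
).to_numpy()

# True where the row date is on or after the column's cutoff date.
hide = dates[:, None] >= cutoffs[None, :]

# Prepend an all-False column so the DATE column is never masked.
hide_all = np.hstack([np.zeros((len(df), 1), dtype=bool), hide])

masked = df.where(~hide_all)
print(masked)
```

If DATE is already a datetime column, the first conversion can be dropped, as noted above.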