I ran into a similar problem recently. I solved it by first removing duplicates from df2. Doing it this way makes you think about which duplicate to keep and which to discard. Unfortunately, pandas doesn't seem to have a great way to remove duplicates based on duplicate index entries, but this workaround (adding an 'index' column to df2) should do it:
>>> df2['index'] = df2.index
>>> # older pandas releases spelled these arguments cols='index', take_last=True
>>> df3 = df2.drop_duplicates(subset='index', keep='last').reindex(df.index, method='ffill')
>>> del df3['index']
>>> df3
                              a
2013-02-21 09:51:56.615338  NaN
2013-02-22 09:51:56.615357  3.0
Of course, you could set keep='first' (take_last=False in older releases) to get a value of 2 for the a column.
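Newer pandas versions also provide Index.duplicated, which makes the temporary 'index' column unnecessary. Here is a minimal sketch, assuming a pandas release that has Index.duplicated with the keep argument. The frames are rebuilt from the test data in this answer, and the names deduped and df3 are just illustrative:

```python
import pandas as pd

# Rebuild the two frames from the test data in this answer.
df = pd.DataFrame(
    {"a": [1, 2]},
    index=pd.to_datetime(
        ["2013-02-21 09:51:56.615338", "2013-02-22 09:51:56.615357"]
    ),
)
df2 = pd.DataFrame(
    {"a": [2, 3]},
    index=pd.to_datetime(
        ["2013-02-21 09:51:57.802331", "2013-02-21 09:51:57.802331"]
    ),
)

# Mark every repeated index entry except the last one, then drop the marked rows.
deduped = df2[~df2.index.duplicated(keep="last")]

# Forward-fill the remaining df2 values onto df's index.
df3 = deduped.reindex(df.index, method="ffill")
print(df3)
```

Passing keep="first" instead keeps the earlier duplicate, so the filled value would be 2 rather than 3.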
I noticed that you said "I wish to match the df2 time with the closest last time in df, which is the first row". I didn't quite understand this statement. With the forward fill above, the df2 value ends up on the second row of df (the first df time after the df2 time), not the first row. If I misunderstood your question, let me know and I'll update this answer.
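If the goal is instead to attach the df2 value to the last df time before it (the first row), which is only one possible reading of the question, a backward fill in reindex would do that. A sketch under that assumption, with the frames rebuilt from the test data in this answer and df3_first as an illustrative name:

```python
import pandas as pd

# Rebuild the two frames from the test data in this answer.
df = pd.DataFrame(
    {"a": [1, 2]},
    index=pd.to_datetime(
        ["2013-02-21 09:51:56.615338", "2013-02-22 09:51:56.615357"]
    ),
)
df2 = pd.DataFrame(
    {"a": [2, 3]},
    index=pd.to_datetime(
        ["2013-02-21 09:51:57.802331", "2013-02-21 09:51:57.802331"]
    ),
)

# Keep the last of the duplicated df2 rows, as in the workaround above.
deduped = df2[~df2.index.duplicated(keep="last")]

# method="bfill" gives each df time the value of the next df2 time at or
# after it, so the df2 value is attached to the df row that precedes it.
df3_first = deduped.reindex(df.index, method="bfill")
print(df3_first)
```

Here the first row of df picks up the df2 value and the second row is left as NaN, the mirror image of the ffill result.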
For reference, here is my test data:
>>> df
                            a
2013-02-21 09:51:56.615338  1
2013-02-22 09:51:56.615357  2
>>> df2
                            a
2013-02-21 09:51:57.802331  2
2013-02-21 09:51:57.802331  3