Question

I have some tick data in this form:

date                 price    amount
2011-09-13 13:53:36  5.80     1.0000
2011-09-13 13:53:44  5.83     3.0000
2011-09-13 14:32:53  5.90     2.0000

And I've resampled the price with:

resampledData.price.resample('55min', how="ohlc")

Now I need to fill out the missing data and the only way I came up with was:

closes = resampledData.close
closes = closes.fillna(method='pad')
resampledData['open'] = resampledData['open'].fillna(closes)
resampledData['high'] = resampledData['high'].fillna(closes)
resampledData['low'] = resampledData['low'].fillna(closes)
resampledData['close'] = resampledData['close'].fillna(closes)

But this looks really bad, is there a more efficient way to do it?

Also, is there a way to resample two columns at the same time? For example, the price with "ohlc" and the amount with "sum"?

Solution

For filling the NaNs, you can apply this on all columns in one line as follows:

closes = resampledData['close'].fillna(method='pad')
resampledData = resampledData.apply(lambda x: x.fillna(closes))
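As a rough, self-contained sketch of the same idea for newer pandas releases (where the how= keyword is no longer accepted): the ticks frame below is rebuilt from the sample data in the question, and the 15-minute frequency is my own assumption, chosen only so that these three ticks leave some empty bins to fill.

```python
import pandas as pd

# Sample ticks copied from the question; the frame name `ticks` is illustrative.
ticks = pd.DataFrame(
    {"price": [5.80, 5.83, 5.90], "amount": [1.0, 3.0, 2.0]},
    index=pd.to_datetime(
        ["2011-09-13 13:53:36", "2011-09-13 13:53:44", "2011-09-13 14:32:53"]
    ),
)

# Assumption: 15-minute bars instead of 55-minute ones, so that empty bins appear.
bars = ticks["price"].resample("15min").ohlc()

# Carry the last known close forward into the empty bins...
closes = bars["close"].ffill()

# ...and use it to fill open/high/low/close; fillna aligns on the row index.
filled = bars.apply(lambda col: col.fillna(closes))
print(filled)
```

Bins that already contain ticks are left untouched; only the empty bins take the previous close in all four columns.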

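A DataFrame also has its own fillna method (http://pandas.pydata.org/pandas-docs/dev/generated/pandas.DataFrame.fillna.html), but when you pass it a Series, it treats the entries of that Series as the fill values for the different columns (matched on column name), not as row-by-row values. That is why closes cannot be passed to it directly.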
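To make the per-column behaviour of DataFrame.fillna concrete, here is a small sketch; the frame and the fill values are made up purely for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical two-column frame with a missing second row.
df = pd.DataFrame({"open": [1.0, np.nan], "close": [2.0, np.nan]})

# The Series is keyed by column name, so each column gets its own fill value.
per_column = pd.Series({"open": -1.0, "close": -2.0})
result = df.fillna(per_column)
print(result)
```

Because the Series is matched against column labels, a Series indexed by timestamps (such as closes) is not interpreted row by row here, which is why the apply approach is used instead.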

For the resampling, normally you can do the following to resample with different functions on multiple columns at once:

resampledData.resample('55min', how={'price':'ohlc', 'amount':'sum'})

For me, however, this does not seem to work with 'ohlc' (if you change it to, e.g., 'mean', it does work).
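One possible workaround, sketched here under the assumption of a newer pandas release, is to resample each column separately and join the results afterwards. The ticks frame is again rebuilt from the question's sample data, and the 15-minute frequency is an assumption made only to get several bins out of three ticks.

```python
import pandas as pd

# Sample ticks copied from the question; the frame name `ticks` is illustrative.
ticks = pd.DataFrame(
    {"price": [5.80, 5.83, 5.90], "amount": [1.0, 3.0, 2.0]},
    index=pd.to_datetime(
        ["2011-09-13 13:53:36", "2011-09-13 13:53:44", "2011-09-13 14:32:53"]
    ),
)

# Resample each column with its own aggregation (15min is an assumed frequency).
ohlc = ticks["price"].resample("15min").ohlc()
volume = ticks["amount"].resample("15min").sum()

# Both results share the same bin index, so they can be placed side by side.
combined = pd.concat([ohlc, volume], axis=1)
print(combined)
```

The result has the columns open, high, low, close and amount. Empty bins get NaN prices but an amount of 0, so the price columns can still be filled with the previous close as described above.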

Licensed under: CC-BY-SA with attribution