Question

I am working with an hourly time series (Date, Time (hr), P) and trying to calculate each hour's proportion of the daily total 'Amount'. I know I can use Pandas' resample('D', how='sum') to calculate the daily sum of P (DailyP), but in the same step I would like to use the daily sum to calculate the proportion of daily P in each hour (so, P/DailyP) and end up with an hourly time series (i.e., the same frequency as the original). I am not sure if this can even be called 'resampling' in Pandas terms. This is probably apparent from my use of terminology, but I am an absolute newbie at Python, or programming for that matter. If anyone can suggest a way to do this, I would really appreciate it. Thanks!


Solution

One possible approach is to reindex the daily sums back to the original hourly index (reindex) and fill the values forward (ffill), so that every hour gets the sum for its day:

df.resample('D').sum().reindex(df.index).ffill()

You can then divide your original DataFrame by this result.

An example:

>>> import pandas as pd
>>> import numpy as np
>>> 
>>> df = pd.DataFrame({'P' : np.random.rand(72)}, index=pd.date_range('2013-05-05', periods=72, freq='h'))
>>> df.resample('D').sum().reindex(df.index).ffill()
                             P
2013-05-05 00:00:00  14.049649
2013-05-05 01:00:00  14.049649
...
2013-05-05 22:00:00  14.049649
2013-05-05 23:00:00  14.049649
2013-05-06 00:00:00  13.483974
2013-05-06 01:00:00  13.483974
...
2013-05-06 23:00:00  13.483974
2013-05-07 00:00:00  12.693711
2013-05-07 01:00:00  12.693711
...
2013-05-07 22:00:00  12.693711
2013-05-07 23:00:00  12.693711
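
To finish the job, here is a minimal sketch of the division step itself, assuming a recent pandas version. The seed, the variable names (daily, proportion, alt) and the groupby/transform alternative are my own additions for illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Same shape of data as the example above: 72 hourly values over three days.
rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
df = pd.DataFrame(
    {"P": rng.random(72)},
    index=pd.date_range("2013-05-05", periods=72, freq="h"),
)

# Daily totals, broadcast back onto the hourly index by forward-filling.
daily = df.resample("D").sum().reindex(df.index).ffill()

# Each hour's share of its day's total; the hourly frequency is preserved.
proportion = df / daily

# Alternative (a different technique from the reindex/ffill approach):
# group by calendar day and let transform broadcast the daily sum.
alt = df["P"] / df.groupby(df.index.normalize())["P"].transform("sum")
```

Within each day the values of proportion should add up to 1, which is a quick way to check the result. The transform variant does not rely on forward-filling, so it may be the safer choice if the hourly index has gaps or does not start at midnight.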
Licensed under: CC-BY-SA with attribution