Question
I have a long time series 'obs' with a 1-hour time step (class="zoo"). There were some missing values, which have already been removed, so the time step is no longer consistent.
> head(obs)
time obs
2009-12-22 01:00:00 23.708
2009-12-22 02:00:00 23.708
2009-12-22 03:00:00 23.708
2009-12-22 04:00:00 23.708
2009-12-22 06:00:00 23.708
2009-12-22 07:00:00 23.708
> tail(obs)
time obs
2013-09-22 21:00:00 45.031
2013-09-22 22:00:00 45.031
2013-09-22 23:00:00 41.589
2013-09-23 00:00:00 28.987
2013-09-23 01:00:00 22.238
2013-09-23 02:00:00 20.533
Now, from this time series I want to create multiple time series with a time step of 12 hours, one starting from each hour, so in total there should be 12 time series. One of the expected outputs is given below (the one that starts at 01:00:00):
time obs
2009-12-22 01:00:00 23.708
2009-12-22 13:00:00 23.708
2009-12-23 01:00:00 23.708
2009-12-23 13:00:00 24.136
2009-12-24 01:00:00 23.708
2009-12-24 13:00:00 23.708
....
In the same way, I need to create the other time series (starting from 02:00:00, 03:00:00 and so on) with a 12-hour time step. If the time step were consistent, I could transform every 12 hours of data into rows, and then it would be much easier to extract each series from its column. But that is not possible now. How can I do it? I am already using the xts package, but I couldn't find a way.
Solution 3
After a long search, I found this straightforward method in the xts package:
obs[.indexhour(obs) %in% c(t1, t2)]
This extracts all observations at hours t1 and t2 on each day. For more details, try ?indexClass in the xts package.
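As a minimal sketch of how this could be applied to build all 12 series, the example below uses a small made-up hourly series with one observation removed. The sample data, the helper name `split_by_hour`, and the `start_XX` list names are illustrative assumptions, not part of the original answer. A zoo object like the one in the question would first need converting with `as.xts()`.

```r
library(xts)

# Hypothetical sample data: 48 hourly observations in UTC
idx <- seq(as.POSIXct("2009-12-22 01:00:00", tz = "UTC"),
           as.POSIXct("2009-12-24 00:00:00", tz = "UTC"), by = "1 hour")
obs <- xts(seq_along(idx), order.by = idx)
obs <- obs[-5]  # drop 2009-12-22 05:00 so the time step is irregular

# Hypothetical helper: keep the observations at hour h and hour h + 12
split_by_hour <- function(h, x) x[.indexhour(x) %in% c(h, h + 12)]

# One series per starting hour 0..11
series <- lapply(0:11, split_by_hour, x = obs)
names(series) <- sprintf("start_%02d", 0:11)

series[["start_01"]]  # observations at 01:00 and 13:00 on each day
```

Because the selection is based on the hour of each timestamp rather than on row position, a missing observation only shortens the affected series and does not shift the others.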
Other tips
xts is the right package. The function you are interested in is
[.xts (Extract Subsets of xts Objects).
For example,
obs["T01:00/T01:59"]
returns all observations whose time of day is between 01:00 and 01:59.
You just need to apply this over every hour. Putting it all together, you could get something similar to this:
my_func <- function(i, obs){
  # Time-of-day ranges for hour i and hour i + 12, zero-padded to two digits
  hours    <- sprintf("T%02d:00/T%02d:59", i, i)
  hours.12 <- sprintf("T%02d:00/T%02d:59", i + 12, i + 12)
  # Combine both subsets (obs must be an xts object) and return the result
  rbind(obs[hours], obs[hours.12])
}
# get a list of 12 subsets as requested
obs.subsetted <- lapply(0:11, my_func, obs)
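Since the question's series is of class zoo, the time-of-day strings above would only work after converting it to xts. The following is a small sketch of that step on made-up data; the names `obs_zoo`, `obs_xts` and `one_am` are hypothetical.

```r
library(zoo)
library(xts)

# Hypothetical zoo series standing in for the question's 'obs' (72 hourly values, UTC)
idx <- seq(as.POSIXct("2009-12-22 00:00:00", tz = "UTC"),
           by = "1 hour", length.out = 72)
obs_zoo <- zoo(seq_along(idx), order.by = idx)

# Time-of-day subsetting is an xts feature, so convert the zoo object first
obs_xts <- as.xts(obs_zoo)

# All observations between 01:00 and 01:59 on any day
one_am <- obs_xts["T01:00/T01:59"]

# Check which hours were retained
unique(.indexhour(one_am))
```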
Here is a solution using data.table and lubridate.
The entire code snippet takes less than 0.01 seconds on my laptop.
# Load packages
library(lubridate)
library(data.table)
# Set up data
time <- seq(ymd_hms("2009-12-22 01:00:00"), ymd_hms("2013-09-23 02:00:00"), by="1 hour")
obs <- abs(rnorm(length(time)))
dt <- data.table(time, obs)
# Set up a list where all 12 output data tables are stored
l <- vector(12, mode="list")
# Split original data
for (i in 0:11){
l[[i+1]] <- dt[seq(from=i+1, to=nrow(dt), by=12)]
}
The output data looks like this:
> l
[[1]]
time obs
1: 2009-12-22 01:00:00 1.14244266
2: 2009-12-22 13:00:00 1.13037973
3: 2009-12-23 01:00:00 0.18268572
4: 2009-12-23 13:00:00 0.56539405
5: 2009-12-24 01:00:00 0.06480253
---
2739: 2013-09-21 01:00:00 1.06874026
2740: 2013-09-21 13:00:00 0.04367871
2741: 2013-09-22 01:00:00 0.43790836
2742: 2013-09-22 13:00:00 1.41966787
2743: 2013-09-23 01:00:00 0.68687465
[[2]]
time obs
1: 2009-12-22 02:00:00 1.6789682
2: 2009-12-22 14:00:00 0.1321111
3: 2009-12-23 02:00:00 2.5129179
4: 2009-12-23 14:00:00 0.9818898
5: 2009-12-24 02:00:00 0.6617939
---
2739: 2013-09-21 02:00:00 0.6028943
2740: 2013-09-21 14:00:00 0.4571396
2741: 2013-09-22 02:00:00 0.7017483
2742: 2013-09-22 14:00:00 0.1206088
2743: 2013-09-23 02:00:00 0.3864518
[[3]]
time obs
1: 2009-12-22 03:00:00 2.14461926
2: 2009-12-22 15:00:00 0.68896644
3: 2009-12-23 03:00:00 0.19332982
4: 2009-12-23 15:00:00 1.09463684
5: 2009-12-24 03:00:00 0.60102308
---
2738: 2013-09-20 15:00:00 0.36922591
2739: 2013-09-21 03:00:00 0.89973806
2740: 2013-09-21 15:00:00 0.02761852
2741: 2013-09-22 03:00:00 0.17313669
2742: 2013-09-22 15:00:00 0.61018630
[[4]]
...
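Note that the loop above selects every 12th row by position, so it relies on the series having no gaps. The question's data has missing hours, in which case the rows would drift out of alignment. A possible variation, sketched here on made-up data, groups by the hour of each timestamp instead. The column name `start_hour` and the list name `l_by_hour` are assumptions for illustration.

```r
library(data.table)

# Hypothetical hourly data (96 rows, UTC) with two rows removed to mimic gaps
time <- seq(as.POSIXct("2009-12-22 01:00:00", tz = "UTC"),
            by = "1 hour", length.out = 96)
dt <- data.table(time = time, obs = seq_along(time))[-c(5, 30)]

# Hours h and h + 12 fall into the same group via modulo 12
dt[, start_hour := hour(time) %% 12]

# Named list with one data.table per starting hour ("0" .. "11")
l_by_hour <- split(dt, by = "start_hour", sorted = TRUE, keep.by = FALSE)

l_by_hour[["1"]]  # rows at 01:00 and 13:00
```

Here a missing observation only shortens its own group, so the other 11 series keep their 12-hour spacing.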