Question

I would like to exclude the days on which x2 equals zero more than a predetermined number of times (e.g. more than 300 times during the same day):

library(xts)
set.seed(1)
tmp <- seq(as.POSIXct('2013-09-03 00:00:01'),
           as.POSIXct('2013-09-06 23:59:59'), by='min')
x1 <- rnorm(length(tmp))
x2 <- rnorm(length(tmp))
x2[1:400] <- 0

x <- xts(cbind(x1, x2), tmp)

I've found the .indexday function for subsetting within days, so one possibility is to write a for loop that subsets by day and counts the elements of x2 that are equal to zero. I'm sure there is a more efficient way of doing it, though.

The output would be the same object x without those days in which there are more than 300 cases with x2 == 0.


Solution 2

Here is a solution:

## split the xts object x into a list with one element per day
aa <- split(x, f="days")

## get indices of the days on which x2 == 0 at most 300 times
idx <- which(sapply(aa, function(xx) sum(xx[,"x2"] == 0)) <= 300)

idx
[1] 2 3 4

##make one xts object containing only the desired days
new.x <- do.call(rbind, aa[idx])

dim(x)
[1] 5760    2

dim(new.x)
[1] 4320    2
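
The split / count / rbind steps above can also be wrapped in a small function. This is only a sketch: drop_zero_days and its arguments col and max_zeros are hypothetical names introduced here, not part of xts, and the data are rebuilt in UTC (an assumption) so that the day boundaries do not depend on the local timezone.

```r
library(xts)

# Hypothetical helper (not part of xts): keep only the days on which
# column `col` equals zero at most `max_zeros` times.
drop_zero_days <- function(x, col = "x2", max_zeros = 300) {
  days <- split(x, f = "days")  # list with one xts object per day
  # count the zeros of `col` within each day
  n_zero <- sapply(days, function(d) sum(coredata(d)[, col] == 0))
  # stitch the retained days back into a single xts object
  do.call(rbind, days[n_zero <= max_zeros])
}

# Rebuild the question's data, with the timezone fixed to UTC (assumption)
set.seed(1)
tmp <- seq(as.POSIXct("2013-09-03 00:00:01", tz = "UTC"),
           as.POSIXct("2013-09-06 23:59:59", tz = "UTC"), by = "min")
x1 <- rnorm(length(tmp))
x2 <- rnorm(length(tmp))
x2[1:400] <- 0
x <- xts(cbind(x1, x2), tmp, tzone = "UTC")

new.x <- drop_zero_days(x)
dim(new.x)
```

Note that if every day exceeds the threshold, do.call(rbind, list()) returns NULL rather than an empty xts object, so a caller may want to check for that case.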

OTHER TIPS

Whatever solution you use, you need to be careful with timezones when converting from POSIXt to Date. Here's a solution using ave:

> x <- xts(cbind(x1, x2), tmp, tzone="UTC")
> y <- x[ave(x$x2==0, as.Date(index(x)), FUN=sum) <= 300,]
> head(y)
                            x1         x2
2013-09-04 00:00:01  0.6855122  0.8171146
2013-09-04 00:01:01  0.3895035  0.1818066
2013-09-04 00:02:01 -1.3053959  1.2532384
2013-09-04 00:03:01  1.2168880  0.6069871
2013-09-04 00:04:01  0.7951740  0.2825354
2013-09-04 00:05:01 -0.4882025 -0.3089424
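
To illustrate the timezone caveat mentioned above, here is a small base R sketch. The timestamp and the Europe/Berlin zone are arbitrary choices made for this example, not taken from the question. The same instant falls on different calendar dates depending on the timezone used for the conversion, which is why the days you group by can shift if the index timezone and the as.Date timezone disagree.

```r
# A timestamp shortly after local midnight in a zone ahead of UTC
# (illustrative value, not from the question)
t <- as.POSIXct("2013-09-04 00:30:00", tz = "Europe/Berlin")

# Passing tz explicitly makes the conversion unambiguous
d_utc   <- as.Date(t, tz = "UTC")            # date of this instant in UTC
d_local <- as.Date(t, tz = "Europe/Berlin")  # date in the timestamp's own zone

format(d_utc)    # "2013-09-03": the instant is 22:30 UTC on the previous day
format(d_local)  # "2013-09-04"
```

Setting tzone="UTC" on the xts object, as in the ave solution, keeps the index and the Date conversion in the same timezone.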
Licensed under: CC-BY-SA with attribution