Question

I have a data.frame as follows:

Dat1 <- data.frame(dateTime = as.POSIXct(c("2012-05-03 00:00","2012-05-03 02:00",
                                           "2012-05-03 02:30","2012-05-03 05:00",
                                           "2012-05-03 07:00","2012-05-04 07:00"), 
                                         tz = 'UTC'),x1 = rnorm(6))

giving:

> Dat1
             dateTime         x1
1 2012-05-03 00:00:00 -0.3529501
2 2012-05-03 02:00:00  1.9086742
3 2012-05-03 02:30:00 -0.4707939
4 2012-05-03 05:00:00 -1.7001035
5 2012-05-03 07:00:00 -1.3389383
6 2012-05-04 07:00:00  0.6985237

I would like to reduce this data.frame so that it contains only the rows from days that have more than n data points. For example, with n = 2, Dat1 should reduce to:

> Dat1
             dateTime         x1
1 2012-05-03 00:00:00 -0.3529501
2 2012-05-03 02:00:00  1.9086742
3 2012-05-03 02:30:00 -0.4707939
4 2012-05-03 05:00:00 -1.7001035
5 2012-05-03 07:00:00 -1.3389383

I would like this to work for a data.frame with any number of columns, i.e. not just for this example.


Solution

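A straightforward approach is to use as.Date to extract the calendar day from each timestamp and table to count the data points per day. Note that the code below keeps days with at least n points (>=); use > instead if you need strictly more than n. The manual solution might look like: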

n <- 2
Dat1[as.character(as.Date(Dat1$dateTime)) %in% 
       names(which(table(as.Date(Dat1$dateTime)) >= n)), ]
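The same filter can also be written step by step, which may be easier to read and adapt. The following is only a sketch: it uses fixed x1 values instead of rnorm so the result is reproducible, swaps table for base R's ave to get a per-row count, and passes tz = "UTC" to as.Date explicitly on the assumption that days should be split at UTC midnight (the day boundary depends on the time zone). The names day, per_row_count and kept are made up for illustration.

```r
# Example data shaped like Dat1 from the question (fixed values, not rnorm)
Dat1 <- data.frame(
  dateTime = as.POSIXct(c("2012-05-03 00:00", "2012-05-03 02:00",
                          "2012-05-03 02:30", "2012-05-03 05:00",
                          "2012-05-03 07:00", "2012-05-04 07:00"),
                        tz = "UTC"),
  x1 = c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6)
)

n <- 2

# Calendar day of each row, used as the grouping key
day <- as.character(as.Date(Dat1$dateTime, tz = "UTC"))

# For every row, the number of rows that share its day
per_row_count <- ave(seq_along(day), day, FUN = length)
# 5 5 5 5 5 1

# Keep rows whose day has at least n rows (use > for "strictly more than n")
kept <- Dat1[per_row_count >= n, ]
```

Because the count is aligned with the rows, the final subsetting step is a plain logical index and does not need the %in% lookup.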

If this is something you do often, or you want to vary the parameters, you can wrap the same logic in a small function:

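DateThreshold <- function(input, datevar, threshold) {
  # Calendar day of each row, as a character key
  datevar <- as.character(as.Date(input[[datevar]]))
  # Days that have at least `threshold` rows
  datevar.tab <- names(which(table(datevar) >= threshold))
  # drop = FALSE keeps a data.frame even if input has a single column
  input[datevar %in% datevar.tab, , drop = FALSE]
}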

Usage with your example data would be like this (the x1 values differ from those in the question because rnorm was called without a fixed seed):

DateThreshold(Dat1, "dateTime", 2)
#              dateTime          x1
# 1 2012-05-03 00:00:00 -0.36532709
# 2 2012-05-03 02:00:00 -0.52474466
# 3 2012-05-03 02:30:00 -0.06044233
# 4 2012-05-03 05:00:00  0.51963463
# 5 2012-05-03 07:00:00 -0.34407808
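
Since the question asks for something that works with any number of columns, here is a small check on a wider data.frame. This is a sketch under assumptions: Dat2 and its columns x2 and site are hypothetical, and the function is repeated so the block runs on its own, with drop = FALSE included so that a one-column input still comes back as a data.frame.

```r
# Copy of the function so this block is self-contained
DateThreshold <- function(input, datevar, threshold) {
  datevar <- as.character(as.Date(input[[datevar]]))
  datevar.tab <- names(which(table(datevar) >= threshold))
  # drop = FALSE: keep a data.frame even when input has one column
  input[datevar %in% datevar.tab, , drop = FALSE]
}

# Hypothetical data with extra columns of different types
Dat2 <- data.frame(
  dateTime = as.POSIXct(c("2012-05-03 00:00", "2012-05-03 02:00",
                          "2012-05-03 02:30", "2012-05-03 05:00",
                          "2012-05-03 07:00", "2012-05-04 07:00"),
                        tz = "UTC"),
  x1 = 1:6,
  x2 = c(2.5, 3.5, 4.5, 5.5, 6.5, 7.5),
  site = c("a", "a", "b", "b", "a", "b")
)

# All columns are carried through; only rows are filtered
res <- DateThreshold(Dat2, "dateTime", 2)

# A one-column input also returns a data.frame rather than a bare vector
one <- DateThreshold(Dat2["dateTime"], "dateTime", 2)
```

The function only reads the column named in datevar, so the remaining columns can be of any type and number.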
Licensed under: CC-BY-SA with attribution