Question

Could you help me figure out why the following doesn't work? I have a 2528x3 matrix uniqueitems which looks like this:

Number        Created               Customer
===========   ===================   ============
31464686486   2013-10-25 10:00:00   john@john.de
...

What I'd like to do: Go through every row, check if Created is more recent than a given time and, if so, write the row into a new table newerthantable. Here's my code:

library(lubridate);
newerthan <- function(x) {
  times <- ymd_hms(uniqueitems[,2])
  newerthantable <- matrix(data=NA,ncol=3,nrow=1)
  i <- 1;
  while (i <= nrow(uniqueitems)) {
    if (x < times[i]) {
      newerthantable <- rbind(newerthantable,uniqueitems[i,])
    }
    i <- i + 1;
  }
}

But newerthan("2013-10-24 14:00:00") doesn't have the desired effect :(, nothing is written in newerthantable. Why?


Solution

In R, explicit loops are rarely needed. You can get the same result with vectorized operations or, as in this case, with subsetting.
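As for the "Why?" in the question: a likely cause is that newerthantable is created inside newerthan and never returned, so it is gone once the function exits (the while loop, being the last expression, evaluates to NULL). Below is a minimal sketch of a version that returns its result, assuming a character matrix shaped like uniqueitems. The second row is made up for illustration, the extra items argument is my own addition, and base as.POSIXct with tz = "UTC" stands in for lubridate's ymd_hms so the snippet needs no packages:

```r
# Stand-in for the asker's matrix; the second row is hypothetical.
uniqueitems <- matrix(
  c("31464686486", "2013-10-25 10:00:00", "john@john.de",
    "31464686487", "2013-10-23 09:00:00", "jane@jane.de"),
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, c("Number", "Created", "Customer"))
)

# Vectorized filter that returns the matching rows to the caller.
newerthan <- function(items, x) {
  times <- as.POSIXct(items[, 2], tz = "UTC")   # assumption: timestamps are UTC
  cutoff <- as.POSIXct(x, tz = "UTC")
  # drop = FALSE keeps the result a matrix even if only one row matches
  items[times > cutoff, , drop = FALSE]
}

# The result has to be assigned outside the function to be usable.
newerthantable <- newerthan(uniqueitems, "2013-10-24 14:00:00")
print(newerthantable)
```

The key difference from the original is the last expression of the function: it is the filtered matrix itself, so that is what the call returns.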

Set up a sample data frame:

number <- 1:10
created <- seq(as.POSIXct("2013-01-01 10:01"), length.out = 10, by = "26 hours")
customer <- letters[1:10]
df <- data.frame(number, created, customer)

head(df, 10)

   number             created customer
1       1 2013-01-01 10:01:00        a
2       2 2013-01-02 12:01:00        b
3       3 2013-01-03 14:01:00        c
4       4 2013-01-04 16:01:00        d
5       5 2013-01-05 18:01:00        e
6       6 2013-01-06 20:01:00        f
7       7 2013-01-07 22:01:00        g
8       8 2013-01-09 00:01:00        h
9       9 2013-01-10 02:01:00        i
10     10 2013-01-11 04:01:00        j

Select rows newer than a given date:

newerthantable <- df[df$created > as.POSIXct("2013-01-05 18:01:00"), ]

head(newerthantable,10)

   number             created customer
6       6 2013-01-06 20:01:00        f
7       7 2013-01-07 22:01:00        g
8       8 2013-01-09 00:01:00        h
9       9 2013-01-10 02:01:00        i
10     10 2013-01-11 04:01:00        j

The square brackets select the rows that match our criterion (created later than the given date) and all columns (nothing is specified after the comma). Read more about subsetting operations here: http://www.ats.ucla.edu/stat/r/modules/subsetting.htm
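To make the two halves of the bracket expression more concrete, here is a small sketch that splits the subsetting into steps. It rebuilds the sample data frame so it runs on its own. The names cutoff, is_newer, all_cols and some_cols are only illustrative, and tz = "UTC" is an assumption added to keep the timestamps independent of the local time zone:

```r
# Rebuild the sample data frame so the snippet is self-contained.
df <- data.frame(
  number   = 1:10,
  created  = seq(as.POSIXct("2013-01-01 10:01", tz = "UTC"),
                 length.out = 10, by = "26 hours"),
  customer = letters[1:10]
)

cutoff <- as.POSIXct("2013-01-05 18:01:00", tz = "UTC")

# Step 1: the comparison yields one TRUE/FALSE value per row.
is_newer <- df$created > cutoff

# Step 2: the logical vector goes before the comma (rows); leaving the
# part after the comma empty keeps every column.
all_cols <- df[is_newer, ]

# Naming columns after the comma restricts the result to those columns.
some_cols <- df[is_newer, c("number", "customer")]

print(all_cols)
print(some_cols)
```

Storing the logical vector in its own variable also makes it easy to inspect, for example with sum(is_newer) to count the matching rows.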

If you want to wrap this up as a function, it could look like this:

new_entries <- function(data, rows_since) {
  # Keep only the rows whose created timestamp is later than rows_since
  data[data$created > as.POSIXct(rows_since), ]
}

new_entries(df, "2013-01-05 18:01:00")
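
One caveat with logical subsetting: if some timestamps are missing (NA), the comparison is NA for those rows and the result contains all-NA rows. The following is a hedged variant, not part of the original answer. The name new_entries_safe and the time_col and tz parameters are hypothetical; it uses which() to keep only rows where the comparison is TRUE:

```r
# Hypothetical variant of new_entries that ignores rows with missing timestamps.
new_entries_safe <- function(data, rows_since, time_col = "created", tz = "UTC") {
  cutoff <- as.POSIXct(rows_since, tz = tz)
  keep <- data[[time_col]] > cutoff      # may contain NA
  # which() returns the indices of TRUE values only, dropping FALSE and NA
  data[which(keep), , drop = FALSE]
}

# Small made-up data frame with one missing timestamp.
sample_df <- data.frame(
  number   = 1:3,
  created  = as.POSIXct(c("2013-01-01 10:01:00", NA, "2013-01-09 00:01:00"),
                        tz = "UTC"),
  customer = c("a", "b", "c")
)

result <- new_entries_safe(sample_df, "2013-01-05 18:01:00")
print(result)
```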
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow