Question

library("xts")
data1<- cbind(a = c(1,2,3,4,5,6,5,4,3,4,5,6,5,4,3,5),
              b = c(1,2,3,4,5,6,5,4,3,4,5,6,5,4,3,5),
              c = c(1,2,3,4,5,6,5,4,5,4,5,4,5,4,5,2),
              d = c(1,2,3,4,5,6,5,4,1,1,1,1,1,2,3,2))
data<- xts(data1, Sys.Date() - (16:1))

data

           a b c d
2013-07-09 1 1 1 1
2013-07-10 2 2 2 2
2013-07-11 3 3 3 3
2013-07-12 4 4 4 4
2013-07-13 5 5 5 5
2013-07-14 6 6 6 6
2013-07-15 5 5 5 5
2013-07-16 4 4 4 4
2013-07-17 5 3 5 1
2013-07-18 4 4 4 1
2013-07-19 5 5 5 1
2013-07-20 4 6 4 1
2013-07-21 5 5 5 1
2013-07-22 4 4 4 2
2013-07-23 3 3 5 3
2013-07-24 5 5 2 2

I have a data set which contains 100 such columns. I need a method, or a function I can define, that tells me how many of these columns are above their 5-day SMA (simple moving average) on a given day. If I give a specific date and the SMA length (here 5 days), I should get the number of columns above that SMA and, if possible, the column names too.


Solution

You can use which and then tabulate, order, etc.

all <- which(data>5, arr.ind=TRUE)
table(all[,"row"])
all[order(all[,"row"]),]
split(all, all[,"row"])
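To make the shape of the result concrete, here is a minimal sketch on a small made-up matrix (the names m and hits and the values are assumptions for illustration, not part of the question's data). It shows that which(..., arr.ind=TRUE) returns one row per matching cell, with "row" and "col" columns, and carries the row names along when the input has them.

```r
# Toy matrix: 3 "days" (rows) by 2 "series" (columns); values are made up
m <- matrix(c(1, 6, 7,
              2, 3, 8),
            nrow = 3,
            dimnames = list(c("d1", "d2", "d3"), c("a", "b")))

# One row per cell that satisfies the condition, as (row, col) index pairs
hits <- which(m > 5, arr.ind = TRUE)
hits

# Number of matching columns per row index
table(hits[, "row"])

# Column names of the matches, grouped by row name
split(colnames(m)[hits[, "col"]], rownames(hits))
```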

EDIT: For the rolling mean, calculate the rolling mean first and then proceed as described above.

sra <- apply(data, 2, rollmean, k=5)
all <- which(sra>5, arr.ind=TRUE)

EDIT2: You can also get the dates, if you use rownames(all).

table(rownames(all))
split(all, rownames(all))

EDIT3: Apparently I misunderstood the question. The problem with the names comes from the apply function. If you use lapply instead, you get the desired rownames. You can then cbind the result with the data to get NAs for the first and last 2 days.

sra <- do.call(cbind, lapply(data, rollmean, k=5))
sra <- cbind(sra, data)[, 1:ncol(sra)]
all <- which(sra>data, arr.ind=TRUE)
all

EDIT4: Note that rollmean has an align argument. You apparently want a right-aligned window (the default is center), so that the mean on a given day uses only that day and the 4 days before it.

sra <- do.call(cbind, lapply(data, rollmean, k=5, align="right"))
sra <- cbind(sra, data)[, 1:ncol(sra)]
all <- which(sra>data, arr.ind=TRUE)
all
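As a small illustration of what the align argument changes, the sketch below runs rollmean on a short made-up zoo series (z and its values are assumptions for illustration). fill = NA is used here only so that both results keep the full length and can be compared position by position.

```r
library(zoo)

# Short made-up series, indexed 1..8
z <- zoo(c(1, 2, 3, 4, 5, 6, 5, 4))

# Default alignment: each mean is placed at the middle of its 5-value window
centered <- coredata(rollmean(z, k = 5, fill = NA))

# Right alignment: each mean is placed at the last value of its window,
# so position t only uses positions t-4 .. t
right <- coredata(rollmean(z, k = 5, fill = NA, align = "right"))

centered
right
```

With right alignment the first 4 positions are NA, because a full 5-value window is not yet available there.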

EDIT 5: If sra is of class xts, it does not have rownames, and consequently neither does the matrix all. You can use as.matrix(sra) to get the rownames back. The final line is there in case you want the names of the columns instead of their numbers.

sra <- do.call(cbind, lapply(data, rollmean, k=5, align="right"))
sra <- as.matrix(cbind(sra, data)[, 1:ncol(sra)])
all <- which(sra>data, arr.ind=TRUE)
table(rownames(all))
split(all[,"col"], rownames(all))
lapply(split(all[,"col"], rownames(all)), function(x) colnames(data)[x])

EDIT 6: To look at one particular date, save the final list, specify the date, and then extract that date's entry from the list. For instance:

lst <- lapply(split(all[,"col"], rownames(all)), function(x) colnames(data)[x])
dat <- "2013-07-23"
lst[dat]
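
Since the question asks for a function that takes a date and an SMA length, the steps can also be wrapped up roughly as follows. This is only a sketch under stated assumptions: cols_above_sma and prices are hypothetical names, the dates are fixed so the result is reproducible, and "above the SMA" is read as the day's value being greater than its own right-aligned moving average (flip the comparison if you want the opposite).

```r
library(xts)  # also attaches zoo, which provides rollmean()

# Hypothetical helper: names of the columns whose value on `date`
# is greater than their own right-aligned k-day moving average
cols_above_sma <- function(x, date, k = 5) {
  # fill = NA keeps the first k-1 rows, so `sma` lines up row by row with `x`
  sma <- rollmean(x, k = k, fill = NA, align = "right")
  i <- which(index(x) == as.Date(date))
  if (length(i) != 1) stop("date not found in the index")
  above <- coredata(x)[i, ] > coredata(sma)[i, ]
  colnames(x)[which(above)]  # which() drops the NAs from incomplete windows
}

# Same values as data1 in the question, with fixed dates (an assumption)
prices <- xts(cbind(a = c(1,2,3,4,5,6,5,4,3,4,5,6,5,4,3,5),
                    b = c(1,2,3,4,5,6,5,4,3,4,5,6,5,4,3,5),
                    c = c(1,2,3,4,5,6,5,4,5,4,5,4,5,4,5,2),
                    d = c(1,2,3,4,5,6,5,4,1,1,1,1,1,2,3,2)),
              as.Date("2013-07-09") + 0:15)

res <- cols_above_sma(prices, "2013-07-23", k = 5)
res          # column names above the 5-day SMA on that date
length(res)  # how many there are
```

For a date inside the first k-1 rows there is no complete window yet, so the function returns an empty character vector.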
Licensed under: CC-BY-SA with attribution