Is there any way to force the zoo::rollmean function to return a vector that is the same length as its input? (Or maybe use another function?)

StackOverflow https://stackoverflow.com/questions/4422363

  •  09-10-2019

Question

input = cbind(c(3,7,3,5,2,9,1,4,6,4,7,3,7,4))
library(zoo)
output = cbind(rollmean(input,4))
print(input)
print(output)

output:

      [,1]
 [1,]    3
 [2,]    7
 [3,]    3
 [4,]    5
 [5,]    2
 [6,]    9
 [7,]    1
 [8,]    4
 [9,]    6
[10,]    4
[11,]    7
[12,]    3
[13,]    7
[14,]    4
      [,1]
 [1,] 4.50
 [2,] 4.25
 [3,] 4.75
 [4,] 4.25
 [5,] 4.00
 [6,] 5.00
 [7,] 3.75
 [8,] 5.25
 [9,] 5.00
[10,] 5.25
[11,] 5.25

But when I try to cbind input and output:

Error in cbind(input, output) :
  number of rows of matrices must match (see arg 2)
Calls: print -> cbind
Execution halted

I'd like to use a function that is smart enough not to give up when it doesn't have data on both ends of the vector, and that instead calculates the output from only the data it has. For example, at input[1] it would calculate the mean only from the values to the right.


Solution

Look at the na.pad argument to rollmean() and set it to TRUE. I missed the last bit of the question at first; you also need to align the means to the right:

> input <- c(3,7,3,5,2,9,1,4,6,4,7,3,7,4)
> rollmean(input, 4, na.pad = TRUE, align = "right")
 [1]   NA   NA   NA 4.50 4.25 4.75 4.25 4.00 5.00 3.75 5.25 5.00 5.25 5.25

Unless you need these things as 1-column matrices, drop the cbind() calls.
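As a minimal sketch of how the padded result lines up with the input: in more recent zoo releases, na.pad is documented as deprecated in favour of the fill argument, so the example below assumes fill = NA is available in your installed version. The names output and combined are only illustrative.

```r
library(zoo)

input <- c(3, 7, 3, 5, 2, 9, 1, 4, 6, 4, 7, 3, 7, 4)

# fill = NA pads the result to the input length
# (assumed here to be the modern equivalent of na.pad = TRUE)
output <- rollmean(input, 4, fill = NA, align = "right")

# Both vectors now have 14 elements, so cbind() no longer complains
combined <- cbind(input, output)
print(combined)
```

The first three rows of the output column are NA because a right-aligned window of width 4 is not complete until the fourth observation.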

OK, from further clarifications it appears you want to compute some means that aren't really comparable to the other means in the result vector. But if you must...

> k <- 4
> c( cumsum(input[1:(k-1)]) / 1:(k-1), rollmean(input, k, align = "right") )
 [1] 3.000000 5.000000 4.333333 4.500000 4.250000 4.750000 4.250000 4.000000
 [9] 5.000000 3.750000 5.250000 5.000000 5.250000 5.250000
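The cumsum() trick above computes the leading means over windows that grow from 1 to k-1 observations. As a rough cross-check, and assuming your zoo version provides rollapplyr() with the partial argument, the same vector should come out of a right-aligned rollapply with partial windows. The names manual and partial_ma are illustrative only.

```r
library(zoo)

input <- c(3, 7, 3, 5, 2, 9, 1, 4, 6, 4, 7, 3, 7, 4)
k <- 4

# Hand-built version: growing-window means for the first k-1 points,
# followed by the ordinary right-aligned rolling mean
manual <- c(cumsum(input[1:(k - 1)]) / 1:(k - 1),
            rollmean(input, k, align = "right"))

# partial = TRUE passes only the in-range part of each window to mean()
partial_ma <- rollapplyr(input, width = k, FUN = mean, partial = TRUE)

# Compare the two approaches element by element
print(all.equal(manual, partial_ma))
```

Either way, the first k-1 values are averages over fewer observations than the rest, which is why they are not strictly comparable to the other means.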

As the OP is interested in estimating the MA to then fit a spline to it, it might be instructive to see what one gains by doing this instead of estimating the spline directly from the data.

> ## model observed data
> mod <- smooth.spline(seq_along(input), input, df = 3)
> ## plot data and fitted spline
> plot(seq_along(input), input)
> lines(predict(mod, seq_along(input)), col = "red", lwd = 2)
> ## model the fudged MA
> mod2 <- smooth.spline(seq_along(input),
+                       c( cumsum(input[1:(k-1)]) / 1:(k-1),
+                         rollmean(input, k, align = "right") ), df = 3)
> ## add this estimated spline
> lines(predict(mod2, seq_along(input)), col = "blue", lwd = 2)

You'd be hard pushed to tell the difference between these two:

[Figure: Comparison of direct smooth and smooth of MA]

and the curves deviate most at the beginning where you are forcing estimation of the MA.
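If the plot is not available, one way to inspect the difference numerically is to compare the two fitted curves point by point. This is only a sketch: idx, ma, and gap are illustrative names, and the fits reuse the df = 3 setting from the code above.

```r
library(zoo)

input <- c(3, 7, 3, 5, 2, 9, 1, 4, 6, 4, 7, 3, 7, 4)
k <- 4
idx <- seq_along(input)

# The "fudged" moving average: growing windows first, then full windows
ma <- c(cumsum(input[1:(k - 1)]) / 1:(k - 1),
        rollmean(input, k, align = "right"))

# Spline fitted to the raw data and spline fitted to the moving average
mod  <- smooth.spline(idx, input, df = 3)
mod2 <- smooth.spline(idx, ma, df = 3)

# Absolute difference between the two fitted curves at each observation
gap <- abs(predict(mod, idx)$y - predict(mod2, idx)$y)
print(round(gap, 3))
print(which.max(gap))  # index where the two curves differ most
```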

Other tips

Although this is an old question, I hope this helps anyone reading it now.

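Using rollapply with FUN = mean and partial = TRUE keeps the values at the edges, where the full window is not available: the mean is computed from whatever part of the window falls inside the data, so the result has the same length as the input.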

x <- rollapply(input, width = 5, FUN = mean, align = "center", partial = TRUE)

??rollapply 
??rollapplyr # for right aligned moving average
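To make the effect of partial = TRUE concrete, here is a small sketch that recomputes the centred, width-5 rolling mean by hand, clipping each window to the valid index range. The helper variables half and manual are illustrative, and the hand computation assumes an odd window width.

```r
library(zoo)

input <- c(3, 7, 3, 5, 2, 9, 1, 4, 6, 4, 7, 3, 7, 4)
width <- 5
half <- (width - 1) / 2  # number of neighbours on each side of the centre

# Centred rolling mean with partial windows at both ends
x <- rollapply(input, width = width, FUN = mean,
               align = "center", partial = TRUE)

# Hand computation: clip each window to the range 1..length(input)
manual <- sapply(seq_along(input), function(i) {
  lo <- max(1, i - half)
  hi <- min(length(input), i + half)
  mean(input[lo:hi])
})

print(length(x) == length(input))
print(all.equal(as.numeric(x), manual))
```

With a centred window the shortened windows occur at both ends of the series, whereas rollapplyr only shortens windows at the start.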

You would really benefit from reading the documentation. See ?rollmean, specifically the na.pad and align arguments.

So far the question has been seen as ambiguous by three experienced R coders, but it seems that you do want some sort of extrapolated value for the missing means. Whether you wanted the imputed values at the beginning or the end remains unclear. This code returns a right-aligned vector and replaces the leading NAs with the first non-NA value. There is also the na.locf function in zoo if you want to work with left-aligned rollmeans.

long.roll <- function(input, k) {
  rtroll <- rollmean(input, k, align = "right", na.pad = TRUE)
  # replace the k-1 leading NAs with the first non-NA mean
  c(rep(rtroll[k], k - 1), rtroll[-(1:(k - 1))])
}
long.roll(input, 4)
#  [1] 4.50 4.50 4.50 4.50 4.25 4.75 4.25 4.00 5.00 3.75 5.25 5.00 5.25
# [14] 5.25
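
The na.locf route mentioned above could look roughly like the following. This is a sketch that assumes fill = NA is supported by your zoo version; lt, rt, and the *_filled names are illustrative.

```r
library(zoo)

input <- c(3, 7, 3, 5, 2, 9, 1, 4, 6, 4, 7, 3, 7, 4)

# Left-aligned means leave the NAs at the end;
# na.locf carries the last observed mean forward into them
lt <- rollmean(input, 4, fill = NA, align = "left")
lt_filled <- na.locf(lt, na.rm = FALSE)

# Right-aligned means leave the NAs at the start;
# fromLast = TRUE carries the first observed mean backward instead
rt <- rollmean(input, 4, fill = NA, align = "right")
rt_filled <- na.locf(rt, fromLast = TRUE, na.rm = FALSE)

print(lt_filled)
print(rt_filled)
```

The right-aligned variant should reproduce what long.roll(input, 4) returns, without the manual indexing.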
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow