Question

I need to calculate a moving average and standard deviation over a moving window. This is simple enough with the caTools package!

... However, what I would like to do, having defined my moving window, is take the average of ONLY those values within the window whose corresponding values of other variables meet certain criteria. For example, I would like to calculate a moving Temperature average using only the values within the window (e.g. +/- 2 days) for which Relative Humidity is above 80%.

Could anybody help point me in the right direction? Here is some example data:

# byrow = FALSE fills column-wise: the first nine values become Temp, the last nine RH
da <- data.frame(matrix(c(12,15,12,13,8,20,18,19,20,80,79,91,92,70,94,80,80,90), 
               ncol = 2, byrow = FALSE))

names(da) = c("Temp", "RH") 

Thanks,

Brad


Solution

I haven't used caTools, but the help text for the (presumably) most relevant function in that package, ?runmean, says that x, the input data, can be either "a numeric vector [...] or matrix with n rows". In your case the matrix alternative is the relevant one: you wish to calculate the mean of a focal variable, Temp, conditional on a second variable, RH, so the function needs access to both variables. However, "[i]f x is a matrix than each column will be processed separately". Thus, I don't think caTools can solve your problem.

Instead, I would suggest rollapply in the zoo package. rollapply has the argument by.column, whose default is TRUE: "If TRUE, FUN is applied to each column separately". As explained above, the function needs access to both columns, so we set by.column to FALSE.

# Load zoo, which provides rollapply
library(zoo)

# First, specify a function to apply to each window: mean of Temp where RH > 80
meanfun <- function(x) mean(x[(x[ , "RH"] > 80), "Temp"])

# Apply the function to windows of size 3 in your data 'da'.
meanTemp <- rollapply(data = da, width = 3, FUN = meanfun, by.column = FALSE)
meanTemp

# If you want to add the means to 'da', 
# you need to make it the same length as number of rows in 'da'.
# This can be achieved by the `fill` argument,
# where we can pad the resulting vector of running means with NA
meanTemp <- rollapply(data = da, width = 3, FUN = meanfun, by.column = FALSE, fill = NA)

# Add the vector of means to the data frame
da2 <- cbind(da, meanTemp)
da2

# even smaller example to make it easier to see how the function works
da <- data.frame(Temp = 1:9, RH = rep(c(80, 81, 80), each = 3))
meanTemp <- rollapply(data = da, width = 3, FUN = meanfun, by.column = FALSE, fill = NA)
da2 <- cbind(da, meanTemp)
da2

#   Temp RH meanTemp
# 1    1 80       NA
# 2    2 80      NaN
# 3    3 80      4.0
# 4    4 81      4.5
# 5    5 81      5.0
# 6    6 81      5.5
# 7    7 80      6.0
# 8    8 80      NaN
# 9    9 80       NA
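
The question also asks for a standard deviation and a window of +/- 2 days, which the examples above do not cover. As a rough sketch of the same idea in base R (not zoo, and not necessarily how you would want to do it in production), the helper below loops over the rows, takes the rows within +/- k of each position, and summarises only those with RH above the threshold. The names cond_roll, k and threshold are made up for this illustration, and I am assuming one row per day with no gaps.

```r
# Hypothetical helper (not part of zoo or caTools): centred window of +/- k rows.
# Only rows inside the window with rh > threshold contribute to the summaries.
cond_roll <- function(temp, rh, k = 2, threshold = 80) {
  n <- length(temp)
  res <- t(sapply(seq_len(n), function(i) {
    idx  <- max(1, i - k):min(n, i + k)       # window is truncated at the edges
    vals <- temp[idx][rh[idx] > threshold]    # keep qualifying values only
    c(meanTemp = if (length(vals) > 0) mean(vals) else NA_real_,
      sdTemp   = if (length(vals) > 1) sd(vals)   else NA_real_)
  }))
  as.data.frame(res)
}

# Data from the question, with Temp and RH written out as explicit columns
da <- data.frame(Temp = c(12, 15, 12, 13, 8, 20, 18, 19, 20),
                 RH   = c(80, 79, 91, 92, 70, 94, 80, 80, 90))

out <- cbind(da, cond_roll(da$Temp, da$RH))
out
```

Two choices here differ from the rollapply calls above. The window is shortened at the start and end of the series instead of being padded with NA, and a window with no qualifying rows gives NA rather than NaN. A window with a single qualifying row gives a mean but an NA standard deviation. If you prefer to stay with zoo, a per-window function that returns both summaries can presumably be passed to rollapply with width = 5 and by.column = FALSE, but I have not checked that variant.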
Licensed under: CC-BY-SA with attribution