Question

My dataset looks like the following (let's call it "a"):

date value
2013-01-01 12.2
2013-01-02 NA
2013-01-03 NA
2013-01-04 16.8
2013-01-05 10.1
2013-01-06 NA
2013-01-07 12.0

I would like to replace each NA with the mean of its closest surrounding values (the previous and the next values in the series).

I tried the following, but I am not convinced by the output:

miss.val = which(is.na(a$value))
library(zoo)
z = zoo(a$value, a$date)
z.corr = na.approx(z)
z.corr[(miss.val - 1):(miss.val + 1), ]

Solution

Use na.locf (Last Observation Carried Forward) from the zoo package: fill forward, fill backward, and average the two results:

R> library("zoo")
R> x <- c(12.2, NA, NA, 16.8, 10.1, NA, 12.0)
R> (na.locf(x) + rev(na.locf(rev(x))))/2
[1] 12.20 14.50 14.50 16.80 10.10 11.05 12.00

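(This does not work if the first or last element of x is NA: na.locf drops leading NAs by default, so the forward-filled and backward-filled vectors no longer have the same length.)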
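
If the series can start or end with NA, one option is to look up the nearest non-missing neighbour on each side explicitly. The following is a minimal base R sketch, not part of zoo; the function name fill_neighbour_mean is made up for illustration, and it assumes that an NA at either edge should simply take the single neighbour that exists:

```r
# Fill each NA with the mean of the nearest non-NA value on either side.
# At the edges, where only one neighbour exists, that neighbour is used alone.
fill_neighbour_mean <- function(x) {
  ok <- which(!is.na(x))
  if (length(ok) == 0L) return(x)  # nothing to fill from
  out <- x
  for (i in which(is.na(x))) {
    prev_idx <- ok[ok < i]
    next_idx <- ok[ok > i]
    prev_val <- if (length(prev_idx)) x[max(prev_idx)] else NA_real_
    next_val <- if (length(next_idx)) x[min(next_idx)] else NA_real_
    out[i] <- mean(c(prev_val, next_val), na.rm = TRUE)
  }
  out
}

x <- c(NA, 12.2, NA, NA, 16.8, 10.1, NA, 12.0, NA)
fill_neighbour_mean(x)
```

For the original x (no NA at either end) this gives the same result as the na.locf expression above.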

OTHER TIPS

You can do exactly this in one line of code with the moving average function na_ma of the imputeTS package (called na.ma in older versions of the package):

library(imputeTS)
na_ma(yourData, k = 1)

This replaces each missing value with a moving average of the closest surrounding values (pass weighting = "simple" if you want a plain unweighted mean). You can also set additional parameters:

na_ma(yourData, k = 2, weighting = "simple")

In this case the algorithm takes the next 2 values in each direction. You can also choose a different weighting of the values (you might want closer values to have more influence).
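
To see what distance-based weighting does to the result, here is a small base R sketch. It is only an illustration of the idea and not imputeTS's implementation: the helper weighted_neighbour_mean is hypothetical, and the weights 1/2^distance are an assumption chosen to make closer values count more.

```r
# Weighted mean of the non-NA values within k positions of index i.
# Weights halve with each extra step of distance (an assumption for illustration).
weighted_neighbour_mean <- function(x, i, k = 2) {
  idx <- setdiff(max(1, i - k):min(length(x), i + k), i)
  idx <- idx[!is.na(x[idx])]      # keep only observed neighbours
  w <- 1 / 2^abs(idx - i)         # distance 1 -> 1/2, distance 2 -> 1/4, ...
  sum(w * x[idx]) / sum(w)
}

x <- c(12.2, NA, NA, 16.8, 10.1, NA, 12.0)
weighted_neighbour_mean(x, 2)  # 12.2 (distance 1) outweighs 16.8 (distance 2)
```

With equal weights the same window would give 14.5; with these weights the result is pulled toward the nearer value 12.2.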

Licensed under: CC-BY-SA with attribution