identify start and end terms of NA sections

https://stackoverflow.com/questions/7975042

19-02-2021

Question

Let's say I have some data in R that looks like this:

c(0.11, NA, NA, NA, 2.76, 3.65, NA, NA, NA, NA, 1.56)

How might I efficiently extract the start and end terms of each "block" of NA values? If the result were a data frame, I would want it to look something like this:

  first.na last.na
1        2       4
2        7      10

I'm trying to train myself to avoid for loops since I'll be doing this type of operation on very large datasets (on the order of 1e9 terms), and na.omit isn't quite helpful.

Solution

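There may be a built-in function for this, but you can do it by taking diff() of the is.na() indicator. Padding both ends with a non-NA value means a block of NAs that touches the start or end of the vector is still detected. In the result, a value of 1 marks the first NA of a block and a value of -1 marks the position just after the last NA: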

> z <- c(0.11, NA, NA, NA, 2.76, 3.65, NA, NA, NA, NA, 6)

> z2 <- diff(is.na(c(0, z, 0)))
> data.frame(first.na = which(z2 == 1), last.na = which(z2 == -1)-1)
  first.na last.na
1        2       4
2        7      10
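
As an alternative sketch (not part of the original answer), the same table can be built from base R's rle(), which run-length encodes the is.na() indicator. The helper name na_runs is hypothetical and chosen only for illustration; the column names follow the question.

```r
# Hypothetical helper (name not from the original answer):
# returns one row per consecutive block of NA values in x.
na_runs <- function(x) {
  r <- rle(is.na(x))            # runs of TRUE (NA) and FALSE (non-NA)
  ends <- cumsum(r$lengths)     # index of the last element of each run
  starts <- ends - r$lengths + 1L  # index of the first element of each run
  # keep only the runs whose value is TRUE, i.e. the NA blocks
  data.frame(first.na = starts[r$values], last.na = ends[r$values])
}

z <- c(0.11, NA, NA, NA, 2.76, 3.65, NA, NA, NA, NA, 6)
res <- na_runs(z)
print(res)
```

Like the diff() version, this is fully vectorized and needs no explicit for loop. Both approaches allocate a logical vector as long as the input, which is worth keeping in mind for very large data.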

Licensed under: CC-BY-SA with attribution
