Question

Let's say I have some data in R that looks like this:

c(0.11, NA, NA, NA, 2.76, 3.65, NA, NA, NA, NA, 1.56)

How might I efficiently extract the start and end indices of each "block" of NA values? If the result were a data frame, I would want it to look something like this:

  first.na last.na
1        2       4
2        7      10

I'm trying to train myself to avoid for loops since I'll be doing this type of operation on very large datasets (on the order of 1e9 terms), and na.omit isn't quite helpful.

Solution

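There may be a built-in function for this, but you can do it with diff() on the NA indicator. Padding the vector with a non-NA value at each end means a block that touches the start or end is still detected; the difference is then 1 where a block begins and -1 just after it ends: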

> z <- c(0.11, NA, NA, NA, 2.76, 3.65, NA, NA, NA, NA, 6)

> z2 <- diff(is.na(c(0, z, 0)))
> data.frame(first.na = which(z2 == 1), last.na = which(z2 == -1)-1)
  first.na last.na
1        2       4
2        7      10
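
For reuse, the same idea can be wrapped in a small function. This is only a sketch: the name na_blocks is made up for illustration, and it pads the logical vector from is.na() instead of the data itself, which should avoid copying the full numeric vector when it is very large.

```r
# Hypothetical helper (the name is not from any package); same diff() idea as above.
na_blocks <- function(x) {
  # Pad the NA indicator with FALSE on both sides so that blocks
  # touching the first or last element still produce a +1 / -1 step.
  d <- diff(c(FALSE, is.na(x), FALSE))
  data.frame(first.na = which(d == 1L),        # +1: a block starts here
             last.na  = which(d == -1L) - 1L)  # -1: a block ended one position earlier
}

z <- c(0.11, NA, NA, NA, 2.76, 3.65, NA, NA, NA, NA, 1.56)
na_blocks(z)                     # blocks at 2-4 and 7-10
na_blocks(c(NA, NA, 1, 2, NA))   # blocks at both ends: 1-2 and 5-5
na_blocks(c(1, 2, 3))            # no NA values: zero-row data frame
```

A vector with no NA values gives a data frame with zero rows, so callers can check nrow() before using the result.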
Licensed under: CC-BY-SA with attribution