Question

Suppose a data.frame contains NA values:

> a <- c(1:5, NA, 7:10)
> b <- 1:10
> c <- 1:10
> 
> data <- data.frame(a,b,c)
> data
    a  b  c
1   1  1  1
2   2  2  2
3   3  3  3
4   4  4  4
5   5  5  5
6  NA  6  6
7   7  7  7
8   8  8  8
9   9  9  9
10 10 10 10
> data <- data.frame(a,b,c)
> data.frame(t(apply(data,1,cumsum)))
    a  b  c
1   1  2  3
2   2  4  6
3   3  6  9
4   4  8 12
5   5 10 15
6  NA NA NA
7   7 14 21
8   8 16 24
9   9 18 27
10 10 20 30

My desired result is

    a  b  c
1   1  2  3
2   2  4  6
3   3  6  9
4   4  8 12
5   5 10 15
6   0  6 12
7   7 14 21
8   8 16 24
9   9 18 27
10 10 20 30

or

    a  b  c
1   1  2  3
2   2  4  6
3   3  6  9
4   4  8 12
5   5 10 15
6  NA  6 12
7   7 14 21
8   8 16 24
9   9 18 27
10 10 20 30

I am not sure apply(..., cumsum) is the best option; alternative methods are welcome.


Solution 2

Given your desired result (where you don't mind NA becoming 0), I guess the easiest thing is to first replace the NA values with 0, using is.na to locate them, and then carry on as before.

data[ is.na(data) ] <- 0
data.frame(t(apply(data,1,cumsum)))
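
If you would rather get the second desired result (NA kept where the input was NA), one possible variation is to do the replacement on a copy and put the NA back afterwards. This is only a sketch; the names `filled` and `res` are made up for illustration.

```r
# Rebuild the example data from the question
a <- c(1:5, NA, 7:10)
b <- 1:10
c <- 1:10
data <- data.frame(a, b, c)

# Work on a copy so the original NA positions are still known
filled <- data
filled[is.na(filled)] <- 0

# Row-wise cumulative sums, as in the answer above
res <- data.frame(t(apply(filled, 1, cumsum)))

# Restore NA wherever the input itself was NA
res[is.na(data)] <- NA
res
```

Row 6 then reads NA, 6, 12, while `data` itself is left unchanged.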

OTHER TIPS

Simon's answer is definitely the simplest. I was surprised to learn a few things from this exercise:
1. cumsum doesn't have a na.rm argument
2. sum(NA, na.rm=TRUE) equals 0
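
Both points are easy to check at the console. A quick sketch (the vector `x` is just an illustrative example):

```r
x <- c(1, NA, 3)

# Once an NA is hit, every later cumulative sum is NA as well
cumsum(x)

# The only formal argument of cumsum is "x", so there is no na.rm to set
names(formals(args(cumsum)))

# With the NA removed there is nothing left to add, so the sum is 0
sum(NA, na.rm = TRUE)
```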

Here is the code that brought me to the same solution:

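cumsum.alt <- function(x){
    # pre-allocate the result; every element is overwritten in the loop
    res <- rep(NA_real_, length(x))
    # seq_along() is safe for any length (seq(x) misbehaves when length(x) == 1)
    for(i in seq_along(x)){
        # running total up to position i, ignoring NA
        res[i] <- sum(x[1:i], na.rm=TRUE)
    }
    res
}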

t(apply(data, 1, cumsum.alt))
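
The loop recomputes a sum for every prefix of the row. As far as I can tell, the same numbers come out of a vectorized version that zeroes the NA values and then calls the built-in cumsum once. A minimal sketch, where `cumsum_na_zero` is a hypothetical helper name:

```r
# Hypothetical helper: cumulative sum that counts NA as 0
cumsum_na_zero <- function(x) {
  x[is.na(x)] <- 0   # missing values contribute nothing to the total
  cumsum(x)
}

# Example data from the question
data <- data.frame(a = c(1:5, NA, 7:10), b = 1:10, c = 1:10)

res <- t(apply(data, 1, cumsum_na_zero))
res[6, ]   # the row that contained the NA
```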

To keep the NAs in the result, a slight modification can be used:

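cumsum.alt <- function(x){
    res <- rep(NA_real_, length(x))
    for(i in seq_along(x)){
        # if every value up to position i is NA, there is nothing to sum yet
        if(sum(is.na(x[1:i])) == i){
            res[i] <- NA
        } else {
            # otherwise: running total up to position i, ignoring NA
            res[i] <- sum(x[1:i], na.rm=TRUE)
        }
    }
    res
}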

t(apply(data, 1, cumsum.alt))
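
A vectorized sketch of a similar idea, assuming the rule you want is "stay NA until the first non-missing value has been seen, then skip any later NA". The name `cumsum_skip_na` is made up for illustration:

```r
# Hypothetical helper: leading NAs stay NA, later NAs are skipped
cumsum_skip_na <- function(x) {
  out <- cumsum(replace(x, is.na(x), 0))
  # cumsum(!is.na(x)) counts the non-missing values seen so far
  out[cumsum(!is.na(x)) == 0] <- NA
  out
}

cumsum_skip_na(c(NA, 6, 6))   # NA  6 12
cumsum_skip_na(c(1, NA, 3))   #  1  1  4
```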
Licensed under: CC-BY-SA with attribution