Problem

I have the following dataframe dat, which presents a row-specific number of NAs at the beginning of some of its rows:

dat <- as.data.frame(rbind(c(NA,NA,1,3,5,NA,NA,NA), c(NA,1:3,6:8,NA), c(1,NA,2:6,NA)))
dat

#  V1 V2 V3 V4 V5 V6 V7 V8
#  NA NA  1  3  5 NA NA NA
#  NA  1  2  3  6  7  8 NA
#   1 NA  2  3  4  5  6 NA

My aim is to delete all the NAs at the beginning of each row and to left shift the row values (adding NAs at the end of the shifted rows accordingly, in order to keep their length constant).

The following code works as expected:

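for (i in seq_len(nrow(dat))) {
    if (is.na(dat[i, 1])) {
        # keep everything from the first non-NA value to the end of the row
        dat1 <- dat[i, min(which(!is.na(dat[i, ]))):ncol(dat)]
        # pad with NAs so the row keeps its original length
        dat[i, ] <- data.frame(dat1, t(rep(NA, ncol(dat) - length(dat1))))
    }
}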

dat

returning:

#  V1 V2 V3 V4 V5 V6 V7 V8
#   1  3  5 NA NA NA NA NA
#   1  2  3  6  7  8 NA NA
#   1 NA  2  3  4  5  6 NA

I was wondering whether there is a more direct way to do this without a for-loop, ideally by using the tail function.

With respect to this last point, min(which(!is.na(dat[1,]))) returns 3, as expected. But tail(dat[1,], min(which(!is.na(dat[1,])))) returns the original row unchanged, and I don't understand why.

Thank you very much for any suggestion.

Solution

I don't think you can do this without a loop.

dat <- as.data.frame(rbind(c(NA,NA,1,3,5,NA,NA,NA), c(NA,1:3,6:8,NA), c(1:7,NA)))
dat[3,2] <- NA

#   V1 V2 V3 V4 V5 V6 V7 V8
# 1 NA NA  1  3  5 NA NA NA
# 2 NA  1  2  3  6  7  8 NA
# 3  1 NA  3  4  5  6  7 NA

t(apply(dat, 1, function(x) {
  if (is.na(x[1])) {
    y <- x[-seq_len(which.min(is.na(x))-1)]
    length(y) <- length(x)
    y
  } else x
}))

#     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
#[1,]    1    3    5   NA   NA   NA   NA   NA
#[2,]    1    2    3    6    7    8   NA   NA
#[3,]    1   NA    3    4    5    6    7   NA

Then turn the matrix into a data.frame if you must.
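
A minimal sketch of that last step, assuming the original column names should be kept. The helper name shift_left is hypothetical; its body is the function passed to apply() above.

```r
dat <- as.data.frame(rbind(c(NA, NA, 1, 3, 5, NA, NA, NA),
                           c(NA, 1:3, 6:8, NA),
                           c(1:7, NA)))
dat[3, 2] <- NA

# Hypothetical helper: drop the leading NAs of one row, pad the end with NAs.
shift_left <- function(x) {
  if (is.na(x[1])) {
    y <- x[-seq_len(which.min(is.na(x)) - 1)]
    length(y) <- length(x)
    y
  } else x
}

# apply() returns one column per input row, hence the transpose.
res <- as.data.frame(t(apply(dat, 1, shift_left)))
names(res) <- names(dat)   # make sure the original column names are kept
res
```

Note that apply() converts the data frame to a matrix first, so this only makes sense when all columns share one type.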

Other tips

If you just want all NAs pushed to the end of each row (including NAs in the middle of a row, which goes beyond what the question asks), you could try:

dat <- as.data.frame(rbind(c(NA,NA,1,3,5,NA,NA,NA), c(NA,1:3,6:8,NA), c(1:7,NA)))
dat[3,2] <- NA
dat

#   V1 V2 V3 V4 V5 V6 V7 V8
# 1 NA NA  1  3  5 NA NA NA
# 2 NA  1  2  3  6  7  8 NA
# 3  1 NA  3  4  5  6  7 NA

dat.new <- do.call(rbind, lapply(1:nrow(dat), function(x) t(matrix(dat[x, order(is.na(dat[x, ]))]))))
colnames(dat.new) <- colnames(dat)
dat.new

#      V1 V2 V3 V4 V5 V6 V7 V8
# [1,] 1  3  5  NA NA NA NA NA
# [2,] 1  2  3  6  7  8  NA NA
# [3,] 1  3  4  5  6  7  NA NA
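
A more compact sketch of the same idea, assuming every column is numeric so that apply() can treat the data frame as a matrix. order(is.na(x)) sorts FALSE before TRUE and leaves ties in their original order, so the non-NA values keep their sequence. The result here is a plain numeric matrix rather than a matrix of list cells. As with the code above, the NA in the middle of row 3 is moved to the end too.

```r
dat <- as.data.frame(rbind(c(NA, NA, 1, 3, 5, NA, NA, NA),
                           c(NA, 1:3, 6:8, NA),
                           c(1:7, NA)))
dat[3, 2] <- NA

# For each row, list the non-NA values first, then the NAs.
dat.new <- t(apply(dat, 1, function(x) x[order(is.na(x))]))
colnames(dat.new) <- colnames(dat)
dat.new
```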

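Here is an answer that uses the tail function. Applied to a data frame, tail() keeps the last n rows, not the last n columns, so tail(dat[1,], 3) returns the one-row data frame unchanged. Converting the row to a vector first makes tail() work on its elements: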

dat <- as.data.frame(rbind(c(NA,NA,1,3,5,NA,NA,NA), c(NA,1:3,6:8,NA), c(1:7,NA)))
dat

for (i in seq_len(nrow(dat))) {
    if (is.na(dat[i, 1])) {
        # drop the leading NAs of the row; as.integer() turns the one-row
        # data frame into a vector (use as.numeric() for non-integer data)
        dat1 <- tail(as.integer(dat[i, ]), -min(which(!is.na(dat[i, ])) - 1))

        # pad with trailing NAs to keep the row length constant
        # (i.e. conformable with 'dat')
        length(dat1) <- ncol(dat)

        dat[i, ] <- dat1
    }
}

dat
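
To illustrate the difference: tail() on a data frame counts rows, while on a plain vector it counts elements. The sketch below shows both, then applies the tail()-based shift to every row through apply() instead of an explicit for loop. The helper name drop_leading_na is hypothetical, and leaving all-NA rows untouched is an added assumption, not something the question specifies.

```r
dat <- as.data.frame(rbind(c(NA, NA, 1, 3, 5, NA, NA, NA),
                           c(NA, 1:3, 6:8, NA),
                           c(1:7, NA)))
dat[3, 2] <- NA

row1 <- dat[1, ]
nrow(tail(row1, 3))      # tail() on a data frame selects rows
tail(unlist(row1), 3)    # tail() on a vector selects the last 3 elements

# Hypothetical helper: drop leading NAs with tail(), then pad with NAs.
drop_leading_na <- function(x) {
  n <- length(x)
  if (all(is.na(x))) return(x)        # assumption: leave all-NA rows as they are
  first <- min(which(!is.na(x)))      # position of the first non-NA value
  if (first > 1) x <- tail(x, -(first - 1))
  length(x) <- n                      # pad the end with NAs
  x
}

res <- as.data.frame(t(apply(dat, 1, drop_leading_na)))
names(res) <- names(dat)
res
```

The guard on first > 1 matters because tail(x, 0) returns an empty vector rather than x.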

License: CC-BY-SA with attribution
Not affiliated with StackOverflow