How to include missing observations in R data frame with time dimension based on a priori info?

StackOverflow https://stackoverflow.com/questions/19677217

  •  01-07-2022

Question

I have an unbalanced panel data set that I want to manipulate in order to include some a priori information. To do that, I need to do some data manipulation conditional on the data frame's time dimension. The original data looks something like this:

            FIRM_ID  YEAR  CAP_START   CAP_END
OBS1        1        2000  CAP_S_2000  CAP_E_2000
OBS2        1        2001  CAP_S_2001  CAP_E_2001
OBS3        1        2002  NA          NA

I know that CAP_START in row OBS3 equals CAP_END of the previous year. How can I include this a priori information in my data?

Solution

If your data is already sorted by FIRM_ID and YEAR (and the columns are preferably character rather than factor, since factor levels can cause problems when the values are combined), then you could use something like this:

# Add the preceding row's CAP_END as a new column (the first row has no predecessor)
dt$prev_CAP_END <- c(NA, head(dt$CAP_END, -1))
# Wherever CAP_START is missing, fill it with the previous row's CAP_END
missing_start <- is.na(dt$CAP_START)
dt$CAP_START[missing_start] <- dt$prev_CAP_END[missing_start]
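
Note that shifting CAP_END by one row ignores FIRM_ID, so in a panel with several firms the first row of one firm would pick up CAP_END from the last row of the previous firm, and a gap in years would go unnoticed. Below is a minimal sketch of a per-firm variant in base R. It assumes numeric capital columns and the column names from the question; the toy values, the helper lag1, and the prev_YEAR column are made up for illustration.

```r
# Toy panel with two firms; the values are hypothetical, not from the question.
dt <- data.frame(
  FIRM_ID   = c(1, 1, 1, 2, 2),
  YEAR      = c(2000, 2001, 2002, 2001, 2002),
  CAP_START = c(10, 12, NA, NA, 7),
  CAP_END   = c(12, 15, NA, 7, 9)
)

# Sort so that "previous row" means "previous observation of the same firm".
dt <- dt[order(dt$FIRM_ID, dt$YEAR), ]

# Shift a vector down by one position; the first element becomes NA.
lag1 <- function(x) c(NA, head(x, -1))

# Lag CAP_END and YEAR within each firm (ave applies FUN per group).
dt$prev_CAP_END <- ave(dt$CAP_END, dt$FIRM_ID, FUN = lag1)
dt$prev_YEAR    <- ave(dt$YEAR, dt$FIRM_ID, FUN = lag1)

# Fill only where CAP_START is missing and the previous row is the prior year.
fill <- is.na(dt$CAP_START) & !is.na(dt$prev_YEAR) & dt$prev_YEAR == dt$YEAR - 1
dt$CAP_START[fill] <- dt$prev_CAP_END[fill]
```

With this toy data, firm 1's missing CAP_START for 2002 is taken from its 2001 CAP_END, while firm 2's first row stays NA because it has no earlier observation to borrow from.
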
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow