Question

So I just received a dataset wherein one column of the data frame is "cycled." This column is actually a cycle of years (in my case, 1984-2007). In another column, there are corresponding dollar amounts (actually, "funding levels") for each of those years. My job is to create a lag variable for these funding levels. But here is the trick: each time the year cycle starts over, a new "variable" has begun. Thus, the lag variable I am looking for is not simply a shift backward of the entire funding column. Instead, I need to create a funding lag for each sub-cycle of the data. To be more concrete, my data looks a little bit like this:

    X Y
    1 7
    2 8
    3 9
    1 4
    2 6
    3 5
    1 2
    2 4
    3 3

And I need it to look like this:

    X Y
    1 NA
    2 7
    3 8
    1 NA
    2 4
    3 6
    1 NA
    2 2
    3 4

How would I go about doing this? Thank you so much for your help!

-JMC


Solution

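This should work. (I often forget to name the FUN argument; because ave is defined as ave(x, ..., FUN = mean), an unnamed function is treated as another grouping variable, and ave then complains with a cryptic error message.)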

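    # Wrong: this groups by X itself rather than by cycle, and c(NA, x) is
    # one element longer than the group it replaces.
    # dfrm$Y <- ave( dfrm$Y, dfrm$X, FUN=function(x) c(NA, x) )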

Lacking a proper grouping factor to mark distinct categories of time sequences, I decided to cue off X==1:

    dfrm$Y <- ave( dfrm$Y, cumsum(dfrm$X==1), FUN=function(x) c(NA, x[-length(x)]) )

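
To see why the cumsum grouping works, here is a small sketch run on the sample data from the question. The names cycle, cycle_alt, and Y_lag are my own, chosen for illustration, and the lag goes into a separate column so it can be compared with the original Y. The diff-based variant is an alternative I am assuming would suit the real data, where a cycle starts at 1984 rather than 1; it only requires that X drops whenever a new cycle begins.

```r
# Sample data from the question: X restarts at 1 for each cycle.
dfrm <- data.frame(X = rep(1:3, times = 3),
                   Y = c(7, 8, 9, 4, 6, 5, 2, 4, 3))

# cumsum(X == 1) goes up by one at every restart, giving one id per cycle:
# 1 1 1 2 2 2 3 3 3
cycle <- cumsum(dfrm$X == 1)

# Alternative cue (an assumption, not from the answer): start a new cycle
# whenever X decreases. This does not depend on the first value being 1.
cycle_alt <- cumsum(c(TRUE, diff(dfrm$X) < 0))

# Within each cycle, drop the last value and put NA in front,
# which shifts Y down by one position.
dfrm$Y_lag <- ave(dfrm$Y, cycle, FUN = function(x) c(NA, x[-length(x)]))

print(dfrm)
```

Each cycle's first row gets NA in Y_lag, and every other row gets the previous row's Y from the same cycle, which matches the layout requested in the question.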
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow