Question

Title might be misleading.

I have a longitudinal dataset with a dummy variable (dummy1) indicating whether a condition is met in a certain year for a given category. I want this event to be taken into account for the next twenty years as well. Hence, I want to create a new dummy (dummy2) that takes the value 1 for the observation where dummy1 was 1 and for the 19 observations following it (example below).

example

I was trying to create a loop with lag operators, but have not got it to work so far.

Solution

Even code that failed might be close to a good solution. Not giving code that failed means that we can't explain your mistakes. Furthermore, questions focusing on how to use software to do something are widely considered marginal or off-topic on SO.

One approach is

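* record the year of each event (== 1 so that a missing dummy1 does not count as true)
bysort category (year) : gen previous = year if dummy1 == 1
* carry the most recent event year forward within each category
by category : replace previous = previous[_n-1] if missing(previous)
* 1 in the event year and the 19 years after it, 0 otherwise
gen byte dummy2 = (year - previous) < 20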

The trick here is to create a variable holding the last year in which the dummy (indicator) was 1. The trick within that, carrying the last nonmissing value forward, is spelled out in the Stata FAQ "How can I replace missing values with previous or following nonmissing values or within sequences?"
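To see the carry-forward step in isolation, here is a minimal sketch on a made-up toy panel. The data values are invented for illustration; only the variable names category, year, dummy1 and previous come from the question and the code above. It uses dummy1 == 1 rather than a bare dummy1, on the assumption that any missing values of dummy1 should not count as events.

```stata
* toy panel: two categories, four years each (values are invented)
clear
input str1 category int year byte dummy1
"a" 2000 0
"a" 2001 1
"a" 2002 0
"a" 2003 0
"b" 2000 0
"b" 2001 0
"b" 2002 1
"b" 2003 0
end

* year of the event, missing everywhere else
bysort category (year) : gen previous = year if dummy1 == 1

* replace works down the observations in order, so each filled-in value
* is available to the next observation: the event year cascades forward
by category : replace previous = previous[_n-1] if missing(previous)

list, sepby(category)
```

In category a, previous should be missing in 2000 and 2001 from then on; in category b, it should be missing in 2000 and 2001 and 2002 from then on. Because the replace is done under by category, nothing is carried across the boundary from a to b.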

Note that this works independently of

  1. whether the panel identifier is numeric (it could be string here, on the evidence given)

  2. whether you have tsset or xtset the data

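  3. what happens before the first event; for such years, previous is born missing and remains missing, so year - previous is missing and dummy2 is 0, because Stata treats numeric missing as larger than any number (however, in general, watch for problems with code at the ends of time series).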
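The three points above can be checked on simulated data. This is only a sketch under stated assumptions: the panel (a string identifier with categories a and b, years 1990 to 2019, events in 1995 and 2000) is invented, and the data are never tsset or xtset.

```stata
* simulated panel: string identifier, 30 yearly observations per category
clear
set obs 60
gen str1 category = cond(_n <= 30, "a", "b")
bysort category : gen int year = 1989 + _n

* invented events: 1995 for category a, 2000 for category b
gen byte dummy1 = (category == "a" & year == 1995) | (category == "b" & year == 2000)

* the approach from the answer, without tsset or xtset
bysort category (year) : gen previous = year if dummy1 == 1
by category : replace previous = previous[_n-1] if missing(previous)
gen byte dummy2 = (year - previous) < 20

* before the first event previous is missing, so dummy2 must be 0
assert dummy2 == 0 if missing(previous)

* the window is the event year plus the 19 years after it
assert dummy2 == inrange(year, 1995, 2014) if category == "a"
assert dummy2 == inrange(year, 2000, 2019) if category == "b"
```

If the asserts pass silently, dummy2 is 1 for exactly twenty years in each category and 0 elsewhere. Note that the window is defined by the difference in year, so it matches "the 19 observations following" only when the data are yearly with no gaps.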

Licensed under: CC-BY-SA with attribution