This is how my time-series, cross-sectional data is structured:
country year group change
Afghanistan 1980 1 0
Afghanistan 1981 1 0
Afghanistan 1982 1 1
Afghanistan 1983 1 0
Afghanistan 1984 1 0
Afghanistan 1985 1 1
Afghanistan 1986 1 0
Afghanistan 1987 1 2
Afghanistan 1988 1 0
Bhutan 1980 2 0
Bhutan 1981 2 0
Bhutan 1982 2 0
Bhutan 1983 2 0
Bhutan 1984 2 1
Bhutan 1985 2 0
Bhutan 1986 2 0
Bhutan 1987 2 0
Bhutan 1988 2 2
Chile 1980 3 0
The variable change is 1 if there was a positive change, 2 if there was a negative change, and 0 otherwise.
PROBLEM
I am struggling with creating two new variables:
(1) A variable called "trend"
In lay terms: within each group (country), trend = 1 from the first year in which change = 1 up to, but not including, the year in which change = 2; otherwise trend = 0.
(2) A variable called "time"
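This variable should count the years before and during a positive trend: negative values count down to the year before the first change = 1, positive values count up from 1 while trend = 1, and the value is 0 once the trend has ended or if the group has no positive trend at all.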
That is, in the end, the data set should look like:
country year group change trend time
Afghanistan 1980 1 0 0 -2
Afghanistan 1981 1 0 0 -1
Afghanistan 1982 1 1 1 1
Afghanistan 1983 1 0 1 2
Afghanistan 1984 1 0 1 3
Afghanistan 1985 1 1 1 4
Afghanistan 1986 1 0 1 5
Afghanistan 1987 1 2 0 0
Afghanistan 1988 1 0 0 0
Bhutan 1980 2 0 0 -4
Bhutan 1981 2 0 0 -3
Bhutan 1982 2 0 0 -2
Bhutan 1983 2 0 0 -1
Bhutan 1984 2 1 1 1
Bhutan 1985 2 0 1 2
Bhutan 1986 2 0 1 3
Bhutan 1987 2 0 1 4
Bhutan 1988 2 2 0 0
Chile 1980 3 0 0 0
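One possible way to express the trend rule in base R is sketched below. It rebuilds the example data shown above and uses ave() with cumsum(). It assumes the rows are already sorted by year within each group, and that a change of 2 ends the trend for the rest of that country's series (a later restart is not handled).

```r
# Rebuild the example data from the tables above
data <- data.frame(
  country = rep(c("Afghanistan", "Bhutan", "Chile"), times = c(9, 9, 1)),
  year    = c(1980:1988, 1980:1988, 1980),
  group   = rep(1:3, times = c(9, 9, 1)),
  change  = c(0, 0, 1, 0, 0, 1, 0, 2, 0,
              0, 0, 0, 0, 1, 0, 0, 0, 2,
              0)
)

# Within each group: trend is 1 once a change == 1 has been seen
# and as long as no change == 2 has been seen yet
data$trend <- ave(data$change, data$group, FUN = function(ch) {
  as.numeric(cumsum(ch == 1) > 0 & cumsum(ch == 2) == 0)
})

print(data)
```

The two cumsum() calls act as running flags: the first switches on at the first positive change, the second switches the trend off at the first negative change.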
I think one could use split() to separate the groups, e.g.
data$trend <- split(data$group, data$group) # separate by unique values
[...]
data$trend <- unsplit(data$trend, data$group) # make back into a vector
BUT: What would be the command between these two lines?
This line would generate a sequence:
data$trend <- lapply(data$trend, seq)
BUT: How can I limit it to the positive trend, i.e. data$trend == 1?
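A possible way to fill the step between split() and unsplit() is sketched here for the time variable. The helper time_from_change() is a hypothetical name, not an existing function. The sketch assumes rows sorted by year within each group, and that a change of 2 ends the trend for the rest of the series; it keeps the split pieces in a temporary list rather than assigning them to a column of the data frame.

```r
# Small example with the same group/change values as the data above
data <- data.frame(
  group  = rep(1:3, times = c(9, 9, 1)),
  change = c(0, 0, 1, 0, 0, 1, 0, 2, 0,
             0, 0, 0, 0, 1, 0, 0, 0, 2,
             0)
)

# Hypothetical helper: builds the time counter for ONE group's change vector
time_from_change <- function(ch) {
  started <- cumsum(ch == 1) > 0   # TRUE from the first positive change on
  ended   <- cumsum(ch == 2) > 0   # TRUE from the first negative change on
  out <- integer(length(ch))       # default is 0 everywhere
  if (!any(started)) return(out)   # no positive trend in this group
  on <- started & !ended           # years in which the trend is active
  n_before <- sum(!started)
  out[!started] <- -rev(seq_len(n_before))  # ..., -2, -1 before the trend
  out[on] <- seq_len(sum(on))               # 1, 2, 3, ... during the trend
  out
}

pieces <- split(data$change, data$group)    # one change vector per group
pieces <- lapply(pieces, time_from_change)  # apply the rule group by group
data$time <- unsplit(pieces, data$group)    # back into one vector, original row order

print(data)
```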
Any ideas more than welcome! Many thanks.