Here is a solution using the rolling join feature in data.table. I have slightly changed (fixed?) your definition of a and removed the Event column in b1.
require(data.table)
Start.Year <- c(1990, 1992, 1997, 1995)
End.Year <- c(1995, 1993, 2000, 1996)
Country <- c("A", "B", "A", "C")
a <- data.table(Start.Year, End.Year, Country) ## build the data.table directly
## one row per Country-year pair; keep Country as character so it matches a
b1 <- data.table(expand.grid(year = 1990:2000, Country = unique(a$Country), stringsAsFactors = FALSE))
## join by Start.Year, setting matching keys for each dataset
setkey(a, Country, Start.Year)
setkey(b1, Country, year)
# the tricky part
# roll=TRUE matches each year in b1 to the most recent
# event Start.Year at or before it, within the same Country
# (years before a country's first event get NA)
ab <- a[b1, roll=TRUE]
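## fix names by name rather than position (column order differs across data.table versions):
## the join column now holds the year from b1, and End.Year becomes the event marker
setnames(ab, c('Start.Year', 'End.Year'), c('Year', 'Event'))
setcolorder(ab, c('Country', 'Year', 'Event'))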
ab[Year > Event, Event:=NA] ## stop index at end year
ab[!is.na(Event), Event:=1] ## transform year markers to 1
ab[is.na(Event), Event:=0] ## transform missing matches to 0
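ab is the data in the format you want. You can use it just like a data.frame, or convert it back with as.data.frame(ab) if you don't want to keep it in that class. The join should be very fast because both tables are keyed, so data.table can match rows by binary search.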
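
If the rolling join is the unclear part, here is a minimal sketch of roll=TRUE on its own. The data and the names events, years and rolled are made up for illustration and are not part of your problem; the sketch assumes a reasonably recent data.table.

```r
library(data.table)

# Hypothetical toy data: one country with two events
events <- data.table(Country    = "A",
                     Start.Year = c(1990L, 1997L),
                     End.Year   = c(1995L, 2000L))
years  <- data.table(Country = "A", year = 1988:1999)

# Key both tables so the join matches on Country, then rolls on the year
setkey(events, Country, Start.Year)
setkey(years, Country, year)

# Each year picks up the row of the latest Start.Year at or before it;
# 1988 and 1989 come before the first event, so End.Year is NA there
rolled <- events[years, roll = TRUE]
print(rolled)

# Convert back to a plain data.frame if you prefer that class
rolled_df <- as.data.frame(rolled)
```

In the result, the Start.Year column holds the years from the years table, which is why the full solution renames it to Year afterwards.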