Question

I am looking to create a variable (COUNT_WITHIN_2_DAYS) that is the count of observations in a date range. The range would be the stated date +/- 2 days.

 ID#          DATE          COUNT_WITHIN_2_DAYS
  1         1/1/2000                3
  2         1/2/2000                4
  5         1/2/2000                4
  6         1/4/2000                3
  9        1/12/2000                1
 11         3/1/2001                2
 12         3/3/2000                4
 16         3/4/2000                3
 18         3/5/2000                3
 21         3/8/2000                1

I tried an egen with an if command but that did not work

Was it helpful?

Solution

This solution involves looping over all observations and uses the fact that you can compute with dates as you would with integers, after giving the date variable a date format. (These dates are just integers. See e.g. help datetime for details).

Every observation is compared with the one observation that is fixed by the loop. ind == 1 for those observations in which the absolute difference is <= 2. Afterwards, I use summarize to add-up the number of such observations. Finally, the result of the sum is assigned to the fixed observation using the variable counter. The procedure is repeated until each observation has been fixed.

clear all
set more off

*-------------- example data ----------------
input ///
id str15 date 
  1         "1/1/2000"                
  2         "1/2/2000"                
  5         "1/2/2000"                
  6         "1/4/2000"                
  9        "1/12/2000"                
 11         "3/1/2001"                
 12         "3/3/2000"                
 16         "3/4/2000"                
 18         "3/5/2000"                
 21         "3/8/2000"               
 end

gen date2 = date(date, "MDY")
format date2 %td

drop date
rename date2 date

sort date
list

*------------ what you want ------------------

gen counter = .
gen ind = .
forvalues i = 1/`=_N' {
    replace ind = abs(date - date[`i']) <= 2

    summarize ind, meanonly
    replace counter = r(sum) in `i'
}

drop ind
list
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top