How can I create a trailing count for binary data in Stata?
20-12-2019
Question
In Stata, I currently have a data set that looks like:
I am trying to create a "trailing counter" in column B so that it looks like:
Here, the counter starts at 1, and every time a "1" appears in A, B increases by 1.
This seems very simple, but I am not sure exactly how to do it. Here is what I have tried so far.
Assuming the column A is called "A" in Stata, I use:
gen B = A + A[_n - 1]
But this gives me the wrong result. I am not sure how to proceed. Would anyone have any tips?
Solution
Here's one way:
clear all
set more off
*----- example data -----
input ///
var1
0
0
0
0
1
0
0
1
0
0
0
end
list, sep(0)
*----- what you want -----
gen counter = sum(var1) + 1
list, sep(0)
The sum() function will give you a cumulative sum. See help sum(). This is a very basic Stata function. A search sum would have gotten you there quickly.
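As a quick sanity check on the cumulative-sum approach, here is a minimal sketch that rebuilds the same example data and confirms the counter by hand-computed values. The assert lines are an addition for illustration; they stop the do-file with an error if any value differs.

```stata
clear all
input var1
0
0
0
0
1
0
0
1
0
0
0
end

* running total of var1, shifted by 1 so the counter starts at 1
gen counter = sum(var1) + 1

* no 1 has appeared yet in observations 1-4
assert counter == 1 in 1/4
* the first 1 appears in observation 5
assert counter == 2 in 5/7
* the second 1 appears in observation 8
assert counter == 3 in 8/11
```

Note that sum() treats missing values as zero, so a missing value in var1 would leave the counter unchanged rather than turn it missing.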
Your approach fails because, for each observation, you are only adding the "current" value of A to its own previous value. That might sound like a cumulative sum, but think about it and you will see that it isn't: nothing is carried forward from earlier observations.
With your code and my data, the result would be:
     +----------------+
     | var1   counter |
     |----------------|
  1. |    0         . |
  2. |    0         0 |
  3. |    0         0 |
  4. |    0         0 |
  5. |    1         1 |
  6. |    0         1 |
  7. |    0         0 |
  8. |    1         1 |
  9. |    0         1 |
 10. |    0         0 |
 11. |    0         0 |
     +----------------+
The first observation for counter is missing (.). That is because there's no previous value for the first observation of var1, so Stata does something like var1[1] + var1[0] = 0 + . = .
The second observation for counter is var1[2] + var1[1] = 0 + 0 = 0.
The fifth observation for counter is var1[5] + var1[4] = 1 + 0 = 1.
The seventh observation for counter is var1[7] + var1[6] = 0 + 0 = 0. And so on.
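If you prefer to stay close to the lag-based idea, one possible fix is to add the previous value of the counter itself, rather than the previous value of var1. The sketch below is only an illustration: counter2 is a hypothetical variable name, and it relies on replace working through the observations in order, so that counter2[_n-1] has already been updated when observation _n is processed.

```stata
clear all
input var1
0
0
0
0
1
0
0
1
0
0
0
end

* reference result from the cumulative sum
gen counter = sum(var1) + 1

* seed the first observation, then carry the running total forward
gen counter2 = var1 + 1 in 1
replace counter2 = counter2[_n-1] + var1 if _n > 1

* both approaches should agree on every observation
assert counter == counter2
list, sep(0)
```

The sum() version is still the simpler choice. The explicit version mainly shows what was missing from the original attempt: the running total has to feed back into itself.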