Changing values in previous and post records when a numerical condition is met using SAS

StackOverflow https://stackoverflow.com/questions/23352274

  •  11-07-2023

Question

data have;
input patient level timepoint;
datalines;
1   0   1
1   0   2
1   0   3
1   3   4
1   0   5
1   0   6
2   0   1
2   4   2
2   0   3
2   3   4
2   0   5
2   0   6
2   0   7
2   2   8
2   0   9
2   0   10
3   3   1
3   0   2
3   0   3
4   0   1
4   0   2
4   0   3
4   0   4
4   1   5
4   0   6
4   0   7
4   0   8
4   0   9
4   0   10

;;
proc print; run;

/* Condition 1: If a patient has exactly one non-zero level (records sorted by timepoint), set level to 2.5 for the record immediately before it and to 1.5 for the record before that; likewise set level to 2.5 for the record immediately after it and to 1.5 for the record after that. The levels by timepoint should look like: ... 1.5, 2.5, non-zero value, 2.5, 1.5 ... (Note: the "..." records stay at 0.)

Condition 2: If a patient has two or more non-zero levels (records sorted by timepoint), find the FIRST non-zero record and set level to 2.5 for the record immediately before it and to 1.5 for the record before that. Then find the LAST non-zero record and set level to 2.5 for the record immediately after it and to 1.5 for the record after that. Set every zero level between the first and last non-zero records to 2.5. The levels by timepoint should look like: ... 1.5, 2.5, FIRST non-zero value, 2.5, non-zero value, 2.5, LAST non-zero value, 2.5, 1.5 ... */

I've tried data steps using N-1, N-2, N+1, N+2, and arrays with do loops. My first thought was to use multiple arrays so that I could use the index i to reach the previous and next records (i-1/i+1 or i-2/i+2), but I couldn't work out how to code it. All of this has to be done BY patient, so there may be cases where only one record, not two, precedes the first non-zero value. The same can be true of the records after the last non-zero value. I searched through many examples, but none fit my needs. Thanks in advance for any help.

This is how I want the data to look:

data want;
input patient level timepoint;
datalines;
1   0   1
1   1.5 2
1   2.5 3
1   3   4
1   2.5 5
1   1.5 6
2   2.5 1
2   4   2
2   2.5 3
2   3   4
2   2.5 5
2   2.5 6
2   2.5 7
2   2   8
2   2.5 9
2   1.5 10
3   3   1
3   2.5 2
3   1.5 3
4   0   1
4   0   2
4   1.5 3
4   2.5 4
4   1   5
4   2.5 6
4   1.5 7
4   0   8
4   0   9
4   0   10

;;
proc print; run;

Solution

I approached this by first finding the timepoints of the first and last non-zero levels. Then I merged those into the original set, and changed levels based on the rules you mentioned.
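
To make the first step concrete, here is what the per-patient summary should contain for the sample data in the question, worked out by hand from the datalines. The dataset name have2_expected is only an illustrative name and is not used by the code that follows; first and last are the timepoints of the first and last non-zero level.

```sas
/* Hand-derived summary for the sample data: one row per patient.
   Patient 2 has non-zero levels at timepoints 2, 4 and 8, so first=2 and last=8;
   the other patients each have a single non-zero level, so first equals last. */
data have2_expected;
    input patient first last;
datalines;
1 4 4
2 2 8
3 1 1
4 5 5
;
run;
```
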

proc sort data = have;
    by patient timepoint;
run;

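data have2;
    /* first/last hold the timepoints of the first and last non-zero level; 0 means none seen yet */
    retain first 0 last 0;
    set have;
    by patient timepoint;
    if level ne 0 and first = 0 then first = timepoint;
    if level ne 0 then last = timepoint;
    /* At each patient's final record, write one summary row and reset for the next patient */
    if last.patient then do;
        output;
        first = 0;
        last = 0;
    end;
    keep patient first last;
run;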

proc sort data=have2;
    by patient;
run;

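data merged;
    merge have have2;
    by patient;
    /* Only adjust zero levels, and only when the patient has at least one
       non-zero level (first stays 0 otherwise). The differences below assume
       timepoints are consecutive integers within each patient. */
    if level = 0 and first > 0 then do;
        if first-timepoint = 1 then level = 2.5;
        if first-timepoint = 2 then level = 1.5;
        if last-timepoint = -1 then level = 2.5;
        if last-timepoint = -2 then level = 1.5;
        if first < timepoint < last then level = 2.5;
    end;
    drop first last;
run;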
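
The approach above measures distance in timepoint values, so it relies on timepoints being consecutive integers. If they might have gaps, a variation is to count record positions within each patient instead. The following is a minimal sketch of that idea using a double DoW loop, not part of the original answer. It assumes the have dataset from the question; the output name want_dow and the helper variables _n, _i, _first and _last are names introduced here for illustration.

```sas
proc sort data=have;
    by patient timepoint;
run;

data want_dow;
    /* Pass 1: read one patient's records and note the positions of the
       first and last non-zero level. _first and _last are reset to missing
       at the start of each patient because they are not retained. */
    do _n = 1 by 1 until (last.patient);
        set have;
        by patient;
        if level ne 0 then do;
            if missing(_first) then _first = _n;
            _last = _n;
        end;
    end;

    /* Pass 2: re-read the same records (a second SET has its own read
       position) and adjust zero levels by record position. */
    do _i = 1 to _n;
        set have;
        if level = 0 and not missing(_first) then do;
            if _first < _i < _last then level = 2.5;
            else if _i = _first - 1 or _i = _last + 1 then level = 2.5;
            else if _i = _first - 2 or _i = _last + 2 then level = 1.5;
        end;
        output;
    end;

    drop _n _i _first _last;
run;
```

Tracing the sample data by hand, this should give the same levels as the want dataset in the question. Patients with no non-zero level are left unchanged because _first stays missing.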
Licensed under: CC-BY-SA with attribution