Question

I have been using SAS off and on for a year and I'm finally getting into arrays, macros, and all that cool stuff.

What I want to do:

I have a merged dataset with data from students in different grades on a test. I need to create different files for each grade. I don't have a grade variable to easily sort the dataset by and create different files. I do have an index of variables specific to each grade.

Example - What I have:

+-------+--------+--------+--------+--------+--------+
|  ID   | sc_132 | sc_139 | sc_142 | sc_143 | sc_151 |
+-------+--------+--------+--------+--------+--------+
| 16623 | 1      | 1      | 0      | .      | .      |
| 16624 | 1      | 0      | 0      | .      | .      |
| 16626 | 1      | 1      | 1      | .      | .      |
| 17221 | .      | .      | .      | 1      | 0      |
| 17222 | .      | .      | .      | 0      | 1      |
| 17225 | .      | .      | .      | 0      | .      |
+-------+--------+--------+--------+--------+--------+

Example - What I want:

+-------+--------+--------+--------+--------+--------+
|  ID   | sc_132 | sc_139 | sc_142 | sc_143 | sc_151 |
+-------+--------+--------+--------+--------+--------+
| 16623 | 1      | 1      | 0      | .      | .      |
| 16624 | 1      | 0      | 0      | .      | .      |
| 16626 | 1      | 1      | 1      | .      | .      |
+-------+--------+--------+--------+--------+--------+
+-------+--------+--------+--------+--------+--------+
|  ID   | sc_132 | sc_139 | sc_142 | sc_143 | sc_151 |
+-------+--------+--------+--------+--------+--------+
| 17221 | .      | .      | .      | 1      | 0      |
| 17222 | .      | .      | .      | 0      | 1      |
| 17225 | .      | .      | .      | 0      | .      |
+-------+--------+--------+--------+--------+--------+

Where I am:

I have a lot of variables specific to each grade, and some of the variables contain missing data, so to be thorough I should check all of the grade-specific variables and output any observations containing data in any of those fields. I could use a hideously long IF THEN statement...

DATA grade1 grade2 grade3 grade4;
SET gradeall;
    IF sc_132 ^= . OR sc_139 ^= . OR (AND SO ON FOR ABOUT 34 VARIABLES) THEN OUTPUT grade1;
RUN;

But I thought this would be a good time to use an array. I can't find any easy-to-parse documentation about where and when you can use DO loops. Using logic from other programming languages and what I've browsed about DO loops, I've put together the following.

%let gr1_var = sc_132 sc_139 sc_142;
/*-GRADE SPECIFIC ARRAY REPEATED FOR OTHER GRADES -*/

DATA grade1 grade2 grade3 grade4;
SET gradeall;
PUT &gr1_var;
ARRAY grade1 [*] &gr1_var;
IF (
    DO i= 1 TO (DIM(items5_all)-1);
        items5_all(i) ^=. OR ;
    END;
    DO i= DIM(items5_all);
        items5_all(i) ^=.;
    END;
    )
THEN OUTPUT grade1;
    /*-IF THEN STATEMENT THEN REPEATED FOR OTHER GRADES-*/
run;

I was hoping this would give me the equivalent of the long IF THEN statement above without having to type it. But of course it is non-functional.

Can you even use do loops within If statements (I haven't found any examples of this)?
Does anyone have any recommendations for how to accomplish this task?


Solution

I think if you only want to output observations that contain data in at least one of the grade-specific fields, you can just take a sum over the array. SUM ignores missing values and returns missing only when every argument is missing, so an observation with no data in any of a grade's variables is not output to that grade's dataset. No loop is needed. Just like:

 %let gr1_var = sc_132--sc_142; /* "--" takes every variable from sc_132 through sc_142 in dataset order; "-" only fits consecutively numbered names */
 %let gr2_var = sc_143 sc_151;

 DATA grade1 grade2;
 SET gradeall;
 ARRAY grade1 [*] &gr1_var;
 ARRAY grade2 [*] &gr2_var;
 if sum(of grade1(*))^=. then output grade1;
 if sum(of grade2(*))^=. then output grade2;
 run;

By the way, if you wrap this in a macro, there is no need to write out the ARRAY definition and the IF...THEN statement for each grade by hand.
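
A minimal sketch of what such a macro might look like, assuming the macro variables gr1_var, gr2_var, ... have already been defined with %let as above. The macro name split_grades, the parameter n, and the array names g1_vars, g2_vars, ... are made up for illustration:

 %macro split_grades(n=2);
   %local g;
   /* Build the list of output datasets: grade1 grade2 ... grade&n */
   DATA %do g = 1 %to &n; grade&g %end; ;
   SET gradeall;
   %do g = 1 %to &n;
     /* &&gr&g._var resolves to &gr1_var, &gr2_var, ... */
     ARRAY g&g._vars [*] &&gr&g._var;
     /* SUM is missing only when every variable in the array is missing */
     if sum(of g&g._vars[*]) ^= . then output grade&g;
   %end;
   run;
 %mend split_grades;

 %split_grades(n=2)

With n=2 this should generate the same DATA step as the hand-written version above, one ARRAY and one IF per grade.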

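And no, you can't put a DO loop inside the condition of an IF statement like you did here: DO is a statement, not an expression, so it can't be part of a condition. If you do want a loop, run it first to set a flag variable, then test that flag in the IF.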
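
If you would rather keep the loop, here is a rough sketch of the flag approach for one grade, assuming gr1_var is defined as above. The names g1 and has_data are made up for illustration:

 DATA grade1;
 SET gradeall;
 ARRAY g1 [*] &gr1_var;
 has_data = 0;                       /* reset the flag for every observation */
 DO i = 1 TO DIM(g1);
     IF NOT MISSING(g1[i]) THEN DO;
         has_data = 1;               /* found at least one non-missing value */
         LEAVE;                      /* no need to check the rest */
     END;
 END;
 IF has_data THEN OUTPUT grade1;
 DROP i has_data;                    /* keep the helper variables out of the output */
 run;

The loop sits before the IF and only sets the flag; the IF then tests a plain variable. For this task the SUM version is shorter, but the same pattern works when the per-variable check is more involved.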

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow