Question

I want to run some regressions, and I would like to count the number of nonmissing observations for each variable. I don't know yet which variables I will use. I've come up with the following solution, which does not work. Any help?

Basically, I put each of my explanatory variables into its own variable. For example, with var1 var2: w1 = var1, w2 = var2.
Note that I don't know in advance how many variables I have, so I leave room for ten.
Then I store the potential variables using symput.

data _null_;
cntw=countw("&parameters");
i = 1;
array w{10} $15.;
do while(i <= cntw);
w[i]= scan("&parameters",i, ' ');
i = i +1;
end;
/* store a variable globally*/
do j=1 to 10;
call symput("explanVar"||left(put(j,3.)), w(j));
end;
run;

My next step is to run a proc sql using the variables I've stored. It does not work if I have fewer than 10 variables.

proc sql;
select count(&explanVar1), count(&explanVar2), 
 count(&explanVar3), count(&explanVar4), 
  count(&explanVar5), count(&explanVar6),
   count(&explanVar7), count(&explanVar8),
    count(&explanVar9), count(&explanVar10)
from estimation
;quit;

Can this code work with fewer than 10 variables?


Solution

You haven't provided the full context for this project, so it's unclear if this will work for you - but I think this is what I'd do.

First off, you're in SAS, so use SAS for what it does best: counting things. Instead of the PROC SQL and the data step, use PROC MEANS:

proc means data=estimation n;
var &parameters.;
run;

That, without any extra work, gets you the number of nonmissing values for all of your variables in one nice table.
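If the counts are needed as data rather than as printed output, a minimal sketch along these lines may help. It assumes &parameters. holds a space-separated list of numeric variables (PROC MEANS only analyzes numeric variables), and the output dataset name nonmissing_counts is made up for illustration.

```sas
/* Sketch: capture N and NMISS per variable in a dataset.            */
/* Assumes &parameters. is a space-separated list of numeric vars.   */
/* nonmissing_counts is a hypothetical output dataset name.          */
proc means data=estimation n nmiss stackodsoutput;
  var &parameters.;
  ods output summary=nonmissing_counts; /* one row per variable */
run;
```

The STACKODSOUTPUT option asks for one row per analysis variable instead of one wide row, which tends to be easier to filter or merge later.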

Secondly, if there is a reason to do the PROC SQL, it's probably a bit more logical to structure it this way. An iterative %do is only valid inside a macro definition, so the loop is wrapped in a macro here.

%macro count_params;
%local i;
proc sql;
select 
%do i = 1 %to %sysfunc(countw(&parameters.));
  count(%scan(&parameters.,&i.) ) as Parameter_&i.,  /* or could reuse the %scan result to name this better*/
%end; count(1)  as Total_Obs
from estimation;
quit;
%mend count_params;

%count_params

The final Total_Obs column is useful for simplifying the code (dealing with the trailing comma is mildly annoying). You could also put it first and prepend a comma to each of the other columns.
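The leading-comma variant could be sketched roughly as follows. The macro name count_params_leading, its list= and data= parameters, and the _n suffix are all made up for illustration, and the list is assumed to be space-separated.

```sas
/* Sketch: Total_Obs first, then one ", count(...)" per variable.    */
/* count_params_leading, list=, data= and the _n suffix are          */
/* hypothetical names; the list is assumed to be space-separated.    */
%macro count_params_leading(list=, data=estimation);
  %local i v;
  proc sql;
    select count(*) as Total_Obs
    %do i = 1 %to %sysfunc(countw(&list., %str( )));
      %let v = %scan(&list., &i., %str( ));
      , count(&v.) as &v._n
    %end;
    from &data.;
  quit;
%mend count_params_leading;

%count_params_leading(list=&parameters.)
```

Naming each column after its variable makes the result easier to read than Parameter_1, Parameter_2, and so on, though variable names longer than 30 characters would push the suffixed name past the 32-character limit.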

Finally, you could also drive this from a dataset rather than a macro variable. I like that better, in general, as it's easier to deal with in a lot of ways. If your parameter list is in a dataset somewhere (one parameter per row, in the dataset "Parameters", with "var" as the name of the column containing the parameter), you could do

proc sql;
select cats('%countme(var=',var,')') into :countlist separated by ',' 
  from parameters;
quit;

%macro countme(var=);
count(&var.) as &var._count
%mend countme;

proc sql;
select &countlist from estimation;
quit;

I like this one best, as it is the simplest code and is very easy to modify. You could even drive it from the PROC CONTENTS output for estimation (or from dictionary.columns), if it's easy to determine your potential parameters from that.
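A rough sketch of the dictionary.columns route, assuming estimation lives in the WORK library and that every column in it is a candidate (add a WHERE condition on name or type to narrow that down):

```sas
/* Sketch: build the count list from the table's own metadata.       */
/* Assumes the table is WORK.ESTIMATION; libname and memname are     */
/* stored in upper case in dictionary.columns.                       */
proc sql noprint;
  select catx(' ', cats('count(', name, ')'), 'as', cats(name, '_count'))
    into :countlist separated by ', '
    from dictionary.columns
    where libname = 'WORK' and memname = 'ESTIMATION';
quit;

proc sql;
  select &countlist. from work.estimation;
quit;
```

CATX is used for the pieces that need a separating blank, because CATS strips leading and trailing blanks from each argument and would otherwise glue "as" to the column name.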

OTHER TIPS

I'm not sure about your SAS macro, but the SQL query will work with these two notes:

1) If you don't follow your COUNT() functions with an alias, such as COUNT(var1) AS VAR1, your results will not have meaningful field headings. If that's OK with you, then you may not need to worry about it. But if you export the data, it will help to name each column by adding AS MY_NAME.

2) With fewer than 10 variables, the unused macro variables are blank, so the corresponding count() calls have no argument and the query fails with a syntax error rather than returning NULL values. You only get a value back for every column if all 10 macro variables name real columns in the table you're querying.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow