Question

I have a SAS program that dynamically builds a table with a macro like so:

%macro Projection;
%do i=1 %to &number_of_Years;
    %Advance_Years;
     proc sql;
      create table Projection as 
      select *, Year_&previous_year.*(1+return) as Year_&current_year.
      from Projection;
      quit;
%end;
%Mend Projection;

%Projection;

This is a simplified version of my code. The %Advance_Years macro advances the &current_year and &previous_year macro variables by one year. As you can see, the table gains one variable per year. The problem is that the table can reach hundreds of thousands of rows, and I've seen the execution time skyrocket, taking hours to complete.

I've tried option compress=yes, which reduced the execution time, but not by much. I've tried most of the usual SAS tips and tricks for speeding up execution but, again, without much of a difference. I'm running Base SAS 9.2 on a 32-bit machine.

I think my approach to adding variables is wrong. Does overwriting the table on each loop iteration affect execution efficiency? If so, how could I rewrite this to be the MOST efficient code possible? Please bear in mind that I cannot 'transpose' the table and just add more rows. Thanks in advance!


Solution

There's no reason to do this in more than one pass through the data, unless I'm missing something significant here. Rewriting the whole table once per year is certainly the problem, as you surmise.

data projection;
set <whatever came before projection>;
array years year_1-year_&number_of_years.;
year_1=1;
do _t = 2 to dim(years);
 years[_t] = years[_t-1]*(1+return);
end;
run;

The SQL solution wouldn't be terribly different, except that without arrays you would have to build the column list dynamically through the macro language. It would still be a single pass, with one "from projection;".

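%macro add_to_sql(current_year=,previous_year=);
/* Year_1 is read from the table; later years are built in the same SELECT, so they need CALCULATED */
%if &previous_year. = 1 %then %do; Year_1*(1+return) as Year_&current_year. %end;
%else %do; calculated Year_&previous_year.*(1+return) as Year_&current_year. %end;
%mend add_to_sql;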

data calllist;
do current_year = 2 to &number_of_years;
 previous_year=current_year-1;
 output;
end;
run;

proc sql;
select cats('%add_to_sql(current_year=',current_year,',previous_year=',previous_year,')')
into :addlist separated by ',' from calllist;
select year_1,&addlist from projection;
quit;

A few notes: You should try to drive your program execution through data whenever possible. Although this appears slightly less efficient, it's much easier to read and debug than having a macro increment some macro variables. That pattern also breaks the SAS equivalent of encapsulation (the protected/private idea from object-oriented languages): SAS doesn't object to it, but it is really bad style. Call macros with parameters, and provide these from the calling environment; a macro should not alter macro variables in the calling environment. Hence I have set up add_to_sql with parameters, plus a dataset that contains those parameters (it could be imported from elsewhere, or created as it is here). &number_of_years should also be an entry parameter to this module, unless this is the primary program, in which case a global variable is okay.
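
As a rough sketch of the parameter-passing advice above, the one-pass DATA step could be wrapped in a macro that receives everything it needs as arguments. The macro name project_years, its parameters in=, out= and n_years=, and the dataset name base_data are all made up for illustration. It also assumes, like the DATA step earlier, that the columns are named year_1 to year_N and that the input already carries year_1 and return.

%macro project_years(in=, out=, n_years=);
  /* Hypothetical wrapper: reads &in once and writes &out once. */
  data &out.;
    set &in.;
    array years[*] year_1-year_&n_years.;
    /* year_1 is assumed to be present in &in already. */
    do _t = 2 to dim(years);
      years[_t] = years[_t-1] * (1 + return);
    end;
    drop _t;  /* loop counter is not wanted in the output */
  run;
%mend project_years;

%project_years(in=base_data, out=projection, n_years=&number_of_years.);

Nothing inside the macro changes a macro variable that belongs to the caller, so it can be called repeatedly with different inputs.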

In general, in SAS (and in other languages, but especially in SAS), the I/O time will trump any other processing time, in particular given the current economics of computing (CPUs are super-powerful, many orders of magnitude faster than 10 years ago, while unless you are on flash storage, your storage speed is less than an order of magnitude faster than 10 years ago).
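
One way to check whether I/O really dominates in your own run (a suggestion, not something taken from the question) is to turn on the FULLSTIMER system option and compare "real time" with "cpu time" in the log notes for each step. A large gap between the two usually points at time spent waiting on disk rather than computing.

options fullstimer;    /* log real time, CPU time and memory for every step */

%Projection;           /* the macro from the question */

options nofullstimer;  /* back to the shorter timing notes */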

OTHER TIPS

I imagine that writing over the same table is causing a lot of performance problems.

Let me describe the solution that I think you should take. My SAS macro programming skills are a bit rusty, so I'll leave the actual implementation to you.

First, alter the Projection table to have the columns you want to add. Ideally, you would do this in one statement. But you can do something like:

%do i=1 %to &number_of_Years;
    %Advance_Years;
    alter table Projection
        add Year_&current_year. num;
%end;
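
For the "one statement" variant mentioned above, a possible sketch is to let a helper macro generate the whole column list so that ALTER TABLE runs only once. The macro name year_columns, its first_year= and last_year= parameters, and the example years are assumptions for illustration; the helper stands in for the calls to %Advance_Years in this step.

%macro year_columns(first_year=, last_year=);
  /* Hypothetical helper: emits "Year_2014 num , Year_2015 num , ..." */
  %local y;
  %do y = &first_year. %to &last_year.;
    %if &y. > &first_year. %then ,;
    Year_&y. num
  %end;
%mend year_columns;

proc sql;
  alter table Projection
    add %year_columns(first_year=2014, last_year=2023);
quit;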

Then, loop through and do the calculations in place:

/* &current_year and &previous_year must be reset to their starting values before this loop */
%do i=1 %to &number_of_Years;
    %Advance_Years;
    update Projection
        set Year_&current_year. = Year_&previous_year.*(1+return);
%end;
Licensed under: CC-BY-SA with attribution