Question

I've come in late to a project and want to write a macro that normalises some data for export to SQL Server.

There are two control tables...
- Table 1 (customers) has a list of customer unique identifiers
- Table 2 (hierarchy) has a list of table names

There are then n additional tables, one for each record in (hierarchy) (named in the SourceTableName field), each with the form...
- CustomerURN, Value1, Value2

I want to combine all of these tables into a single table (sample_results), with the form of...
- SourceTableName, CustomerURN, Value1, Value2

However, only records whose CustomerURN exists in the (customers) table should be copied.


I could do this in a hard coded format using proc sql, something like...

proc sql;
insert into
  SAMPLE_RESULTS
select
  'TABLE1',
  data.*
from 
  Table1    data
INNER JOIN
  customers
    ON data.CustomerURN = customers.CustomerURN;

<repeat for every table>

But every week new records are added to the hierarchy table.

Is there any way to write a loop that picks up the table name from the hierarchy table, then calls the proc sql to copy the data into sample_results?


Solution

You could concatenate all of the source tables listed in (hierarchy) into one table, then do a single SQL join:

/* Remove any master table left over from a previous run */
proc sql ;
  drop table all_hier_tables ;
quit ;

    %MACRO FLAG_APPEND(DSN) ;
      /* Create new var with tablename */
      data &DSN._b ;
        length SourceTableName $32. ;
        SourceTableName = "&DSN" ;
        set &DSN ;
      run ;

      /* Append to master */
      proc append data=&DSN._b base=all_hier_tables force ; 
      run ;
    %MEND ;

    /* Append all hierarchy tables together */
    data _null_ ;
      set hierarchy ;
      code = cats('%FLAG_APPEND(' , SourceTableName , ');') ;
      call execute(code); /* run the macro */
    run ;

    /* Now merge in... */
    proc sql;
    insert into
      SAMPLE_RESULTS
    select
      data.*
    from
      all_hier_tables data
    INNER JOIN
      customers
        ON data.CustomerURN = customers.CustomerURN ;
    quit;
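
The CALL EXECUTE step above already acts as the loop over (hierarchy). If an explicit loop is preferred, as the question asks, a macro %do loop is another option. The following is only a minimal sketch under some assumptions: the macro name copy_hierarchy_tables and the macro variables tbl1, tbl2, ... are hypothetical names chosen for illustration, sample_results is assumed to exist already with the columns SourceTableName, CustomerURN, Value1, Value2, and the open-ended range :tbl1- assumes SAS 9.3 or later.

```sas
%macro copy_hierarchy_tables;   /* hypothetical macro name */
  %local i n;

  /* Read every table name into macro variables tbl1, tbl2, ... */
  proc sql noprint;
    select SourceTableName into :tbl1- from hierarchy;
  quit;
  %let n = &sqlobs;   /* number of rows read from hierarchy */

  /* One insert per source table, keeping only known customers */
  %do i = 1 %to &n;
    proc sql;
      insert into sample_results
      select "&&tbl&i", t.*
      from &&tbl&i as t
      inner join customers as c
        on t.CustomerURN = c.CustomerURN;
    quit;
  %end;
%mend copy_hierarchy_tables;

%copy_hierarchy_tables
```

This runs one join per source table rather than a single join over the concatenated data, so it avoids the intermediate all_hier_tables table but makes more passes over (customers).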

OTHER TIPS

Another way is to create a view, so that the result always reflects the latest data in the source tables (the data step that builds the view still has to be re-run when new rows are added to hierarchy). The CALL EXECUTE routine is used to generate the query from the table names in the hierarchy dataset. Here is an example that you should be able to modify to suit your data; the last step is the part relevant to you.

data class1 class2 class3;
set sashelp.class;
run;

data hierarchy;
input table_name $;
cards;
class1
class2
class3
;
run;

data ages;
input age;
cards;
11
13
15
;
run;

data _null_;
set hierarchy end=last;
if _n_=1 then call execute('proc sql; create view sample_results_view as ' );
if not last then call execute('select * from '||trim(table_name)||' where age in (select age from ages) union all ');
if last then call execute('select * from '||trim(table_name)||' where age in (select age from ages); quit;');
run;
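
Adapted to the tables described in the question, the same technique might look like the sketch below. It assumes that every table named in hierarchy.SourceTableName has the columns CustomerURN, Value1 and Value2, and that the names are valid SAS dataset names. The view name sample_results_view is carried over from the example above, and the variable stmt and the aliases t and c are hypothetical names used only for illustration.

```sas
data _null_;
  length stmt $400;
  set hierarchy end=last;

  /* Open the view definition once, before the first SELECT */
  if _n_ = 1 then call execute('proc sql; create view sample_results_view as ');

  /* One SELECT per source table, tagged with the table's name */
  /* and restricted to customers present in the customers table */
  stmt = catx(' ',
    'select', quote(strip(SourceTableName)), 'as SourceTableName length=32,',
    't.CustomerURN, t.Value1, t.Value2',
    'from', strip(SourceTableName), 'as t',
    'inner join customers as c on t.CustomerURN = c.CustomerURN');

  /* Chain the SELECTs with UNION ALL and close PROC SQL after the last one */
  if last then stmt = catx(' ', stmt, '; quit;');
  else stmt = catx(' ', stmt, 'union all');

  call execute(stmt);
run;
```

Listing the columns explicitly, rather than using select *, keeps the UNION ALL branches aligned by position even if a source table carries extra columns.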
Licensed under: CC-BY-SA with attribution