Question

Suppose I have a directed graph represented in a dataset named links, which has two variables: from_id and to_id. I want to use SAS Data Step to do two things: (1) count the number of nodes, and (2) count the number of edges.

Suppose the links dataset is as shown below.

from_id    to_id
----------------
   1         2
   2         3
   3         1
   3         2

In this example, there are 3 nodes and 4 edges. (We can assume there are no duplicate edges in links). The nodes are 1, 2, and 3. The edges are 1->2, 2->3, 3->1, and 3->2.

Below is a SAS Macro that uses SAS Data Step in conjunction with proc sql in order to count the nodes and edges. It works perfectly, but I wish to use SAS Data Step so that counting the nodes and edges may (potentially) be done faster.

/* display number of nodes and edges for graph */
%macro graph_info(links);
data nodes;
    set &links;
    node_id = from_id;
    output;
    node_id = to_id;
    output;
    keep node_id;
run;

proc sql noprint;
    select count(distinct node_id) into :numNodes from nodes;
quit;
proc datasets lib=work nolist;
    delete nodes;
quit;

proc sql noprint;
    select count(*) into :numEdges from &links;
quit;

%put Nodes: &numNodes;
%put Edges: &numEdges;
%mend;
Was it helpful?

Solution

If you have enough memory, you may be able to do this with a hash object.

Be warned: this code is untested, as I don't have a SAS installation to hand. However the basic idea should work. You iterate through the data step, adding each node to the hash object, and on the last object you set macro variables to the size of the hash object.

data _null_;
  set links end=lastrec;
  format node_id 8.;
  if _N_ eq 1 then do;
    declare hash h();
    h.defineKey("node_id");
    h.defineDone();
  end;
  node_id = from_id;
  rc=h.find();
  if rc ne 0 then h.add();
  node_id = to_id;
  rc=h.find();
  if rc ne 0 then h.add();
  if lastrec then do;
    call symput('numLinks', put(h.num_items, 8. -L));
    call symput('numEdges', put(_N_, 8. -L));
  end;
run;
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top