UPDATED
Here is a macro-based solution, with new step-by-step comments added. It queries the SAS metadata table dictionary.columns to discover all numeric variables in a dataset. Basically, I take the MIN, MEDIAN, and MAX of all the numeric variables, writing the results to three separate datasets. I then concatenate the datasets, using the IN= dataset option to flag which dataset each row comes from so that it can be labeled with the appropriate statistic name. The output has three rows (one per statistic) and n columns, one per numeric variable, plus a column holding the statistic label.
As the OP demonstrated in his answer, the whole macro / meta-data step for finding the numeric variables can be replaced by simply using the special _NUMERIC_ variable list. I will leave my current approach in place in case someone is interested in using it for other things.
Furthermore, the OP's answer is a macro-free solution that uses PROC TRANSPOSE to get to the same place as this one, without any concatenation of separate result sets. I urge all readers to review it, as it is more "SAS-like".
%GLOBAL
var_names
dsn_temp_min
dsn_temp_median
dsn_temp_max
;
%LET dsn_temp_min = min_summary ;
%LET dsn_temp_median= med_summary;
%LET dsn_temp_max= max_summary;
/* Identify dataset */
%LET lib_name = WORK ; /* change to your library */
%LET dsn = my_data ;
/* Retrieve numeric variable names from SAS metadata and store in `var_names` */
/* macro variable. Library and dataset name must be upper-case since that is */
/* how they are stored in `dictionary.columns`. */
/* UPDATE: this can all be avoided by using the _NUMERIC_ special variable list */
/* but I am leaving this in here in case anyone is interested in querying */
/* meta-data for other purposes. */
%LET lib_name = %UPCASE (&lib_name);
%LET dsn = %UPCASE (&dsn);
PROC SQL NOPRINT;
SELECT name
INTO :var_names SEPARATED BY ' '
FROM dictionary.columns
WHERE libname = "&lib_name"
AND memname = "&dsn"
AND type ^= "char"
;
QUIT;
/* Take the MIN of all numeric variables and store in a separate dataset */
PROC MEANS DATA = &lib_name..&dsn NOPRINT ;
OUTPUT OUT=&dsn_temp_min (DROP = _TYPE_ _FREQ_)
MIN (&var_names) =
;
RUN;
/* Take the MEDIAN of all numeric variables and store in a separate dataset */
PROC MEANS DATA = &lib_name..&dsn NOPRINT ;
OUTPUT OUT=&dsn_temp_median (DROP = _TYPE_ _FREQ_)
MEDIAN (&var_names) =
;
RUN;
/* Take the MAX of all numeric variables and store in a separate dataset */
PROC MEANS DATA = &lib_name..&dsn NOPRINT ;
OUTPUT OUT=&dsn_temp_max (DROP = _TYPE_ _FREQ_)
MAX (&var_names) =
;
RUN;
/* Concatenate the three separate datasets into one. Use IN to figure out */
/* where each row is coming from, and label appropriately */
DATA summary_data;
LENGTH stat $6 ;
RETAIN
stat &var_names
;
SET
&dsn_temp_min (IN=s1)
&dsn_temp_median (IN=s2)
&dsn_temp_max (IN=s3)
;
IF (s1) THEN DO;
stat = "MIN" ;
END;
ELSE IF (s2) THEN DO;
stat = "MEDIAN" ;
END;
ELSE IF (s3) THEN DO;
stat = "MAX" ;
END;
LABEL stat = "Statistic";
RUN;
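For anyone who wants the _NUMERIC_ shortcut mentioned above without the metadata query, here is a minimal sketch. It is not the OP's PROC TRANSPOSE code, just the same concatenation idea with the macro variables removed. It assumes the input is WORK.MY_DATA as in the code above; the dataset names min_n, med_n, max_n, and summary_numeric are hypothetical names picked for illustration.

```sas
/* Sketch only: assumes WORK.MY_DATA exists and has at least one numeric variable. */
/* min_n, med_n, max_n and summary_numeric are hypothetical dataset names.         */
PROC MEANS DATA = work.my_data NOPRINT ;
    VAR _NUMERIC_ ;                                       /* all numeric variables */
    /* One OUTPUT statement per statistic; an empty name list after the "="        */
    /* keeps the original variable names in each output dataset.                   */
    OUTPUT OUT = min_n (DROP = _TYPE_ _FREQ_) MIN    = ;
    OUTPUT OUT = med_n (DROP = _TYPE_ _FREQ_) MEDIAN = ;
    OUTPUT OUT = max_n (DROP = _TYPE_ _FREQ_) MAX    = ;
RUN;

/* Stack the three one-row datasets and label each row by its source. */
DATA summary_numeric ;
    LENGTH stat $6 ;                 /* declared before SET so it is the first column */
    SET
        min_n (IN = s1)
        med_n (IN = s2)
        max_n (IN = s3)
    ;
    IF s1 THEN stat = "MIN" ;
    ELSE IF s2 THEN stat = "MEDIAN" ;
    ELSE stat = "MAX" ;
    LABEL stat = "Statistic" ;
RUN;
```

Running PROC PRINT on summary_numeric should show the same three-row layout as summary_data, with a single pass over the data instead of three. Like the original DATA step, this would misbehave if the input already contained a variable named stat, s1, s2, or s3.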