Question

I'm trying to transpose a data using values as variable names and summarize numeric data by group, I tried with proc transpose and with proc report (across) but I can't do this, the unique way that I know to do this is with data set (if else and sum but the changes aren't dynamically)

For example I have this data set:

school name   subject      picked  saving   expenses
raget  John   math         10      10500    3500
raget  John   spanish      5       1200     2000
raget  Ruby   nosubject    10      5000     1000
raget  Ruby   nosubject    2       3000      0
raget  Ruby   math         3       2000     500
raget  peter  geography    2       1000      0
raget  noname nosubject    0        0       1200

and I need this in 1 line, sum of 'picked' by the names of students, and later sum of picked by subject, the last 3 columns is the sum total for picked, saving and expense:

school john ruby peter noname math spanish geography nosubject  picked saving expenses 
raget   15   15   2      0     13   5        2         12        32     22700  8200

If it's possible to be dynamically changed if I have a new student in the school or subject?

Was it helpful?

Solution

It's a little difficult because you're summarising at more than one level, so I've used PROC SUMMARY and chosen different _TYPE_ values. See below:

data have;
  infile datalines;  
  input school $ name $ subject : $10. picked saving expenses;
  datalines;
raget  John   math         10      10500    3500
raget  John   spanish      5       1200     2000
raget  Ruby   nosubject    10      5000     1000
raget  Ruby   nosubject    2       3000      0
raget  Ruby   math         3       2000     500
raget  peter  geography    2       1000      0
raget  noname nosubject    0        0       1200
;
run;

proc summary data=have;
  class school name subject;
  var picked saving expenses;
  output out=want1 sum(picked)=picked sum(saving)=saving sum(expenses)=expenses;
run;

proc transpose data=want1 (where=(_type_=5)) out=subs (where=(_NAME_='picked'));
  by school;
  id subject;
run;

proc transpose data=want1 (where=(_type_=6)) out=names (where=(_NAME_='picked'));
  by school;
  id name;
run;

proc sql;
  create table want (drop=_TYPE_ _FREQ_ name subject) as
  select
  n.*,
  s.*,
  w.*
  from want1 (where=(_TYPE_=4)) w,
  names (drop=_NAME_) n,
  subs (drop=_NAME_) s
  where w.school = n.school
  and w.school = s.school;
quit;

I've also tested this code by adding new schools, names and subjects and they do appear in the final table. You'll note that I haven't hardcoded anything (e.g. no reference to math or John), so the code is dynamic enough.

OTHER TIPS

PROC REPORT is an interesting alternative, particularly if you want the printed output rather than as a dataset. You can use ODS OUTPUT to get the output dataset, but it's messy as the variable names aren't defined for some reason (they're "C2" etc.). The printed output of this one is a little messy also as the header rows don't line up, but that can be fixed with some finagling if that's desired.

data have;
input school $ name   $ subject      $ picked  saving   expenses;
datalines;
raget  John   math         10      10500    3500
raget  John   spanish      5       1200     2000
raget  Ruby   nosubject    10      5000     1000
raget  Ruby   nosubject    2       3000      0
raget  Ruby   math         3       2000     500
raget  peter  geography    2       1000      0
raget  noname nosubject    0        0       1200
;;;;
run;

ods output report=want;
proc report nowd data=have;
columns school (name subject),(picked) picked=picked2 saving expenses;
define picked/analysis sum ' ';
define picked2/analysis sum;
define saving/analysis sum ;
define expenses/analysis sum;
define name/across;
define subject/across;
define school/group;
run;
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top