I have an imported excel file, DATASET looks like:

    Family       Weight
    1             150
    1             210   
    1             99
    2             230
    2             100
    2             172

I need to find the sum of ranks for each family.
I know that I can do this easily using PROC RANK but this is a HW problem and the only PROC statement I can use is PROC Means. I cannot even use Proc Sort.

The ranking would be as follows (lowest weight receives rank = 1, etc)

99 - Rank = 1
100 - Rank = 2
150 - Rank = 3
172 - Rank = 4
210 - Rank = 5
230 - Rank = 6

Resulting Dataset:

    Family       Sum_Ranking
    1             9
    2             12

Family 1 Sum_Ranking was calculated by (3+5+1)
Family 2 Sum_Ranking was calculated by (6+2+4)

Thank you for assistance.

有帮助吗?

解决方案

I'm not going to give you code, but some tips.

Specifically, the most interesting part about the instructions is the explicit "not even PROC SORT".

PROC MEANS has a useful side effect, in that it sorts data by the class variables (in the class variable order). So,

PROC SORT data=blah out=blah_w;
by x y;
run;

and

PROC MEANS data=blah;
class x y;
var y;
output out=blah_w n=;
run;

Have almost the identical results. Both produce a dataset sorted by x y, even though PROC MEANS didn't require a sort.

So in this case, you can use PROC MEANS' class statement to produce a dataset that is sorted by weight and family (you must carry over family here even though you don't need it). Then you must use a data step to produce a RANK variable, which is the rank of the current line (use the _FREQ_ column to figure that out in case there are more than one with the same rank in the same family, and think about what to do in case of ties), then another PROC MEANS to summarize by family this time.

许可以下: CC-BY-SA归因
不隶属于 StackOverflow
scroll top