Question

I have a dataset with from and to dates of registration for a group of users. I would like to programmatically find which months lie between those dates for each user, without hard-coding any months. I only need a summary of the number of users registered in each month, so if that allows a quicker approach, so much the better.

E.g. I have something like

User-+-From-------+-To-----------------
A    + 11JAN2011  + 15MAR2011
A    + 16JUN2011  + 17AUG2011
B    + 10FEB2011  + 12FEB2011
C    + 01AUG2011  + 05AUG2011

And I want something like

Month---+-Registrations
JAN2011 + 1 (A)
FEB2011 + 2 (AB)
MAR2011 + 1 (A)
APR2011 + 0
MAY2011 + 0
JUN2011 + 1 (A)
JUL2011 + 1 (A)
AUG2011 + 2 (AC)

Note I don't need the bit in brackets; that was just to try and clarify my point.

Thanks for any help.


Solution

One easy way is to construct an intermediate dataset with one row per user per registered month, and then run PROC FREQ on it.

data have;
informat from to DATE9.;
format from to DATE9.;
input user $ from to;
datalines;
A     11JAN2011   15MAR2011
A     16JUN2011   17AUG2011
B     10FEB2011   12FEB2011
C     01AUG2011   05AUG2011
;;;;
run;

data int;
  set have;
  /* Months after the starting month (0 = same month). 'd' = discrete (the default): */
  /* each first-of-month boundary crossed counts as a new month. */
  _mths = intck('month', from, to, 'd');
  do _i = 0 to _mths;  /* start with the from month, then step through each later month */
    month = intnx('month', from, _i, 'b');  /* 'b' = first day of that month */
    output;
  end;
  format month MONYY7.;
run;

proc freq data=int;
  tables month / out=want(keep=month count rename=(count=registrations));
run;

You can eliminate the _mths variable by calling INTCK directly as the upper bound of the DO loop.
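As a minimal sketch of that variant, assuming the same have dataset and variable names as above: the stop value of an iterative DO loop is evaluated once, before the first iteration, so INTCK can sit directly in the loop header. The 'd' argument is left out here because discrete is INTCK's default method.

```sas
data int;
  set have;
  /* INTCK is evaluated once, when the loop starts, so no _mths variable is needed */
  do _i = 0 to intck('month', from, to);
    month = intnx('month', from, _i, 'b');  /* first day of each month in the range */
    output;
  end;
  format month MONYY7.;
run;
```

The PROC FREQ step can then be run on int unchanged.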

License: CC-BY-SA with attribution
Not affiliated with StackOverflow