Question

Background: When we test the significance of a categorical variable that has been coded as dummy variables, we need to test simultaneously that the coefficients of all the dummy variables are 0. For example, if X takes on values of 0, 1, 2, 3 and 4, I would fit dummy variables for levels 1-4 (assuming I want 0 to be the baseline), then test B1=B2=B3=B4=0 jointly.

If this is the only variable in my model, I can use the overall F-statistic to achieve this. However, if I have other covariates, the overall F-test doesn't work, because it tests the coefficients of all predictors at once rather than only those of X.
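For reference, the joint hypothesis described above is usually tested with a partial F-test that compares the model with the dummy variables to the model without them. The symbols SSE_R, SSE_F, n and p are notation introduced here for illustration, not taken from the original post:

```latex
H_0:\ \beta_1 = \beta_2 = \beta_3 = \beta_4 = 0
\qquad
F = \frac{(\mathrm{SSE}_R - \mathrm{SSE}_F)/4}{\mathrm{SSE}_F/(n - p)}
\;\sim\; F_{4,\,n-p} \ \text{under } H_0
```

Here SSE_R is the residual sum of squares of the reduced model (other covariates only), SSE_F is that of the full model (other covariates plus the four dummies), n is the number of observations, and p is the number of estimated parameters in the full model, including the intercept.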

In Stata, for example, this is (very, very) simply carried out with the testparm command: testparm i.x (after fitting the desired regression model), where the i. prefix tells Stata that X is a categorical variable to be treated as dummy variables.

Question/issue: I'm wondering how I can do this in SAS with a CONTRAST (or ESTIMATE?) statement while fitting a regression model with PROC GLM. Since I have scoured the internet and haven't found what I'm looking for, I'm guessing I'm missing something very obvious. However, all of the examples I've seen are NOT for categorical (class) variables, but rather two separate (say continuous) variables. The contrast statement in that case would simply be something like

CONTRAST 'Contrast1' y 1 z 1;

Otherwise, they're for calculating hypotheses like H_0: B1-B2=0.

I feel like I need to break down the hypothesis into smaller pieces and determine the set that defines the whole relationship, but I'm not doing it correctly. For example, for B1=B2=B3=B4=0, I thought I might say B1=B2=B3=-B4, then define (1) B1=-B4, (2) B2=-B4 and (3) B2=B3. I was trying to code this as a CONTRAST statement as follows (say X is in descending order in the data set: 4-0):

CONTRAST 'Contrast' x -1 0 0 1 0
                    x -1 0 1 0 0
                    x 0 1 1 0 0;

I know this is not correct, and I tried many, many variations and whatever random logic I could come up with. My problem is I have relatively novice-level knowledge of CONTRAST (and unfortunately have not found great documentation to help with this) and also of how this hypothesis test should really be formulated for the sake of estimation (do I try to split it up into pieces as I did above, or...?).


Solution

From my note above, you actually can get SAS to do this for you with PROC GENMOD and the CLASS statement and a TYPE3 specification.

proc genmod data=input;
  class classvar;
  model slope = classvar othervar / type3;
run;
quit;

In the example above, my class levels are in the classvar variable. The othervar is my other covariate.

At the end of the output, you see a table labeled LR Statistics For Type 3 Analysis. The row for classvar is the LR test of all the class effects=0.
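For the PROC GLM route the question asks about, here is a minimal sketch. It assumes the same hypothetical names as the example above (data set input, response slope, class variable classvar, covariate othervar) and that classvar has exactly the five levels 0-4, which SAS sorts in ascending order by default. With a CLASS statement, PROC GLM should already print a Type III F-test for classvar. The multi-row CONTRAST, with rows separated by commas, is intended to reproduce that same 4-df joint test.

```sas
proc glm data=input;
  class classvar;                      /* assumed levels 0-4, sorted ascending */
  model slope = classvar othervar / solution ss3;
  /* Each row compares level 0 with one other level; the four rows  */
  /* together test that all five level effects are equal (4 df).    */
  contrast 'classvar overall'
           classvar 1 -1  0  0  0,
           classvar 1  0 -1  0  0,
           classvar 1  0  0 -1  0,
           classvar 1  0  0  0 -1;
run;
quit;
```

The coefficients within each row follow the sorted order of the class levels, so the rows would need to be adapted if the variable has a different number of levels or a different ordering.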

Other tips

Another option is PROC REG with a TEST statement (e.g., TEST x1=0, x2=0, x3=0, x4=0;). This doesn't answer my initial question about PROC GLM, but it works if PROC REG gets the job done for your type of model.
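A minimal sketch of that approach follows. PROC REG has no CLASS statement, so the dummy variables are built in a DATA step first. The names input, slope, classvar and othervar are carried over from the example in the solution, while input_dummies, x1-x4 and the label ClassTest are hypothetical names introduced here. The sketch also assumes classvar has no missing values, since a missing value would be coded as 0 in every dummy.

```sas
/* Build dummy variables for levels 1-4; level 0 is the baseline. */
data input_dummies;
  set input;
  x1 = (classvar = 1);
  x2 = (classvar = 2);
  x3 = (classvar = 3);
  x4 = (classvar = 4);
run;

proc reg data=input_dummies;
  model slope = x1 x2 x3 x4 othervar;
  /* Joint 4-df F-test that all four dummy coefficients are 0. */
  ClassTest: test x1 = 0, x2 = 0, x3 = 0, x4 = 0;
run;
quit;
```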

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow