Adjusted predictions running ANOVA in Stata

https://stackoverflow.com/questions/14354143

stata
anova

16-01-2022

Question

I have an outcome variable x and three categorical explanatory variables a, b and c. In my example a has 8 levels, b has 4 levels and c has 35 levels, but not all combinations of the three variables have observations (this is probably unimportant).

If I run the following additive ANOVA model in Stata

anova x a b c
adjust, by(a b) gen(y)

then I obtain predictions of the variable x adjusted by the variables a and b. The adjust command prints the following table in the Results window and also generates a variable y containing adjusted predictions.

          |                 b
        a |       2        4        8       16
----------+-----------------------------------
       50 | .016655  .018487
       75 | .008286  .011237
      100 | .005937  .006677  .012467
      150 | .001905  .004038  .009454
      200 | .001774  .003107  .007592  .010081
      400 |          .004982  .006853  .009342
      800 |                   .002126   .00521
     1000 |                   .002732  .005221
----------------------------------------------
     Key:  Linear Prediction

My problem is that the variable y has a value for each combination of a, b and c, while the table above only has a value for each combination of a and b. How can I save the results from the table so that I can work with them? And what is the connection between the values in the table and the values in y?

Thanks in advance.

Update: I found this in help adjust:

Variables used in the estimation command but not included in either the by() variable list or the adjust variable list are left at their current values, observation by observation. Here adjust displays the average estimated prediction (or the corresponding probability or exponentiated prediction), substituting the mean of these unspecified variables within each group defined by the variables in the by() option.

This is also true for my data. For example, if a=75 and b=2, then c takes on the values 12, 13, 14, 15 and 16. The value of y corresponding to c=14 (which is the average of those values) is exactly what is displayed in the table. But what happens if the average is not one of the values that c takes on?
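One way to probe the connection between the table and y is to average the fitted values within each cell and compare the result with what adjust prints. The following is only a sketch under stated assumptions: it uses the auto dataset shipped with Stata rather than the data in the question, wgrp is a hypothetical third factor created purely for illustration, and it assumes that, for a linear prediction, the table entries behave like within-cell means of the fitted values, as the help text quoted above suggests. That assumption should be checked against the adjust output rather than taken for granted.

```stata
* Sketch only: wgrp is a hypothetical third factor built for illustration
sysuse auto, clear
egen wgrp = cut(weight), group(3)
anova mpg foreign rep78 wgrp

* Fitted values, one per observation (analogous to y in the question)
predict double yhat if e(sample), xb

* Average the fitted values within each foreign-by-rep78 cell
preserve
keep if e(sample)
collapse (mean) yhat (count) n = yhat, by(foreign rep78)
list, sepby(foreign) noobs
* save cellmeans, replace   // uncomment to keep the cell means as a dataset
restore
```

Because collapse replaces the data in memory with one row per cell, saving that dataset (or merging it back onto the original data) is one way to keep table-like results for further work.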

Solution

This is a reply to Stefan Hansen's comment on adjust and margins.

In general, no: adjust and margins need not give the same results. Everything depends on the model and on whether there are covariates other than those named. But consider the results of

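* Load the example dataset shipped with Stata
sysuse auto, clear
* Additive two-factor ANOVA; both factors are named in by() below
anova mpg foreign rep78
adjust, by(foreign rep78)
* Request the same cells through margins
margins foreign#rep78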

Here the results do coincide.
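For working with the numbers afterwards, margins leaves its estimates in r(), from where they can be copied into a matrix. A minimal sketch, assuming a Stata version that has margins (Stata 11 or later); the matrix name M is arbitrary:

```stata
sysuse auto, clear
anova mpg foreign rep78
margins foreign#rep78

* Copy the predictive margins (a row vector, one column per cell)
matrix M = r(b)
matrix list M
```

This covers the simple case where every factor in the model is named in the margins call. With further factors in the model, what margins averages over should be checked before treating M as a copy of the adjust table.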

I am not fluent in margins beyond elementary uses, so any more complicated questions will need to be handled by someone else.

Licensed under: CC-BY-SA with attribution