Question

Here is a quotation from a Stata online tutorial: "If you want to interact a continuous variable with a factor variable, just prefix the continuous variable with c." (http://www.stata.com/capabilities/overview/factor-variables/)

They give the following example: smoker#c.bmi.

smoker is a categorical variable, coded as 1 = non-smoker, 2 = smoker, 3 = heavy smoker.

bmi is a continuous variable, body mass index.

When they create the interaction term smoker#c.bmi, what does it represent, and how should it be interpreted?

Solution

It seems to me that smoker is a dummy variable (1/0); see the note below. Please double-check this against the following sentence:

"We run a linear regression of cholesterol level on a full factorial of age group and whether the person smokes along with a continuous body mass index (bmi) and its interaction with whether the person smokes." [emphasis added]

cholesterol = -0.517 smoker + 0.03455 bmi + 0.0245 bmi*smoker + other parts
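
To make the bmi*smoker term in the equation above concrete, here is a minimal sketch in Python (not Stata), assuming smoker is a 0/1 indicator as suggested. The rows and the helper name `interaction_column` are hypothetical and exist only for illustration. The interaction column is simply the product of the indicator and bmi, so it is 0 for non-smokers and equal to bmi for smokers.

```python
# Hypothetical rows of (smoker indicator, bmi); the values are made up.
rows = [(0, 22.0), (1, 22.0), (0, 30.0), (1, 30.0)]


def interaction_column(rows):
    """Return bmi * smoker for each row: 0 for non-smokers, bmi for smokers."""
    return [smoker * bmi for smoker, bmi in rows]


products = interaction_column(rows)
for (smoker, bmi), product in zip(rows, products):
    print(f"smoker={smoker} bmi={bmi:.1f} bmi*smoker={product:.1f}")
```

Because the product is zero whenever smoker is 0, the interaction coefficient only contributes to the prediction for smokers.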

The coefficient on bmi gives the impact of bmi for non-smokers, whereas the coefficient on bmi*smoker gives the incremental impact of bmi for smokers: the bmi slope is 0.03455 + 0.0245 = 0.05905 for smokers versus 0.03455 for non-smokers. The interaction coefficient is positive, so its significance indicates that the impact of bmi on cholesterol is higher for smokers than for non-smokers.
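
The slope arithmetic can be sketched as follows, again in Python for illustration and again assuming smoker is coded 0/1. The coefficient values are the ones quoted in the equation above; the constant and function names are hypothetical. The remaining terms of the model ("other parts") are ignored, so this only describes how predicted cholesterol changes with bmi within each group.

```python
# Coefficients as quoted in the answer's equation.
B_BMI = 0.03455           # bmi slope for non-smokers (smoker = 0)
B_BMI_X_SMOKER = 0.0245   # extra bmi slope for smokers (smoker = 1)


def bmi_slope(smoker):
    """Change in predicted cholesterol per one-unit increase in bmi."""
    return B_BMI + B_BMI_X_SMOKER * smoker


def change_for_bmi_increase(smoker, delta_bmi):
    """Change in predicted cholesterol when bmi rises by delta_bmi."""
    return bmi_slope(smoker) * delta_bmi


for smoker in (0, 1):
    print(f"smoker={smoker}: slope per bmi unit = {bmi_slope(smoker):.5f}")
    print(f"smoker={smoker}: change for +5 bmi = {change_for_bmi_increase(smoker, 5):.5f}")
```

The difference between the two slopes is exactly the interaction coefficient, which is why that coefficient is read as the incremental effect of bmi for smokers.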

Note: There are three categories of age group, not three categories of smokers.

Licensed under: CC-BY-SA with attribution