Question

I've got two factor variables: one is coded as numeric and the other as a character string. Call them C and N. I want to include their interaction in a regression (which would expand them into dummies). In R I would code

lm(y~as.factor(C)*as.factor(N)) or

library(plm)
C = as.factor(C)
N = as.factor(N)
plm(y~C:N, index=c('C','N'), effect="twoways")

In Stata, I want to do something like

xtset C N
xtreg y C*N, fe

What is the syntax for doing this?

Solution

You must convert the string variable to numeric; encode is one option. Then use Stata's factor-variable notation (the # operator). A nonsensical example:

clear all
set more off

sysuse auto
describe
keep price mpg make

encode make, gen(make2)
regress price mpg c.mpg#i.make2

Factor-variable notation was introduced in Stata 11.

Type help factor variables and help encode for the details.

Note: I have not tried to translate your R code to Stata.
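
For the case in the question (two categorical variables, one stored as a string), a minimal sketch on the auto data might look like the following. The variable names origin_str and origin are made up for illustration, and the string copy is created only to mimic a factor stored as text; this is not a translation of the plm call.

```stata
sysuse auto, clear

* foreign is numeric with value labels; make a string copy to mimic
* a categorical variable stored as text (origin_str is a hypothetical name)
decode foreign, generate(origin_str)

* encode maps each distinct string to an integer code with value labels
encode origin_str, generate(origin)

* ## expands to main effects plus the interaction, like * in an R formula
regress price i.rep78##i.origin

* # alone gives only the interaction terms, like : in an R formula
regress price i.rep78#i.origin
```

The same i.var1##i.var2 notation can be used with xtreg, fe after xtset. Stata will note and omit any empty cells of the interaction.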

OTHER TIPS

Factor-variable notation (#) doesn't work in xtabond. See the similar question here on Statalist. Here is a quick and dirty way to solve that in Stata for your real problem:

webuse abdata
tabulate ind, gen(ind) // industry dummies
tabulate year, gen(yr) // not needed: year dummies are already in the dataset
egen ind_year = group(ind year) // one id per ind-year cell; gen ind_year = ind*year also works here
tabulate ind_year, gen(ind_year) // interaction dummies
xtabond n l(0/1).w ind2-ind9 yr1977-yr1984 ind_year2-ind_year80
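
The upper bound of the ind_year range depends on how many ind-year combinations actually occur in the data. A small sketch that reads the count from the data rather than hard-coding it (the local macro k is a name introduced here for illustration):

```stata
webuse abdata, clear

* one integer id per observed (ind, year) combination
egen ind_year = group(ind year)

* the largest id equals the number of distinct combinations
summarize ind_year, meanonly
local k = r(max)
display "number of ind-year cells: `k'"

* dummies ind_year1 ... ind_year`k'; skip the first as the base category
tabulate ind_year, generate(ind_year)
describe ind_year2-ind_year`k'
```

The varlist ind_year2-ind_year`k' can then be passed to the estimation command in place of a fixed range. Because it uses a local macro, this should be run as a do-file or in one go.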

Note: in R, interaction() plays the role of egen's group() in Stata.

Licensed under: CC-BY-SA with attribution