Question

I am running a fully interacted linear regression (that is, I include all two-way interactions of all variables), and I now want to output the variable names together with the coefficient estimates. A minimal example looks like this:

y = randn(100,1); %Dependent variable
X = randn(100,3); %Independent variables
names = {'var1', 'var2', 'var3'};

Xint = x2fx(X, 'interaction'); %Construct interaction terms

res = regstats(y, Xint(:,2:end)); %regstats adds its own constant, so drop the constant column of Xint

Now, the documentation of x2fx() states that it constructs its output in the following order:

The constant term

The linear terms (the columns of X, in order 1, 2, ..., n)

The interaction terms (pairwise products of the columns of X, in order (1, 2), (1, 3), ..., (1, n), (2, 3), ..., (n–1, n))

I now want to construct a cell with the variable names in the same order, that is, (for the minimal example) I want

allVars = {'cons', 'var1', 'var2', 'var3','var1xvar2','var1xvar3','var2xvar3'};

I tried to do this using ndgrid as follows:

temp1 = 1:3; % Vector of variable indices
[idx1, idx2] = ndgrid(temp1, temp1); % idx1/idx2 avoid overwriting the dependent variable y
allvarnames = [idx1(:) idx2(:)];

but this lists every interaction twice (e.g. once as 2-1 and once as 1-2), includes each variable paired with itself, and the order is still wrong. I could proceed along these lines, although it might get somewhat messy, so I was wondering whether someone knows a simpler solution.

Thanks in advance, Tom


Solution

Maybe use triu (upper triangular matrix)?

T = triu(ones(length(names))); %a mask for cross terms to keep
for i = 1:length(names)
    for j = 1:length(names)        
        if T(i,j) && (i ~= j)
            fprintf('%sx%s\n',names{i},names{j});
        end
    end
end

This only prints the cross terms; you would still need to store them in your allvarnames array and prepend the names of the constant and linear terms.

var1xvar2
var1xvar3
var2xvar3

Note that this works for any length of names.
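
To go from printing the cross terms to the full name list the question asks for, here is a minimal sketch. It swaps the triu loop for nchoosek, which returns index pairs in the same (1,2), (1,3), ..., (n-1,n) order that x2fx documents. The variable names pairs and crossNames are illustrative, and the sketch assumes names has at least two entries and that the Statistics Toolbox (x2fx, regstats) is available, as in the question.

```matlab
names = {'var1', 'var2', 'var3'};
n = numel(names);   % assumed to be >= 2

% Index pairs in lexicographic order: (1,2), (1,3), ..., (n-1,n)
pairs = nchoosek(1:n, 2);

% Join the two names of each pair with an 'x' separator
crossNames = strcat(names(pairs(:,1).'), 'x', names(pairs(:,2).'));

% Constant, then linear terms, then interactions (the x2fx column order)
allVars = [{'cons'}, names, crossNames];

% Pair the names with coefficient estimates on random example data
y = randn(100,1);
X = randn(100,n);
Xint = x2fx(X, 'interaction');
res = regstats(y, Xint(:,2:end));   % regstats adds the constant itself

for k = 1:numel(allVars)
    fprintf('%-12s %8.4f\n', allVars{k}, res.beta(k));
end
```

For the three-variable example, allVars comes out as {'cons', 'var1', 'var2', 'var3', 'var1xvar2', 'var1xvar3', 'var2xvar3'}; the printed coefficients depend on the random draw.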

Licensed under: CC-BY-SA with attribution