Using mlogit in R with variables that only apply to certain alternatives

Question 1

I am not well versed in the various implementations of logit models, but I imagine it has to do with making sure you have enough variation across choosers and alternatives so that the matrix can be properly determined. What do you get from the following?

amres<-mlogit(mode~distance| nveh | ivt+board,data=AMLOGIT)

mlogit separates the formula into groups with pipes. As I understand it: the first part holds alternative-specific variables that get a single generic coefficient, the second part holds variables that don't vary across alternatives (i.e. are person-specific only: gender, income; I think nveh belongs here), and the third part holds variables that vary by alternative and get a separate coefficient for each alternative.
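Before deciding which part of the formula a variable belongs in, it can help to check whether it actually varies across alternatives for the same chooser. Here is a minimal base-R sketch, assuming long-format data with one row per chooser and alternative. The data frame `trips`, its values, and the helper `varies_within_chooser` are made up for illustration; only the column names `distance` and `nveh` come from the formula above.

```r
# Hypothetical long-format data: one row per chooser (id) and alternative (alt).
# The values are invented purely for illustration.
trips <- data.frame(
  id       = rep(1:3, each = 2),
  alt      = rep(c("bus", "car"), times = 3),
  distance = c(2.0, 2.5, 5.0, 4.0, 1.0, 1.5),  # differs between alternatives
  nveh     = rep(c(0, 2, 1), each = 2)          # constant within each chooser
)

# Hypothetical helper: TRUE if the variable takes more than one value
# within at least one chooser, i.e. it varies across alternatives.
varies_within_chooser <- function(data, var, id = "id") {
  any(tapply(data[[var]], data[[id]], function(v) length(unique(v)) > 1))
}

varies <- sapply(c("distance", "nveh"),
                 function(v) varies_within_chooser(trips, v))
print(varies)
```

A variable that comes back FALSE (here `nveh`) is person-specific and would go in the second part of the formula; one that comes back TRUE is a candidate for the first or third part.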

Ken Train, incidentally, has a set of vignettes on mlogit specifically that might be helpful. Viton mentions the partition with pipes.

Ken Train's Vignettes

Philip Viton's Vignettes

Yves Croissant's Vignettes

Question 2

Looks like you may have perfect separation. Have you checked this by e.g. looking at crosstables of the variables? (Can't fit a model if one combination of predictors allows for perfect prediction...) Would be helpful to know size of dataset in this regard - you may be over-fitting for the amount of data you have. This is a general problem in modelling, not specific to mlogit.
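As a rough sketch of that cross-table check, the snippet below lists the predictor/outcome combinations that never occur in the data. The data frame `toy`, its columns, and the helper `empty_cells` are hypothetical; substitute your own outcome and categorical predictors. An empty cell is a warning sign worth investigating, not proof of separation.

```r
# Hypothetical data: a nominal outcome (mode) and one categorical predictor.
toy <- data.frame(
  mode   = c("bus", "bus", "car", "car", "car", "walk", "walk", "bus"),
  region = c("north", "north", "south", "south", "north", "north", "north", "south")
)

# Hypothetical helper: return the (predictor level, outcome level) pairs
# that have a count of zero in the cross-table.
empty_cells <- function(data, predictor, outcome) {
  tab <- table(data[[predictor]], data[[outcome]])
  cells <- as.data.frame(tab, stringsAsFactors = FALSE)
  names(cells) <- c(predictor, outcome, "n")
  cells[cells$n == 0, c(predictor, outcome)]
}

zeros <- empty_cells(toy, "region", "mode")
print(zeros)
```

In this toy set nobody in the "south" region chose "walk", so that is the single combination reported.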

You say "the stats look great" but the Pr(>|t|) values and the likelihood ratio test look implausibly significant, which would be consistent with this problem. This means the estimates of the coefficients are likely to be inaccurate. (Are they similar to the coefficients produced by univariate modelling?) Perhaps a simpler model would be more appropriate.

Edit @user3092719 :

You're fitting a generalized linear model, which can easily be overfit (as the outcome variable is discrete or nominal - i.e. has a restricted no. of values). mlogit is an extension of logistic regression; here's a simple example of the latter to illustrate:

> df1 <- data.frame(x=c(0, rep(1, 3)),
                    y=rep(c(0, 1), 2))
> xtabs( ~ x + y, data=df1)
   y
x   0 1
  0 1 0
  1 1 2

Note the zero in the top right corner. This shows 'perfect separation', which means that if x=0 you know for sure that y=0 based on this set. (Strictly, this is quasi-complete separation, since x=1 still gives a mix of outcomes.) So a probabilistic predictive model doesn't make much sense. Some output from

> summary(glm(y ~ x, data=df1, binomial(link = "logit")))

gives

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)   -18.57    6522.64  -0.003    0.998
x              19.26    6522.64   0.003    0.998

Here the Std. Errors are suspiciously large relative to the values of the coefficients. You should also be alerted by Number of Fisher Scoring iterations: 17; needing that many iterations to fit suggests numerical instability.
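These warning signs can also be pulled out of the fitted object rather than read off the printout. This is a sketch using the same df1 as above; the thresholds (100 for the standard error, 1e-6 for the fitted probabilities) are arbitrary assumptions, not established cut-offs.

```r
# Same toy data as in the example above
df1 <- data.frame(x = c(0, rep(1, 3)),
                  y = rep(c(0, 1), 2))

# suppressWarnings() only keeps the console quiet; it does not change the fit
fit <- suppressWarnings(
  glm(y ~ x, data = df1, family = binomial(link = "logit"))
)

est <- coef(summary(fit))  # matrix with Estimate, Std. Error, z value, Pr(>|z|)
checks <- list(
  iterations = fit$iter,                  # Fisher scoring iterations used
  max_se     = max(est[, "Std. Error"]),  # largest standard error
  min_fitted = min(fitted(fit)),          # fitted probabilities close to 0 ...
  max_fitted = max(fitted(fit))           # ... or close to 1 hint at separation
)
str(checks)

# Rough flag; the thresholds are assumptions chosen for illustration
suspicious <- checks$max_se > 100 ||
  checks$min_fitted < 1e-6 ||
  checks$max_fitted > 1 - 1e-6
print(suspicious)
```

The same idea carries over to a fitted mlogit model by inspecting its coefficient table for very large standard errors, though the component names differ from glm's.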

Your solution seems to involve ensuring that this problem of complete separation does not occur in your model, although it is hard to be sure without a minimal working example.