Looks like you may have perfect separation. Have you checked this, e.g. by looking at crosstables of the variables? (You can't fit a model if one combination of predictors allows perfect prediction.) It would help to know the size of your dataset in this regard: you may be overfitting for the amount of data you have. This is a general problem in modelling, not specific to `mlogit`.
You say "the stats look great", but the values for `Pr(>|t|)` and the likelihood ratio test look implausibly significant, which would be consistent with this problem. It means the coefficient estimates are likely to be inaccurate. (Are they similar to the coefficients produced by univariate modelling?) Perhaps a simpler model would be more appropriate.
Edit, in reply to @user3092719:
You're fitting a generalized linear model, which can easily be overfitted because the outcome variable is discrete or nominal, i.e. takes a restricted number of values. `mlogit` is an extension of logistic regression; here's a simple example of the latter to illustrate:
```r
> df1 <- data.frame(x=c(0, rep(1, 3)),
                    y=rep(c(0, 1), 2))
> xtabs(~ x + y, data=df1)
   y
x   0 1
  0 1 0
  1 1 2
```
Note the zero in the top right corner. This shows 'perfect separation': based on this data set, if `x=0` you know for sure that `y=0`. So a probabilistic predictive model doesn't make much sense here.
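This kind of check can be automated. Here's a small sketch (the helper `has_zero_cell` is a hypothetical name, not from the question) that flags a predictor whose crosstable against the outcome contains an empty cell, a warning sign of possible separation:

```r
# Hypothetical helper: TRUE if the crosstable of a predictor against
# the outcome contains an empty cell (a sign of possible separation)
has_zero_cell <- function(x, y) any(xtabs(~ x + y) == 0)

df1 <- data.frame(x = c(0, rep(1, 3)),
                  y = rep(c(0, 1), 2))
has_zero_cell(df1$x, df1$y)  # TRUE: the (x=0, y=1) cell is empty
```

With many predictors you could loop this over each column (and pairs of columns) before fitting.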
Part of the output from

```r
> summary(glm(y ~ x, data=df1, family=binomial(link = "logit")))
```

gives

```r
Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)   -18.57    6522.64  -0.003    0.998
x              19.26    6522.64   0.003    0.998
```
Here the `Std. Error` values are suspiciously large relative to the coefficients. You should also be alerted by `Number of Fisher Scoring iterations: 17`: the large number of iterations needed to fit suggests numerical instability.
Your solution seems to involve ensuring that this problem of complete separation does not occur in your model, although it's hard to be sure without a minimal working example.
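If the separation is real and you still need a fitted model, one common remedy is Firth's bias-reduced (penalized) logistic regression, which gives finite estimates even under separation. A sketch, assuming the third-party `logistf` package is installed (it is not part of base R and is not mentioned in the question):

```r
# Sketch: Firth-penalized logistic regression on the separated data.
# Unlike glm(), logistf() returns finite coefficient estimates here.
library(logistf)

df1 <- data.frame(x = c(0, rep(1, 3)),
                  y = rep(c(0, 1), 2))
fit <- logistf(y ~ x, data = df1)
coef(fit)  # finite estimates, no huge standard errors
```

For multinomial outcomes the same idea exists in penalized multinomial fitters, but simplifying the model or pooling sparse categories is often the better first step.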