Question

I am going nuts trying to figure this out. In R, how can I define the reference level to use in a binary logistic regression? What about multinomial logistic regression? Right now my code is:

logistic.train.model3 <- glm(class ~ x + y + z,
                             family = binomial(link = "logit"),
                             data = auth, na.action = na.exclude)

My response variable takes the values "YES" and "NO". I want to predict the probability of someone responding with "YES".

I DO NOT want to recode the variable to 0 / 1. Is there a way I can tell the model to predict "YES" ?

Thank you for your help.


Solution 2

Assuming you have class saved as a factor, use the relevel() function:

auth$class <- relevel(auth$class, ref = "YES")

OTHER TIPS

Note that with auth$class <- relevel(auth$class, ref = "YES"), you are actually modelling the probability of "NO".

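For a factor response, glm() with a binomial family treats the first level as "failure" and every other level as "success". To predict "YES", the reference level must therefore be "NO", so you have to use auth$class <- relevel(auth$class, ref = "NO").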
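As a quick check of this behaviour, here is a minimal sketch on simulated data. The data frame reuses the names auth, class and x from the question, but the values are made up, and fit_yes, fit_no and auth_flipped are names introduced only for this illustration. It fits the model once with "NO" as the reference and once with "YES", so you can see that the coefficients simply change sign.

```r
# Simulated stand-in for the question's data (values are invented).
set.seed(42)
n <- 200
auth <- data.frame(x = rnorm(n))
p_yes <- plogis(-0.5 + 1.5 * auth$x)          # larger x makes "YES" more likely
auth$class <- factor(ifelse(runif(n) < p_yes, "YES", "NO"))

# Levels are sorted alphabetically by default, so "NO" already comes first.
levels(auth$class)

# Set the reference explicitly anyway; glm() then models P(class == "YES").
auth$class <- relevel(auth$class, ref = "NO")
fit_yes <- glm(class ~ x, family = binomial(link = "logit"), data = auth)

# Same model with "YES" as the reference: this one models P(class == "NO").
auth_flipped <- transform(auth, class = relevel(class, ref = "YES"))
fit_no <- glm(class ~ x, family = binomial(link = "logit"), data = auth_flipped)

coef(fit_yes)   # positive slope on x
coef(fit_no)    # same magnitudes, opposite signs

# Fitted probabilities of "YES" from the first model
head(predict(fit_yes, type = "response"))
```

Because "NO" sorts before "YES", a factor built from those two strings usually has the right reference level already; the relevel() call just makes that explicit and protects against a different level order in your data.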

It's a common mistake, since most of the time the outcome variable is a vector of 0s and 1s and people want to predict 1.

But when such a vector is converted to a factor, the reference level is 0 (see below), so people are effectively predicting 1. Likewise, your reference level must be "NO" so that you predict "YES".

set.seed(1234)
x1 <- sample(c(0, 1), 50, replace = TRUE)
x2 <- factor(x1)
str(x2)
# Factor w/ 2 levels "0","1": 1 2 2 2 2 2 1 1 2 2 ...

You can see that the reference level is 0.
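The question also asks about the multinomial case. One possible approach, sketched here under the assumption that you fit the model with multinom() from the nnet package (the question does not say which function is used), is the same relevel() call: multinom() takes the first factor level as the baseline. The data frame d, the column answer and the three outcome values are hypothetical.

```r
library(nnet)  # recommended package shipped with standard R installations

# Hypothetical three-level outcome on made-up data.
set.seed(42)
n <- 300
d <- data.frame(x = rnorm(n))
d$answer <- factor(sample(c("MAYBE", "NO", "YES"), n, replace = TRUE))

# Make "NO" the baseline category.
d$answer <- relevel(d$answer, ref = "NO")
fit_multi <- multinom(answer ~ x, data = d, trace = FALSE)

# One row of coefficients per non-baseline level, each relative to "NO".
rownames(coef(fit_multi))

# Predicted probabilities for every level, including the baseline.
head(predict(fit_multi, type = "probs"))
```

Each coefficient row describes the log-odds of that level against the baseline, so changing the reference changes which comparisons are reported but not the fitted probabilities.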
Licensed under: CC-BY-SA with attribution