Question

I am trying out logistic regression on a dataset I have:

model <- glm(feature1 ~ feature2, data=df, family="binomial")

But glm does something unexpected: it treats every distinct value of "feature2" (price in the output below) as a separate variable and assigns each one its own coefficient in the logit model.

Output of summary(model):

> summary(model)

Call:
glm(formula = feature1 ~ price, family = binomial(logit), data = df)

Deviance Residuals: 
     Min        1Q    Median        3Q       Max  
-2.22931  -0.00008   0.00008   0.82033   1.97277  

Coefficients:
               Estimate Std. Error z value Pr(>|z|)
(Intercept)   6.931e-01  1.225e+00   0.566    0.571
price0.06     1.887e+01  1.075e+04   0.002    0.999
price0.1     -6.931e-01  1.871e+00  -0.371    0.711
price0.11     1.887e+01  1.075e+04   0.002    0.999
price0.2      1.887e+01  1.075e+04   0.002    0.999
price0.9      1.887e+01  1.075e+04   0.002    0.999
price0.99     1.092e-01  1.269e+00   0.086    0.931
price1        1.253e+00  1.626e+00   0.771    0.441
price1.01     1.887e+01  1.075e+04   0.002    0.999
price1.02     1.887e+01  1.075e+04   0.002    0.999
price1.04     1.887e+01  1.075e+04   0.002    0.999

> typeof(nonNPpriceDf$price)
[1] "integer"

I want price to be a single numeric predictor. I don't understand why each price value is appended to the variable name and treated as its own variable.

Solution

The confusion was between typeof and class. typeof of feature2 (price) was "integer", but its class was "factor", so glm expanded it into one dummy variable per level. I converted feature2 to numeric and the model worked as expected.
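
A minimal sketch of the check and the conversion, assuming the factor levels are numeric strings such as "0.06" or "1.02". The data frame below is made up for illustration and is not the original dataset, and price_num is a hypothetical column name. Note that calling as.numeric() directly on a factor returns the internal level codes (1, 2, 3, ...) rather than the prices, so the conversion goes through as.character() first.

```r
# Hypothetical data: price ended up as a factor (for example, read in as strings)
df <- data.frame(
  feature1 = c(1, 0, 1, 1, 0, 1, 0, 1),
  price    = factor(c("0.06", "0.1", "0.99", "1", "1.02", "0.1", "0.99", "1"))
)

typeof(df$price)  # "integer": storage type of the underlying level codes
class(df$price)   # "factor": this is what makes glm() build dummy variables

# Convert via character so the level labels, not the level codes, become numbers
df$price_num <- as.numeric(as.character(df$price))

# price_num now enters the model as one numeric predictor with one slope
model <- glm(feature1 ~ price_num, data = df, family = binomial)
names(coef(model))  # "(Intercept)" "price_num"
```

Checking class(df$price) or str(df) before fitting is usually enough to catch this, since typeof only reports how the values are stored.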

Licensed under: CC-BY-SA with attribution