Question

I want to use glmnet in R for a classification problem.

The sample data is as follows:

y,x1,x2,x3,x4,x5,x6,x7,x8,x9,x10,x11
1,0.766126609,45,2,0.802982129,9120,13,0,6,0,2
0,0.957151019,40,0,0.121876201,2600,4,0,0,0,1
0,0.65818014,38,1,0.085113375,3042,2,1,0,0,0

y is a binary response (0 or 1).

I used the following R code:

prr=cv.glmnet(x,y,family="binomial",type.measure="auc")
yy=predict(prr,newx, s="lambda.min")

However, the values of yy predicted by glmnet range from about -24 to 5.

How can I restrict the output to [0,1] so that I can use it for classification?

Solution

After reading the manual again, I found that passing type="response" to the predict method produces what I want:

lassopre2=predict(prr,newx, type="response")

This outputs fitted probabilities, which lie in [0,1].
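
To see why the original values fell outside [0,1]: for family="binomial", predict returns the linear predictor (the log-odds) by default, and type="response" applies the inverse logit to it. The following is a minimal base-R sketch of that mapping. The values in `link` are made up for illustration and are not output from the model above.

```r
# Hypothetical link-scale (log-odds) values, chosen to span a range
# similar to the one reported in the question. Not real model output.
link <- c(-24, -2.5, 0, 1.2, 5)

# Inverse logit: 1 / (1 + exp(-link)). This is the transformation that
# type = "response" applies for a binomial model.
prob <- plogis(link)

# Turn probabilities into 0/1 labels with a 0.5 cutoff (an assumed
# threshold; pick one that suits the problem).
pred_class <- as.integer(prob > 0.5)

print(round(prob, 4))
print(pred_class)
```

Two related points: predict also accepts type="class", which returns the predicted label directly, and when s is omitted on a cv.glmnet object, predict uses s="lambda.1se" rather than "lambda.min".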

Other tips

A summary of the glmnet path at each step is displayed if we just enter the object name or use the print function:

  print(fit)

  ## 
  ## Call:  glmnet(x = x, y = y) 
  ## 
  ##       Df   %Dev  Lambda
  ##  [1,]  0 0.0000 1.63000
  ##  [2,]  2 0.0553 1.49000
  ##  [3,]  2 0.1460 1.35000
  ##  [4,]  2 0.2210 1.23000

It shows, from left to right, the number of nonzero coefficients (Df), the percent (of null) deviance explained (%Dev), and the value of λ (Lambda). Although by default glmnet calls for 100 values of lambda, the program stops early if %Dev does not change sufficiently from one lambda to the next (typically near the end of the path).

We can obtain the actual coefficients at one or more values of λ within the range of the sequence:

  coef(fit,s=0.1)

  ## 21 x 1 sparse Matrix of class "dgCMatrix"
  ##                     1
  ## (Intercept)  0.150928
  ## V1           1.320597
  ## V2           .       
  ## V3           0.675110
  ## V4           .       
  ## V5          -0.817412
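
The same coef call should also work on a cross-validated fit such as the one in the question, with s given as "lambda.min" or "lambda.1se". The sketch below uses simulated data, so `x`, `y`, and `cvfit` are illustrative names rather than the asker's objects, and it assumes the glmnet package is installed.

```r
library(glmnet)

set.seed(1)
n <- 200
p <- 10
x <- matrix(rnorm(n * p), n, p)
# Simulated binary response that depends only on the first two columns
y <- rbinom(n, 1, plogis(2 * x[, 1] - 1.5 * x[, 2]))

# Cross-validated lasso logistic regression, as in the question
cvfit <- cv.glmnet(x, y, family = "binomial", type.measure = "auc")

# Coefficients at the lambda with the best cross-validated AUC,
# converted from a sparse matrix to a named numeric vector
beta <- as.matrix(coef(cvfit, s = "lambda.min"))[, 1]

# Names of the terms the lasso kept (nonzero coefficients)
selected <- names(beta)[beta != 0]
print(selected)
```

Entries printed as "." in the sparse output are exactly zero, so `selected` lists only the terms that remain in the model at that λ.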

For more information, see the original explanation by Hastie.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow