Question

I'm trying to use caret to find the best parameters for a gbm model. This code is identical to what I've used on other data sets, and I can't figure out the error.

It seems to fit the model but fails when creating predictions:

predictions failed for Fold2: interaction.depth=4, shrinkage=0.005, n.trees=200 Error in apply(tmp, 2, function(x, nm = modelFit$obsLevels) ifelse(x >=  : 
  dim(X) must have a positive length

Here's the full code:

library(caret)
library(gbm)

myControl <- trainControl(method='cv', number=2, summaryFunction=twoClassSummary,
                          classProbs=TRUE, savePredictions=TRUE, verboseIter=TRUE)


df1 <- data.frame(Y = round(runif(1000), 0), x1=runif(1000), x2=runif(1000) )

X <- df1[,c('x1','x2')]
Y <- factor(paste('X', df1[,'Y']))


gbm_model <- train(X, Y, method='gbm', metric='ROC', trControl=myControl 
                   ,distribution='bernoulli', tuneGrid=expand.grid(.n.trees=seq(100, 200, by=100) 
                   ,.interaction.depth=seq(2, 4, by=2), .shrinkage=c(.005)))

Any suggestions?

EDIT: I'm using gbm 2.1 and caret 5.16.24

Solution

That is a bug. I had a new version of caret to submit today but I'll make these changes prior to sending it in.

There is also a small disconnect between your code and the output you posted. When I run it, I get an additional warning: "At least one of the class levels are not valid R variables names; This may cause errors if class probabilities are generated because the variables names will be converted to: X.0, X.1". Add sep = "" to the paste call and the warning goes away.

Max
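
The warning about class levels can be reproduced without caret or gbm. The following is a minimal sketch in base R, assuming the same construction of Y as in the question; the seed and the variable names (y_raw, bad_levels, good_levels) are hypothetical and only there for illustration. By default paste() joins its arguments with a space, so the levels become "X 0" and "X 1", which are not syntactically valid R names.

```r
# Base R only; mirrors how Y is built in the question.
set.seed(1)  # hypothetical seed, only so the example is reproducible
y_raw <- round(runif(1000), 0)

# paste() uses sep = " " by default, so the levels contain a space
bad_levels <- levels(factor(paste('X', y_raw)))

# With sep = '' the levels are valid R names
good_levels <- levels(factor(paste('X', y_raw, sep = '')))

print(bad_levels)              # levels with a space in them
print(make.names(bad_levels))  # what the levels would be converted to
print(good_levels)             # levels that need no conversion

# TRUE when the levels are already valid names
print(identical(make.names(good_levels), good_levels))
```

paste0('X', y_raw) is equivalent to paste('X', y_raw, sep = '') and would work the same way here.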

OTHER TIPS

You might also get this error if you run a caret model and mistakenly supply a dependent variable that contains only one factor level.
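
One way to catch that case early is to count the observed classes before calling train. The sketch below uses base R only; check_outcome is a hypothetical helper, not a caret function, and the error message text is made up for illustration.

```r
# Hypothetical helper (not part of caret): fail early if the outcome
# has fewer than two observed classes.
check_outcome <- function(y) {
  y <- droplevels(factor(y))  # ignore declared levels that never occur
  counts <- table(y)
  if (length(counts) < 2) {
    stop("outcome has fewer than two observed classes: ",
         paste(names(counts), collapse = ", "))
  }
  counts
}

# Two observed classes: returns the class counts
ok <- check_outcome(factor(c('X0', 'X1', 'X0')))
print(ok)

# One observed class: the error is caught and its message kept
res <- tryCatch(check_outcome(factor(rep('X0', 5))),
                error = function(e) conditionMessage(e))
print(res)
```

The droplevels call matters because a factor can declare two levels while only one of them appears in the data, for example after subsetting.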

Licensed under: CC-BY-SA with attribution