Question

I am trying to use train() in caret to fit a classification model, but I'm hitting some kind of unhandled exception and my R session crashes before outputting any error information in the R console.
Windows error:

R for Windows terminal front-end has stopped working

I am running Windows 7, R 3.0.2, and caret 6.0-21, and have tried running this with both the 32-bit and 64-bit versions of R, in RStudio and directly in the R console, getting the same result each time.

Here is my call to train:

library("AppliedPredictiveModeling")
library("caret")

data("AlzheimerDisease")
data <- data.frame(predictors, diagnosis)

tuneGrid <- expand.grid(interaction.depth = 1:2, n.trees = 100, shrinkage = 0.1)
trainControl <- trainControl(method = "cv", number = 5, verboseIter = TRUE)

gbmFit <- train(diagnosis ~ ., data = data, method = "gbm", trControl = trainControl, tuneGrid = tuneGrid)

There are no longer any errors when I use this parameter grid instead:

tuneGrid <- expand.grid(interaction.depth = 1, n.trees = 100:101, shrinkage = 0.1)

However, I am still getting all NaNs in the ValidDeviance column. Is this normal?

Note: My original problem is resolved, and this is a continuation from the comments section. Blocks of code are unreadable when formatted in comments, so I'm posting it up here. This is no longer a question about caret, but about gbm.

I am still having issues, however, with direct calls to gbm using a single predictor with cv.folds specified. Here is the code:

library("AppliedPredictiveModeling")
library("caret")

data("AlzheimerDisease")
diagnosis <- as.numeric(diagnosis)
diagnosis[diagnosis == 1] <- 0
diagnosis[diagnosis == 2] <- 1
data <- data.frame(diagnosis, predictors[, 1])
gbmFit <- gbm(diagnosis ~ ., data = data, cv.folds = 5)

Again, this works without specifying cv.folds, but with it specified, the call returns an error:

Error in checkForRemoteErrors(val) :  5 nodes produced errors; first error: incorrect number of dimensions

Solution

It is a bug that occurs when method = 'gbm' is used with a single model (i.e. nrow(tuneGrid) == 1). I'm about to release a new version, and I will fix this in that release.

One side note... it looks like you want to do classification. In that case, y should be a factor (and you shouldn't use plain integers as the classes); otherwise it will be doing regression. These changes will work for now:

 y <- factor(paste("Class", y, sep = ""))

and

 tuneGrid <- expand.grid(interaction.depth = 1, 
                         n.trees = 100:101, 
                         shrinkage = 0.1)
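
For reference, here is a minimal sketch that stitches the two snippets above into the original train() call, starting from the question's 0/1 recode of diagnosis. It assumes the same data objects and package versions as in the question; newer caret releases may also expect an n.minobsinnode column in a gbm grid.

library("AppliedPredictiveModeling")
library("caret")

data("AlzheimerDisease")

## the question recoded diagnosis to 0/1; turn that back into a factor with
## non-numeric labels so train() performs classification, not regression
y <- as.numeric(diagnosis)
y[y == 1] <- 0
y[y == 2] <- 1
y <- factor(paste("Class", y, sep = ""))

data <- data.frame(predictors, diagnosis = y)

## a grid with more than one row, which avoids the single-model bug
tuneGrid <- expand.grid(interaction.depth = 1,
                        n.trees = 100:101,
                        shrinkage = 0.1)
trainControl <- trainControl(method = "cv", number = 5, verboseIter = TRUE)

gbmFit <- train(diagnosis ~ ., data = data, method = "gbm",
                trControl = trainControl, tuneGrid = tuneGrid)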

Thanks,

Max

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow