Frank,
This is really similar to your other question on Cross Validated.
You really need to
1) show your exact prediction code for each result
2) give us a reproducible example.
With the normal testSet, RF.CS and RF.CS$finalModel should not be giving you the same results, and we should be able to reproduce that. Plus, there are syntax errors in your code, so it can't be exactly what you executed.
Finally, I'm not really sure why you would use the finalModel object at all. The point of train is to handle the details, and predicting from finalModel directly (which is your option) circumvents the code that would normally be applied, such as the centering and scaling requested through preProc.
Here is a reproducible example:
library(caret)
library(mlbench)
data(Sonar)
set.seed(1)
inTrain <- createDataPartition(Sonar$Class)
training <- Sonar[inTrain[[1]], ]
testing <- Sonar[-inTrain[[1]], ]
pp <- preProcess(training[,-ncol(Sonar)])
training2 <- predict(pp, training[,-ncol(Sonar)])
training2$Class <- training$Class
testing2 <- predict(pp, testing[,-ncol(Sonar)])
testing2$Class <- testing$Class
tc <- trainControl("repeatedcv",
                   number = 10,
                   repeats = 10,
                   classProbs = TRUE,
                   savePredictions = TRUE)
set.seed(2)
RF <- train(Class ~ ., data = training,
            method = "rf",
            trControl = tc)
# normal trainingData
set.seed(2)
RF.CS <- train(Class ~ ., data = training,
               method = "rf",
               trControl = tc,
               preProc = c("center", "scale"))
# scaled and centered trainingData
Here are some results:
> ## These should not be the same
> all.equal(predict(RF, testing, type = "prob")[,1],
+ predict(RF, testing2, type = "prob")[,1])
[1] "Mean relative difference: 0.4067554"
>
> ## Nor should these
> all.equal(predict(RF.CS, testing, type = "prob")[,1],
+ predict(RF.CS, testing2, type = "prob")[,1])
[1] "Mean relative difference: 0.3924037"
>
> all.equal(predict(RF.CS, testing, type = "prob")[,1],
+ predict(RF.CS$finalModel, testing, type = "prob")[,1])
[1] "names for current but not for target"
[2] "Mean relative difference: 0.7452435"
>
> ## These should be and are close (just based on the
> ## random sampling used in the final RF fits)
> all.equal(predict(RF, testing, type = "prob")[,1],
+ predict(RF.CS, testing, type = "prob")[,1])
[1] "Mean relative difference: 0.04198887"
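If you do want to go through finalModel anyway, the sketch below shows roughly what predict on the train object does for you. It is a minimal, self-contained illustration rather than the exact internal code path. The object names (idx, fit, test_x and so on) and the 5-fold CV setting are my own choices to keep it quick, and it assumes the pre-processing is stored in the preProcess element of the train object.

```r
library(caret)
library(mlbench)

data(Sonar)
set.seed(1)
# list = FALSE returns a matrix of row indices rather than a list
idx <- createDataPartition(Sonar$Class, list = FALSE)
train_df <- Sonar[idx, ]
test_df  <- Sonar[-idx, ]

# Assumption: 5-fold CV, chosen only to keep this sketch quick
set.seed(2)
fit <- train(Class ~ ., data = train_df,
             method = "rf",
             trControl = trainControl(method = "cv", number = 5),
             preProcess = c("center", "scale"))

# What train does for you: pre-process the new data, then predict
p_train <- predict(fit, test_df, type = "prob")[, "M"]

# The same steps by hand: apply the stored pre-processing first,
# then call the underlying randomForest object
test_x   <- test_df[, names(test_df) != "Class"]
test_pp  <- predict(fit$preProcess, test_x)
p_manual <- predict(fit$finalModel, test_pp, type = "prob")[, "M"]

# Skipping the pre-processing step gives the forest data on the wrong scale
p_raw <- predict(fit$finalModel, test_x, type = "prob")[, "M"]

all.equal(p_train, unname(p_manual))  # expected to agree
all.equal(p_train, unname(p_raw))     # expected to differ
```

The unname calls are there because predictions from the randomForest object carry row names, while the column pulled from the train prediction does not, which is also where the "names for current but not for target" message above comes from.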
Max