I'm transforming an array into a data frame and I want to use random forest on that data frame. The problem is that I'm getting too much output from predict.

This is a similar example I created to reproduce the problem:

library(randomForest)
matTest <- array(1:5120, dim = c(10, 512))
dataTest <- data.frame(matTest)
dataTest$y <- 1:10
TEST.rf <- randomForest(y ~ ., dataTest)
predict(TEST.rf, data = dataTest[1, ])

The output from predict is:

       1        2        3        4        5        6        7        8        9       10 
3.308430 2.778164 2.749053 3.093386 4.027957 5.143252 6.873542 7.707022 7.902198 7.621082 

but I should be getting only a single numeric value from predict, since each row is an individual sample and I passed only one row.

I don't know what I'm doing wrong...


Solution

You should check ?predict.randomForest to make sure that you know the names of the arguments of the function you intend to use.

You should be using newdata = ... instead.
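As a minimal sketch, here is the example from the question with only the argument name changed. The set.seed call and the pred variable are additions for illustration and are not part of the original code.

```r
library(randomForest)

set.seed(1)  # added here only to make the fit reproducible

# Same toy data as in the question: 10 samples, 512 predictors
matTest <- array(1:5120, dim = c(10, 512))
dataTest <- data.frame(matTest)
dataTest$y <- 1:10

TEST.rf <- randomForest(y ~ ., dataTest)

# newdata (not data) is the argument predict.randomForest looks for
pred <- predict(TEST.rf, newdata = dataTest[1, ])
length(pred)  # one prediction for the one row supplied
```

With newdata supplied, predict returns one value per row of the data frame passed in, so a single row gives a single prediction.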

Since data doesn't match any of the named arguments of predict.randomForest, it is passed on to ... and then ignored. With newdata missing, you get back the default: the out-of-bag predictions for the original data set, one per training row, which is why ten values come back.
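The way a misnamed argument can vanish silently can be sketched in base R. The function toy_predict below is a hypothetical stand-in written for this illustration, not the actual randomForest source; it only mimics the shape of a method that has a newdata argument followed by "...".

```r
# Hypothetical stand-in for a predict method (not the randomForest code):
# a named argument `newdata`, followed by `...`
toy_predict <- function(object, newdata, ...) {
  if (missing(newdata)) {
    # Nothing was bound to newdata, so fall back to the default behaviour
    return("default: predictions for the training data")
  }
  sprintf("predictions for %d new row(s)", nrow(newdata))
}

one_row <- data.frame(x = 1)

# `data` is not a prefix of `newdata`, so it is not matched to it;
# it lands in `...` and is never used
res_wrong <- toy_predict(NULL, data = one_row)

# Naming the argument correctly binds it to newdata
res_right <- toy_predict(NULL, newdata = one_row)

print(res_wrong)
print(res_right)
```

R raises no error or warning in the first call, which is why this kind of mistake is easy to miss.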

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow