Question

I'm converting an array to a data frame and I want to fit a random forest on that data frame. The problem is that I'm getting too much output from predict.

This is a small example I created to reproduce the problem:

library(randomForest)

matTest <- array(1:5120, dim=c(10,512))
dataTest <- data.frame(matTest)
dataTest$y <- 1:10
TEST.rf <- randomForest(y ~ ., dataTest)
predict(TEST.rf, data=dataTest[1,])

The output from predict is:

       1        2        3        4        5        6        7        8        9       10 
3.308430 2.778164 2.749053 3.093386 4.027957 5.143252 6.873542 7.707022 7.902198 7.621082 

But I expected a single numeric value from predict, since I passed only one row and every row should be an individual sample.

I don't know what I'm doing wrong...


Solution

Check ?predict.randomForest to confirm the argument names of the function you are calling.

You should be using newdata = ... instead.
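As a minimal sketch, here is the example from the question with only the argument name changed. It assumes the randomForest package is installed; the set.seed call and the pred_one variable are additions for illustration and are not part of the original post.

```r
library(randomForest)  # assumes the randomForest package is installed

set.seed(1)  # assumed seed, only so repeated runs give the same forest
matTest <- array(1:5120, dim = c(10, 512))
dataTest <- data.frame(matTest)
dataTest$y <- 1:10
TEST.rf <- randomForest(y ~ ., data = dataTest)

# newdata is the documented argument name, so only row 1 is scored
pred_one <- predict(TEST.rf, newdata = dataTest[1, ])
length(pred_one)  # one prediction for the one row supplied
```

The predicted value itself depends on the random seed, so only the length of the result is worth checking here.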

Because data doesn't match any of the named arguments of predict.randomForest, it is absorbed by ... and silently ignored. With newdata missing, predict falls back to its default: the out-of-bag predictions for the original data set, which is why you get 10 values instead of 1.
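The same mechanism can be reproduced in base R without any package. The sketch below uses a hypothetical toy_predict function and toy_model list (neither is part of randomForest) that mimic the relevant behaviour: newdata is a named argument, and anything else falls into ... and is never inspected.

```r
# Hypothetical stand-in for a predict method: newdata is the only named
# data argument; any other named argument lands in `...` and is ignored.
toy_predict <- function(object, newdata, ...) {
  if (missing(newdata)) {
    return(object$fitted)  # default: predictions stored at fit time
  }
  # Toy "prediction": one value per row of newdata
  rep(mean(object$fitted), nrow(newdata))
}

toy_model <- list(fitted = c(1.5, 2.5, 3.5))
one_row <- data.frame(x = 1)

length(toy_predict(toy_model, data = one_row))     # 3: `data` was swallowed by ...
length(toy_predict(toy_model, newdata = one_row))  # 1: the row was actually used
```

R raises no error or warning for the misnamed argument, because "data" is not a prefix of "newdata" and so cannot partially match it, and ... accepts any leftover named argument.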

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow