Question

I am trying to build a random forest model in R (RStudio). My training dataset has around 2 million rows and 38 variables. When I tested on 5,000 rows from this dataset I was able to build the random forest, but when I run it on the whole dataset I get the following error:

Error in randomForest.default(m, y, ...) : long vectors (argument 24) are not supported in .C

Can anyone suggest how I can fix this, other than by reducing the number of rows? Can I train multiple random forests and then combine them into one? If so, how would I go about it?

Many thanks in advance.
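
On the second question, here is a minimal sketch of what "train several forests and combine them" could look like, using the combine() function from the randomForest package. The data frame `train`, the response column `y`, the chunk count, and the ntree value are all hypothetical stand-ins, not taken from the real dataset. R's .C interface cannot receive vectors with more than 2^31 - 1 elements, so fitting each forest on a smaller subset of rows may keep the internal arrays under that limit; whether it does for a given dataset is an assumption to verify.

```r
library(randomForest)

set.seed(42)

# Hypothetical stand-in data; replace with the real training set.
n <- 3000
train <- data.frame(matrix(rnorm(n * 5), ncol = 5))
train$y <- factor(ifelse(train$X1 + train$X2 + rnorm(n) > 0, "a", "b"))

# Assign every row to one of n_chunks random, roughly equal-sized chunks.
# The chunk count is an arbitrary choice for illustration.
n_chunks <- 3
chunk_id <- sample(rep(seq_len(n_chunks), length.out = n))

# Fit one small forest per chunk.
forests <- lapply(seq_len(n_chunks), function(k) {
  rows <- which(chunk_id == k)
  randomForest(y ~ ., data = train[rows, ], ntree = 50)
})

# Merge the per-chunk forests into a single randomForest object.
combined <- do.call(randomForest::combine, forests)

# The merged forest holds the trees of all the component forests.
print(combined$ntree)

# Predictions work as with any randomForest object.
pred <- predict(combined, newdata = train[1:5, ])
print(pred)
```

Note that the merged object does not carry out-of-bag error estimates (components such as err.rate and confusion are NULL after combine()), so accuracy would need to be checked on a separate hold-out set.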

No correct solution

Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange