Question

I have to compare the Support Vector Machine and Random Forest algorithms, but I'm confused about how they can be compared, since a support vector machine is a supervised learning algorithm and a random forest is an ensemble learning method. Can you help me understand on which points I can compare them, for example in classification or in regression?


Solution

TL;DR

Since both SVM and Random Forest are supervised algorithms, you can compare the two like you would compare any other two supervised algorithms.

The fact that a Random Forest is an ensemble classifier doesn't really matter as long as you treat all trees in the forest as a single model.

Comparing two supervised algorithms

The simplest way to compare supervised algorithms is with a train/test split:

  1. Split all your data into two sets, namely a training and a testing set (a common ratio is 0.8/0.2).
  2. Train both models independently with the data from the training set.
  3. Use your models to predict the data from the testing set.
  4. Score the predictions by comparing what each model predicted against the true values from the testing set. If you have a classification problem, you could use the F1 score. If you have a regression problem, you could use the R-squared score.
  5. Pick the model with the best score.
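The five steps above can be sketched with scikit-learn roughly as follows. This is a minimal illustration, not a definitive recipe: the synthetic dataset from `make_classification`, the default hyper-parameters, the `random_state` values, and the `StandardScaler` in front of the SVM (SVMs tend to be sensitive to feature scale) are all assumptions added for the example and are not part of the original answer.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumption: synthetic binary classification data stands in for your own data.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Step 1: split into training and testing sets with a 0.8/0.2 ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Step 2: define both models; each is trained independently below.
# Assumption: the SVM is wrapped in a pipeline that standardizes features first.
models = {
    "svm": make_pipeline(StandardScaler(), SVC()),
    "random_forest": RandomForestClassifier(random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # Step 3: predict the testing set.
    y_pred = model.predict(X_test)
    # Step 4: score predictions against the true labels (F1 for classification).
    scores[name] = f1_score(y_test, y_pred)

# Step 5: pick the model with the best score.
best = max(scores, key=scores.get)
print(scores)
print("best model:", best)
```

For a regression problem, the same structure should carry over by swapping in `SVR`, `RandomForestRegressor`, and `r2_score`. Note that the whole forest is fitted and scored as a single model, exactly like the SVM.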

Other ways of comparing two supervised algorithms

  • Instead of a train/test split, you could look into cross-validation.
  • Instead of a random train/test split, you could look into stratified or time splits.
  • Instead of comparing two different algorithms, you could compare the same algorithm against itself with different hyper-parameters (i.e. hyper-parameter optimization).
  • Instead of F1 score or R-squared you could use a metric that best fits your business case.
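A few of the alternatives listed above could look like this in scikit-learn. Again, this is only a sketch under stated assumptions: the synthetic data, the choice of 5 stratified folds, the `n_estimators=50` setting, and the small grid over the SVM's `C` parameter are illustrative choices, not recommendations from the original answer.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.svm import SVC

# Assumption: synthetic binary classification data stands in for your own data.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Stratified folds keep the class proportions similar in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Cross-validation: one F1 score per fold for each algorithm.
cv_scores = {
    "svm": cross_val_score(SVC(), X, y, cv=cv, scoring="f1"),
    "random_forest": cross_val_score(
        RandomForestClassifier(n_estimators=50, random_state=0),
        X, y, cv=cv, scoring="f1",
    ),
}
for name, fold_scores in cv_scores.items():
    print(name, "mean F1:", fold_scores.mean(), "std:", fold_scores.std())

# Hyper-parameter optimization: the same algorithm compared against itself.
# Assumption: the grid over C is a hypothetical example grid.
search = GridSearchCV(SVC(), param_grid={"C": [0.1, 1, 10]}, cv=cv, scoring="f1")
search.fit(X, y)
print("best params:", search.best_params_, "best mean F1:", search.best_score_)
```

Comparing the mean and spread of the per-fold scores is usually more informative than a single train/test score, since it shows how much the result depends on the particular split. The `scoring` argument is where a metric that better fits your business case would go.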
License: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange