How to compare a supervised learning algorithm and an ensemble learning algorithm?
10-12-2020
Question
I have to compare the Support Vector Machine and Random Forest algorithms, but I'm confused about how they can be compared, since Support Vector Machine is a supervised learning algorithm and Random Forest is an ensemble learning method. Please help me understand on which points I can compare them, for example in classification and in regression.
Solution
TL;DR
Since both SVM and Random Forest are supervised algorithms, you can compare the two like you would compare any other two supervised algorithms.
The fact that a Random Forest is an ensemble classifier doesn't really matter as long as you treat all trees in the forest as a single model.
Comparing two supervised algorithms
The simplest way to compare supervised algorithms is with a train/test split:
- Split all your data into two sets, namely a training and a testing set (a common ratio is 0.8/0.2).
- Train both models independently with the data from the training set.
- Use your models to predict the data from the testing set.
- Score the predictions by comparing what each model predicted against the true values from the testing set. If you have a classification problem, you could use the F1 score. If you have a regression problem, you could use the R-squared score.
- Pick the model with the best score.
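As a rough illustration, the steps above could be sketched with scikit-learn as follows. The synthetic dataset from `make_classification`, the `StandardScaler` in front of the SVM, and every parameter value are assumptions made for this example and are not part of the original answer.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic binary classification data (a stand-in for your own dataset).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Step 1: split into training and testing sets (0.8/0.2).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# The whole forest is treated as one model, just like the SVM.
# Scaling the inputs for the SVM is an assumption of this sketch.
models = {
    "svm": make_pipeline(StandardScaler(), SVC()),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)               # Step 2: train on the training set
    y_pred = model.predict(X_test)            # Step 3: predict the testing set
    scores[name] = f1_score(y_test, y_pred)   # Step 4: score the predictions

# Step 5: pick the model with the best score.
best = max(scores, key=scores.get)
print(scores, best)
```

For a regression problem, the same loop should work with `SVR`, `RandomForestRegressor` and `r2_score` swapped in.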
Other ways of comparing two supervised algorithms
- Instead of a train/test split, you could look into cross-validation.
- Instead of a random train/test split, you could look into stratified or time splits.
- Instead of comparing two different algorithms, you could compare the same algorithm against itself but with different hyper-parameters (i.e. hyper-parameter optimization).
- Instead of the F1 score or R-squared, you could use a metric that best fits your business case.
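A minimal sketch of some of these alternatives, again assuming scikit-learn and synthetic data: stratified 5-fold cross-validation replaces the single split, and a small grid search compares the SVM against itself. The fold count, the candidate values for `C`, and the scaling step are illustrative assumptions only.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic data (a stand-in for your own dataset).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Stratified folds keep the class proportions similar in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

svm = make_pipeline(StandardScaler(), SVC())
forest = RandomForestClassifier(n_estimators=50, random_state=0)

# Cross-validation: one F1 score per fold for each model.
cv_scores = {
    "svm": cross_val_score(svm, X, y, cv=cv, scoring="f1"),
    "random_forest": cross_val_score(forest, X, y, cv=cv, scoring="f1"),
}
for name, fold_scores in cv_scores.items():
    print(name, fold_scores.mean(), fold_scores.std())

# Hyper-parameter optimization: the same algorithm with different settings.
# "svc__C" addresses the C parameter of the SVC step inside the pipeline.
search = GridSearchCV(svm, param_grid={"svc__C": [0.1, 1, 10]}, cv=cv, scoring="f1")
search.fit(X, y)
print(search.best_params_, search.best_score_)
```

Comparing the mean and spread of the fold scores is usually more informative than a single train/test score, because it shows how much each model's result depends on the particular split.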
Not affiliated with datascience.stackexchange