Multivariate Decision Tree learner

https://stackoverflow.com/questions/2484761

21-09-2019
|

Question

A lot univariate decision tree learner implementations (C4.5 etc) do exist, but does actually someone know multivariate decision tree learner algorithms?

Solution

Bennett and Blue's A Support Vector Machine Approach to Decision Trees does multivariate splits by using embedded SVMs for each decision in the tree.

Similarly, in Multicategory classification via discrete support vector machines (2009) , Orsenigo and Vercellis embed a multicategory variant of discrete support vector machines (DSVM) into the decision tree nodes.

OTHER TIPS

CART algorithm for decisions tree can be made into a Multivariate. CART is a binary splitting algorithm as opposed to C4.5 which creates a node per unique value for discrete values. They use the same algorithm for MARS as for missing values too.

To create a Multivariant tree you compute the best split at each node, but instead of throwing away all splits that weren't the best you take a portion of those (maybe all), then evaluate all of the data's attributes by each of the potential splits at that node weighted by the order. So the first split (which lead to the maximum gain) is weighted at 1. Then the next highest gain split is weighted by some fraction < 1.0, and so on. Where the weights decrease as the gain of that split decreases. That number is then compared to same calculation of the nodes within the left node if it's above that number go left. Otherwise go right. That's pretty rough description, but that's a multi-variant split for decision trees.

Yes, there are some, such as OC1, but they are less common than ones which make univariate splits. Adding multivariate splits expands the search space enormously. As a sort of compromise, I have seen some logical learners which simply calculate linear discriminant functions and add them to the candidate variable list.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow