Question

I am using XGBoost for a machine learning model that learns from tabular data.

XGBoost uses boosting on decision trees. When I look at the decision-making logic of a decision tree, I notice that each split is based on a single feature at a time. In real life, though, certain features are related to each other.

Currently, I simply feed all the features to the model without telling it how certain features are related to each other.
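
Here is a rough sketch of what I currently do (the column names and values are made up for illustration, not from my actual data):

```python
import pandas as pd
import xgboost as xgb

# Toy stand-in for my real tabular data (names/values are illustrative only).
X = pd.DataFrame({
    "gender": [0, 1, 1, 0, 1, 0],                      # 0 = male, 1 = female
    "hair_length": [5.0, 30.0, 12.0, 3.0, 25.0, 7.0],
})
y = pd.Series([0, 1, 0, 0, 1, 0])

# All features go in as-is; nothing tells the model that the columns interact.
model = xgb.XGBClassifier(n_estimators=10, max_depth=3)
model.fit(X, y)

# Each node in the dumped tree tests a single feature, which is what prompted my question.
print(model.get_booster().get_dump()[0])
```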

Let me describe a hypothetical example to make this clearer. Suppose I have two features: gender and length of hair. In this hypothetical problem, I know from my domain knowledge that if gender is female, the length of hair matters in determining the outcome, while if gender is male, the length of hair is irrelevant. How do I tell the machine learning model this valuable piece of information so that it can learn better?
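
To make the hypothetical concrete, this is roughly the kind of rule I mean. The masked column below is just one hand-crafted way I could imagine expressing it (names and values are illustrative only), not necessarily the right approach, which is what I am asking about:

```python
import pandas as pd

df = pd.DataFrame({
    "gender": ["male", "female", "female", "male"],
    "hair_length": [5.0, 30.0, 12.0, 3.0],
})

# Domain knowledge: hair_length only matters when gender is female,
# so zero it out for males.
df["hair_length_if_female"] = df["hair_length"].where(df["gender"] == "female", 0.0)
print(df)
```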

I am using XGBoost with Python 3.7.

No correct solution
