Question

What are some systematic ways to classify variables as categorical or numeric? I believe that relying only on intuition in such scenarios can often lead to major, irreversible errors. What are the best strategies for categorising variables?

For example, the dataframe I'm working with has several categorical variables, such as is_holiday, which has labels for several holidays. However, certain variables like visibility_in_miles look as though they should be treated as categorical too. Part of the reason is that while most variables have hundreds of unique values, some have only 9.
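The unique-value comparison described above can be made explicit with a short per-column summary. This is only a rough sketch, assuming pandas: the helper name summarise_cardinality and the toy values are hypothetical and are not taken from the real dataset; only the column names is_holiday and visibility_in_miles come from the question.

```python
import pandas as pd


def summarise_cardinality(df: pd.DataFrame) -> pd.DataFrame:
    """Return dtype, distinct-value count, and distinct-value share per column."""
    n_rows = len(df)
    summary = pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "n_unique": df.nunique(dropna=True),  # NaN is not counted as a value
    })
    # Share of rows that are distinct; max() guards against an empty frame.
    summary["unique_ratio"] = summary["n_unique"] / max(n_rows, 1)
    # Lowest-cardinality columns first, since those are the ones to review.
    return summary.sort_values("n_unique")


# Hypothetical toy data, made up purely for illustration.
df = pd.DataFrame({
    "is_holiday": ["None", "Christmas Day", "None", "Labor Day", "None", "None"],
    "visibility_in_miles": [1, 2, 2, 9, 5, 1],
    "temperature": [288.3, 289.4, 290.1, 291.7, 285.2, 284.9],
})

print(summarise_cardinality(df))
```

A low n_unique only flags a column for a closer look. It does not by itself make the column categorical: a numeric column with few distinct values, such as visibility_in_miles, may still be an ordered measurement.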

No correct solution

Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange