Question

I am currently reading an introductory machine learning book by Daumé (ch. 3, p. 30). When discussing the mapping of a categorical feature with n possible values into n binary indicator features, the book poses the following question:

Is it a good idea to map a categorical feature with n values to log2(n) binary features?

Why wouldn't that be a good idea, given how many resources could be saved by working with fewer features? Does the answer depend on the model being used?
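
For concreteness, here is a minimal sketch of the two encodings being compared, assuming a toy feature with four categories. The variable names and the way the compact code is built (writing each category's integer index in base 2 across ceil(log2(n)) columns) are my own illustration, not taken from the book:

```python
import numpy as np

# Toy categorical feature with n = 4 possible values.
categories = ["red", "green", "blue", "yellow"]
values = ["green", "blue", "green", "yellow", "red"]

n = len(categories)
index = {c: i for i, c in enumerate(categories)}

# One-hot ("n binary indicator features"): one column per category.
one_hot = np.zeros((len(values), n), dtype=int)
for row, v in enumerate(values):
    one_hot[row, index[v]] = 1

# Compact code ("log2(n) binary features"): each category's integer
# index written in base 2, using ceil(log2(n)) columns.
width = int(np.ceil(np.log2(n)))
binary = np.array(
    [[(index[v] >> bit) & 1 for bit in reversed(range(width))] for v in values],
    dtype=int,
)

print(one_hot)  # shape (5, 4): one indicator column per category
print(binary)   # shape (5, 2): fewer columns, but categories share bits
```

My own reading of the trade-off: in the compact encoding, unrelated categories share bits, so for example a linear model can no longer learn an independent weight per category, which seems to be the point the book's question is driving at.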

No correct solution
