Optimal Split for Categorical Features

It is common to represent categorical features with one-hot encoding, but this approach is suboptimal for tree learners. Particularly for high-cardinality categorical features, a tree built on one-hot features tends to be unbalanced and needs to grow very deep to achieve good accuracy. Instead of one-hot encoding, the optimal solution is to split on a… Continue reading Optimal Split for Categorical Features

Published
Categorized as Preprocess