DrDubWiki | Feature Discretization Library

While writing my feature engineering book, I came across many clever feature discretization techniques developed in the 1990s and earlier. Back when computers had limited memory, reducing the granularity of features yielded important memory savings. For details, see:

García, Salvador, et al. "A survey of discretization techniques: Taxonomy and empirical analysis in supervised learning." IEEE Transactions on Knowledge and Data Engineering 25.4 (2012): 734-750.

Since then, interest in discretization has waned, which seems like a mistake. Supervised discretization (the type of discretization that takes the target class into account) can potentially improve the signal-to-noise ratio by reducing nuisance variation present in the data.
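
To make "takes the target class into account" concrete, here is a minimal sketch of a single supervised cut: it tries every boundary between adjacent distinct values and keeps the one that leaves the purest class mix on each side, measured by weighted entropy. This is only in the spirit of the entropy-based methods; it finds one cut, with no recursion and no stopping criterion. The function names (`entropy`, `best_cut`, `discretize`) and the toy data are made up for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_cut(values, labels):
    """Return the cut point with the lowest weighted class entropy.

    Returns None when all values are identical (no boundary exists).
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_point, best_score = None, float("inf")
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no boundary between identical values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        # Entropy of each side, weighted by the share of points it holds.
        score = (len(left) * entropy(left) + len(right) * entropy(right)) / n
        if score < best_score:
            best_point, best_score = (lo + hi) / 2, score
    return best_point


def discretize(values, cut):
    """Map each value to bin 0 (at or below the cut) or bin 1 (above it)."""
    return [int(v > cut) for v in values]


# Toy data (hypothetical): the class changes between 2.1 and 7.5.
values = [1.0, 1.2, 1.9, 2.1, 7.5, 8.0, 8.3, 9.1]
labels = ["a", "a", "a", "a", "b", "b", "b", "b"]

cut = best_cut(values, labels)
bins = discretize(values, cut)
print(cut, bins)
```

The small differences among values within each class are the nuisance variation; after the cut, every point collapses to one of two bins that line up with the target.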

Building a library, or contributing to existing frameworks (e.g., going beyond KBinsDiscretizer in scikit-learn), could be useful to the field.
