NLG4SHAP

This idea is part of the A Dollar Worth of Ideas series: potential open source, research, or data science projects and contributions for people to pursue. I would be interested in mentoring some of them; just contact me for details.


Introduced in 2017, SHAP (SHapley Additive exPlanations) has become a popular tool for feature explanations.

The current approach involves visualizing the SHAP values in a variety of plots (see Christoph Molnar's Interpretable Machine Learning book for some examples). At the instance level, the result is a plot like this one from my feature engineering in tourism book chapter:

[Figure: instance-level SHAP explanation plot]
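For context, here is a minimal sketch of how such an instance-level plot is typically produced with the shap Python package. The model, dataset, and settings below are illustrative placeholders, not the ones used in the book chapter.

```python
# Illustrative sketch: computing SHAP values and rendering an instance-level plot.
# The choice of dataset and model here is an assumption for demonstration purposes.
import shap
import xgboost
from sklearn.datasets import fetch_california_housing

X, y = fetch_california_housing(return_X_y=True, as_frame=True)
model = xgboost.XGBRegressor(n_estimators=100).fit(X, y)

# Explain a small subsample to keep the example fast.
explainer = shap.Explainer(model, X)
shap_values = explainer(X.iloc[:100])

# Instance-level explanation: one contribution per feature, readable only
# while the number of features (and the length of their names) stays small.
shap.plots.waterfall(shap_values[0])
```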

The plots are visually appealing and, for a small number of features, very informative. But when the model has a large number of similarly contributing features with cumbersome names, the plots get truncated and lose their efficacy.

Instead, the SHAP values could be described in text using natural language generation (NLG), potentially reusing some ideas and technology from the Thoughtland project. It might even be possible to extend the descriptions to reduce the risk of adversarial attacks on SHAP.
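As a rough illustration of the idea, the sketch below turns the SHAP values of a single instance into a short textual summary using a simple template. The function name, wording, and top_k cut-off are hypothetical choices made for this sketch, not part of any existing library.

```python
# Hedged sketch of template-based NLG over SHAP values for one instance.
def describe_instance(shap_values, feature_names, top_k=3):
    """Return a plain-English summary of the strongest SHAP contributions."""
    ranked = sorted(zip(feature_names, shap_values),
                    key=lambda pair: abs(pair[1]), reverse=True)
    parts = []
    for name, value in ranked[:top_k]:
        direction = "pushes the prediction up" if value > 0 else "pushes the prediction down"
        parts.append(f"'{name}' {direction} by {abs(value):.2f}")
    rest = len(ranked) - top_k
    tail = (f"; the other {rest} feature(s) contribute little individually."
            if rest > 0 else ".")
    return "For this instance, " + ", ".join(parts) + tail


# Example with made-up SHAP values and feature names:
print(describe_instance([0.8, -0.5, 0.1, 0.05],
                        ["median_income", "house_age", "rooms", "latitude"]))
```

A real NLG component would go further (for example, aggregating groups of similarly contributing features and varying the phrasing), but even a simple template like this sidesteps the truncation problem of the plots.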