From Shapley Values to Generalized Additive Models and back
In explainable machine learning, local post-hoc explanation algorithms and inherently interpretable models are often seen as competing approaches. In this work, offer a novel perspective on Shapley Values, a prominent post-hoc explanation technique, and show that it is strongly connected with Glassbox-GAMs, a popular class of interpretable models. We introduce n-Shapley Values, a natural extension of Shapley Values that explain individual predictions with interaction terms up to order n. As n increases, the n-Shapley Values converge towards the Shapley-GAM, a uniquely determined decomposition of the original function. From the Shapley-GAM, we can compute Shapley Values of arbitrary order, which gives precise insights into the limitations of these explanations. We then show that Shapley Values recover generalized additive models of order n, assuming that we allow for interaction terms up to order n in the explanations. This implies that the original Shapley Values recover Glassbox-GAMs. At the technical end, we show that there is a one-to-one correspondence between different ways to choose the value function and different functional decompositions of the original function. This provides a novel perspective on the question of how to choose the value function. We also present an empirical analysis of the degree of variable interaction that is present in various standard classifiers, and discuss the implications of our results for algorithmic explanations. A python package to compute n-Shapley Values and replicate the results in this paper is available at <https://github.com/tml-tuebingen/nshap>.
READ FULL TEXT