Robust Ante-hoc Graph Explainer using Bilevel Optimization

by Mert Kosan, et al.

Explaining the decisions made by machine learning models for high-stakes applications is critical for increasing transparency and guiding improvements to these decisions. This is particularly true in the case of models for graphs, where decisions often depend on complex patterns combining rich structural and attribute data. While recent work has focused on designing so-called post-hoc explainers, the question of what constitutes a good explanation remains open. One intuitive property is that explanations should be sufficiently informative to enable humans to approximately reproduce the predictions given the data. However, we show that post-hoc explanations do not achieve this goal as their explanations are highly dependent on fixed model parameters (e.g., learned GNN weights). To address this challenge, this paper proposes RAGE (Robust Ante-hoc Graph Explainer), a novel and flexible ante-hoc explainer designed to discover explanations for a broad class of graph neural networks using bilevel optimization. RAGE is able to efficiently identify explanations that contain the full information needed for prediction while still enabling humans to rank these explanations based on their influence. Our experiments, based on graph classification and regression, show that RAGE explanations are more robust than existing post-hoc and ante-hoc approaches and often achieve similar or better accuracy than state-of-the-art models.
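To make the bilevel idea concrete, here is a toy sketch: an inner problem fits model weights with an explanation mask held fixed, and an outer problem updates the mask so that the inner optimum predicts well while the mask stays sparse. Everything in this snippet is a hypothetical simplification for illustration — a linear model stands in for the GNN, a feature mask stands in for a subgraph explanation, and finite differences replace differentiating through the inner solver; it is not the paper's actual method.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 60, 5
X = rng.normal(size=(n, d))
true_w = np.array([2.0, -1.0, 0.0, 0.0, 0.0])  # only features 0 and 1 matter
y = X @ true_w + 0.01 * rng.normal(size=n)

def inner_solution(mask, reg=1e-3):
    """Inner problem: fit model weights with the mask held fixed
    (closed-form ridge regression stands in for GNN training)."""
    Xm = X * mask
    return np.linalg.solve(Xm.T @ Xm + reg * np.eye(d), Xm.T @ y)

def outer_loss(mask, sparsity=0.05):
    """Outer problem: prediction loss at the inner optimum plus a
    sparsity penalty that keeps the explanation mask small."""
    w = inner_solution(mask)
    resid = (X * mask) @ w - y
    return resid @ resid / n + sparsity * np.abs(mask).sum()

# Outer loop: gradient descent on the mask (finite differences for
# brevity; practical bilevel methods differentiate through the inner
# solver or use implicit gradients instead).
mask = np.full(d, 0.5)
lr, eps = 0.1, 1e-5
for _ in range(200):
    grad = np.array([
        (outer_loss(mask + eps * e) - outer_loss(mask - eps * e)) / (2 * eps)
        for e in np.eye(d)
    ])
    mask = np.clip(mask - lr * grad, 0.0, 1.0)

print(np.round(mask, 3))
```

The sparsity penalty drives the mask entries of uninformative features toward zero, while the outer prediction loss keeps the informative entries positive — which is what lets the learned mask both carry the information needed for prediction and rank features by influence, analogous to the ranking property described in the abstract.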



