Which Shortcut Cues Will DNNs Choose? A Study from the Parameter-Space Perspective

10/06/2021
by Luca Scimeca, et al.

Deep neural networks (DNNs) often rely on easy-to-learn discriminatory features, or cues, that are not necessarily essential to the problem at hand. For example, ducks in an image may be recognized based on their typical background scenery, such as lakes or streams. This phenomenon, also known as shortcut learning, is emerging as a key limitation of the current generation of machine learning models. In this work, we introduce a set of experiments to deepen our understanding of shortcut learning and its implications. We design a training setup with several shortcut cues, named WCST-ML (after the Wisconsin Card Sorting Test), where each cue is equally conducive to the visual recognition problem at hand. Even under equal opportunities, we observe that (1) certain cues are preferred to others, (2) solutions biased to the easy-to-learn cues tend to converge to relatively flat minima on the loss surface, and (3) the solutions focusing on those preferred cues are far more abundant in the parameter space. We explain the abundance of certain cues via their Kolmogorov (descriptional) complexity: solutions corresponding to Kolmogorov-simple cues are abundant in the parameter space and are thus preferred by DNNs. Our studies are based on the synthetic dataset dSprites and the face dataset UTKFace. In our WCST-ML, we observe that the inborn bias of models leans toward simple cues, such as color and ethnicity. Our findings emphasize the importance of active human intervention to remove the inborn model biases that may cause negative societal impacts.
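To make the WCST-ML setup concrete, the sketch below builds a toy "diagonal" training set in which three cues (color, scale, and position) are each perfectly correlated with the label, together with "off-diagonal" evaluation splits that isolate one cue at a time. This is a minimal NumPy illustration in the spirit of the paper, not the authors' released code; all function names here are hypothetical.

```python
"""A minimal sketch of a WCST-ML-style 'diagonal' dataset: every cue is
perfectly predictive of the label during training, so any single cue
suffices to solve the task. Off-diagonal splits reveal which cue a
trained model actually uses."""
import numpy as np

rng = np.random.default_rng(0)

N_CLASSES = 3
COLORS = np.array([[255, 0, 0], [0, 255, 0], [0, 0, 255]])  # one color per class
SCALES = [4, 8, 12]                                          # one square size per class
POSITIONS = [(4, 4), (12, 12), (20, 20)]                     # one location per class

def render(color, scale, pos, size=32):
    """Draw a single colored square at the given location on a blank canvas."""
    img = np.zeros((size, size, 3), dtype=np.uint8)
    r, c = pos
    img[r:r + scale, c:c + scale] = color
    return img

def make_diagonal(n_per_class=100):
    """Training split: all cues agree with the label (the 'diagonal')."""
    xs, ys = [], []
    for k in range(N_CLASSES):
        for _ in range(n_per_class):
            xs.append(render(COLORS[k], SCALES[k], POSITIONS[k]))
            ys.append(k)
    return np.stack(xs), np.array(ys)

def make_offdiagonal(cue, n=300):
    """Evaluation split isolating one cue: that cue follows the label while
    the others are randomized. Above-chance accuracy on this split means
    the model relies on `cue`."""
    xs, ys = [], []
    for _ in range(n):
        k = int(rng.integers(N_CLASSES))
        color = COLORS[k] if cue == "color" else COLORS[rng.integers(N_CLASSES)]
        scale = SCALES[k] if cue == "scale" else SCALES[rng.integers(N_CLASSES)]
        pos = POSITIONS[k] if cue == "position" else POSITIONS[rng.integers(N_CLASSES)]
        xs.append(render(color, scale, pos))
        ys.append(k)
    return np.stack(xs), np.array(ys)

x_train, y_train = make_diagonal()
for cue in ("color", "scale", "position"):
    x_eval, y_eval = make_offdiagonal(cue)
    # Train any classifier on (x_train, y_train), then compare its accuracy
    # across the three off-diagonal splits to see which cue it latched onto.
```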
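Finding (2) concerns the flatness of the minima reached by cue-biased solutions. One rough way to probe local flatness, sketched below, is to perturb the trained parameters with Gaussian noise of increasing radius and record how quickly the loss rises; flatter minima degrade more slowly. This is our illustration rather than the paper's measurement protocol, and it assumes a trained PyTorch `model` and a scalar-valued loss closure `eval_loss` (both hypothetical).

```python
import copy
import torch

def flatness_profile(model, eval_loss, radii=(0.01, 0.02, 0.05), n_samples=10):
    """Average loss increase under isotropic Gaussian weight perturbations
    of each radius. Smaller increases at a given radius suggest a flatter
    minimum. `eval_loss(model)` is assumed to return a Python float."""
    base = eval_loss(model)
    profile = {}
    for r in radii:
        losses = []
        for _ in range(n_samples):
            noisy = copy.deepcopy(model)
            with torch.no_grad():
                for p in noisy.parameters():
                    p.add_(r * torch.randn_like(p))  # perturb in place
            losses.append(eval_loss(noisy))
        profile[r] = sum(losses) / n_samples - base
    return profile
```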
