Overoptimization Failures and Specification Gaming in Multi-agent Systems

10/16/2018
by David Manheim, et al.

Overoptimization failures in machine learning and AI can involve specification gaming, reward hacking, fragility to distributional shifts, and Goodhart's or Campbell's law. These failure modes are an important challenge in building safe AI systems, and multi-agent systems have additional, related failure modes. In the multi-agent setting these failures are more complex, more problematic, and less well understood, at least partially because they have not yet been observed in practice. This paper explains why this is the case, then lays out some of the classes of such failures: accidental steering, coordination failures, adversarial misalignment, input spoofing or filtering, and goal co-option or direct hacking.
