RECAST: Interactive Auditing of Automatic Toxicity Detection Models

01/07/2020
by Austin P. Wright, et al.
Georgia Institute of Technology

As toxic language becomes nearly pervasive online, there has been increasing interest in leveraging advances in natural language processing (NLP), such as very large transformer models, to automatically detect and remove toxic comments. Despite fairness concerns, a lack of adversarial robustness, and limited prediction explainability in deep learning systems, there is currently little work on auditing these systems and understanding how they work for both developers and users. We present our ongoing work, RECAST, an interactive tool for examining toxicity detection models by visualizing explanations for predictions and providing alternative wordings for detected toxic speech.
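While RECAST itself is a visual analytics interface, the two capabilities the abstract names, attributing a toxicity prediction to individual words and suggesting less toxic rewordings, can be sketched in a few lines. The sketch below is illustrative only and is not the authors' implementation: it assumes the Hugging Face transformers library, the unitary/toxic-bert classifier, and bert-base-uncased as a masked language model, and it uses a simple occlusion heuristic for word-level attribution.

```python
# Illustrative sketch of RECAST-style auditing (not the authors' code):
# attribute a comment's toxicity score to individual words by occlusion,
# then propose alternative wordings with a masked language model.
# Model choices (unitary/toxic-bert, bert-base-uncased) are assumptions.
from transformers import pipeline

toxicity = pipeline("text-classification", model="unitary/toxic-bert")
fill_mask = pipeline("fill-mask", model="bert-base-uncased")

def toxicity_score(text: str) -> float:
    # top_k=None returns scores for every label; keep the "toxic" head.
    return next(r["score"] for r in toxicity(text, top_k=None)
                if r["label"] == "toxic")

def word_attributions(text: str) -> list[tuple[str, float]]:
    # Occlusion: delete one word at a time and measure the score drop.
    # A large drop means the word contributed heavily to the prediction.
    words = text.split()
    base = toxicity_score(text)
    drops = []
    for i, word in enumerate(words):
        reduced = " ".join(words[:i] + words[i + 1:])
        drops.append((word, base - toxicity_score(reduced)))
    return sorted(drops, key=lambda wd: wd[1], reverse=True)

def alternative_wordings(text: str, word: str, top_k: int = 5) -> list[str]:
    # Mask the offending word, let the masked LM fill it in, and keep
    # only candidates that actually lower the toxicity score.
    base = toxicity_score(text)
    masked = text.replace(word, fill_mask.tokenizer.mask_token, 1)
    return [c["token_str"] for c in fill_mask(masked, top_k=top_k)
            if toxicity_score(c["sequence"]) < base]

comment = "you are a complete idiot"
print(word_attributions(comment))             # highest-scoring words first
print(alternative_wordings(comment, "idiot"))  # candidate replacements
```

Here a word's attribution is simply how much the toxicity score falls when the word is removed; gradient- or attention-based attributions would be the natural heavier-weight alternatives for a production tool.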

