Empowering Cross-lingual Behavioral Testing of NLP Models with Typological Features

07/11/2023
by Ester Hlavnova, et al.
A challenge in developing NLP systems for the world's languages is understanding how they generalize to typological differences relevant for real-world applications. To this end, we propose M2C, a morphologically-aware framework for behavioral testing of NLP models. We use M2C to generate tests that probe models' behavior in light of specific linguistic features in 12 typologically diverse languages. We evaluate state-of-the-art language models on the generated tests. While models excel at most tests in English, we highlight generalization failures to specific typological characteristics such as temporal expressions in Swahili and compounding possessives in Finnish. Our findings motivate the development of models that address these blind spots.

research
03/26/2022

Metaphors in Pre-Trained Language Models: Probing and Generalization Across Datasets and Languages

Human languages are full of metaphorical expressions. Metaphors help peo...
research
05/08/2020

Beyond Accuracy: Behavioral Testing of NLP models with CheckList

Although measuring held-out accuracy has been the primary approach to ev...
research
09/05/2023

Automating Behavioral Testing in Machine Translation

Behavioral testing in NLP allows fine-grained evaluation of systems by e...
research
05/31/2023

ChatGPT an ENFJ, Bard an ISTJ: Empirical Study on Personalities of Large Language Models

Large Language Models (LLMs) have made remarkable advancements in the fi...
research
08/05/2021

EENLP: Cross-lingual Eastern European NLP Index

This report presents the results of the EENLP project, done as a part of...
research
02/09/2023

Zeno: An Interactive Framework for Behavioral Evaluation of Machine Learning

Machine learning models with high accuracy on test data can still produc...
research
02/21/2023

NLPLego: Assembling Test Generation for Natural Language Processing Applications

The development of modern NLP applications often relies on various bench...
