Do Transformers Encode a Foundational Ontology? Probing Abstract Classes in Natural Language

01/25/2022
by Mael Jullien, et al.

With the methodological support of probing (or diagnostic classification), recent studies have demonstrated that Transformers encode syntactic and semantic information to some extent. Following this line of research, this paper takes semantic probing to an abstraction extreme with the goal of answering the following research question: can contemporary Transformer-based models reflect an underlying Foundational Ontology? To this end, we present a systematic Foundational Ontology (FO) probing methodology to investigate whether Transformer-based models encode abstract semantic information. Following different pre-training and fine-tuning regimes, we present an extensive evaluation of a diverse set of large-scale language models over three distinct and complementary FO tagging experiments. Specifically, we present and discuss the following conclusions: (1) the probing results indicate that Transformer-based models incidentally encode information related to Foundational Ontologies during the pre-training process; (2) robust FO taggers (accuracy of 90 percent) can be built efficiently by leveraging this knowledge.
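To illustrate the general idea of probing described in the abstract (not the authors' exact setup), the sketch below extracts sentence representations from a pre-trained Transformer and fits a linear probe that predicts a Foundational Ontology class. The model name, the example sentences, and the FO labels ("SocialObject", "Event") are purely illustrative assumptions; the paper's actual taxonomy, datasets, and probe architecture may differ.

```python
# Minimal probing sketch: freeze a pre-trained Transformer, mean-pool its last
# hidden layer into a fixed-size embedding, and train a linear classifier
# (the "probe") to predict a hypothetical FO class for each sentence.
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import LogisticRegression

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def sentence_embedding(sentence: str) -> torch.Tensor:
    """Mean-pool the last hidden layer as a fixed-size sentence representation."""
    inputs = tokenizer(sentence, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # shape: (1, seq_len, dim)
    return hidden.mean(dim=1).squeeze(0)

# Illustrative data: sentences paired with hypothetical FO classes.
sentences = ["The committee approved the proposal.", "The glass shattered."]
fo_labels = ["SocialObject", "Event"]

X = torch.stack([sentence_embedding(s) for s in sentences]).numpy()
probe = LogisticRegression(max_iter=1000).fit(X, fo_labels)
print(probe.predict(X))
```

If the probe reaches high accuracy on held-out data while the Transformer stays frozen, that is taken as evidence the pre-trained representations already encode the probed (here, ontological) information.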

