Noise-Robust Dense Retrieval via Contrastive Alignment Post Training

04/06/2023
by Daniel Campos, et al.

The success of contextual word representations and advances in neural information retrieval have made dense vector-based retrieval a standard approach for passage and document ranking. While effective and efficient, dual-encoders are brittle to variations in query distributions and to noisy queries. Data augmentation can make models more robust, but it adds overhead to training set generation and requires retraining and index regeneration. We present Contrastive Alignment POst Training (CAPOT), a highly efficient finetuning method that improves model robustness without requiring index regeneration, training set optimization, or training set alteration. CAPOT enables robust retrieval by freezing the document encoder while the query encoder learns to align noisy queries with their unaltered root. We evaluate CAPOT on noisy variants of MSMARCO, Natural Questions, and Trivia QA passage retrieval, finding that CAPOT has an impact similar to that of data augmentation with none of its overhead.
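
The alignment idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the typo function, the InfoNCE-style loss, the temperature value, and all names below are assumptions chosen for illustration, and the abstract does not specify the exact loss or noise model. The sketch scores how well embeddings of noisy queries line up with embeddings of their unaltered roots; in a real setup the noisy embeddings would come from the trainable query encoder and the clean embeddings from a frozen copy, while the document encoder and its index stay untouched.

```python
import math
import random


def add_typo(query: str, rng: random.Random) -> str:
    """Swap two adjacent characters (a hypothetical noise model, not from the paper)."""
    if len(query) < 2:
        return query
    i = rng.randrange(len(query) - 1)
    chars = list(query)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


def cosine(u, v):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def alignment_loss(noisy_embs, clean_embs, temperature=0.1):
    """InfoNCE-style loss (an assumed stand-in for the paper's contrastive loss).

    Noisy query i should be closest to clean query i and far from clean query j.
    """
    total = 0.0
    for i, noisy in enumerate(noisy_embs):
        logits = [cosine(noisy, clean) / temperature for clean in clean_embs]
        # Numerically stable log-sum-exp over all clean queries in the batch.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denominator - logits[i]
    return total / len(noisy_embs)


# Toy embeddings: the clean ones play the role of the frozen encoder's output.
clean = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]      # noisy queries close to their roots
misaligned = [[0.1, 0.9], [0.9, 0.1]]   # noisy queries close to the wrong root

noisy_query = add_typo("dense retrieval", random.Random(0))
loss_aligned = alignment_loss(aligned, clean)
loss_misaligned = alignment_loss(misaligned, clean)
```

In this toy setup `loss_aligned` is lower than `loss_misaligned`, which is the signal a query encoder would be trained to minimize. Because only query-side embeddings enter the loss, nothing here requires re-encoding documents.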

research
03/31/2023

Quick Dense Retrievers Consume KALE: Post Training Kullback Leibler Alignment of Embeddings for Asymmetrical dual encoders

In this paper, we consider the problem of improving the inference latenc...
research
07/16/2021

More Robust Dense Retrieval with Contrastive Dual Learning

Dense retrieval conducts text retrieval in the embedding space and has s...
research
05/04/2022

Analysing the Robustness of Dual Encoders for Dense Retrieval Against Misspellings

Dense retrieval is becoming one of the standard approaches for document ...
research
06/05/2023

SamToNe: Improving Contrastive Loss for Dual Encoder Retrieval Models with Same Tower Negatives

Dual encoders have been used for retrieval tasks and representation lear...
research
05/25/2022

Refining Query Representations for Dense Retrieval at Test Time

Dense retrieval uses a contrastive learning framework to learn dense rep...
research
06/17/2023

Typo-Robust Representation Learning for Dense Retrieval

Dense retrieval is a basic building block of information retrieval appli...
research
07/07/2022

Supervised Contrastive Learning Approach for Contextual Ranking

Contextual ranking models have delivered impressive performance improvem...
