Contextualized Generative Retrieval

10/05/2022
by   Hyunji Lee, et al.
0

The text retrieval task is mainly performed in two ways: the bi-encoder approach and the generative approach. The bi-encoder approach maps the document and query embeddings to common vector space and performs a nearest neighbor search. It stably shows high performance and efficiency across different domains but has an embedding space bottleneck as it interacts in L2 or inner product space. The generative retrieval model retrieves by generating a target sequence and overcomes the embedding space bottleneck by interacting in the parametric space. However, it fails to retrieve the information it has not seen during the training process as it depends solely on the information encoded in its own model parameters. To leverage the advantages of both approaches, we propose Contextualized Generative Retrieval model, which uses contextualized embeddings (output embeddings of a language model encoder) as vocab embeddings at the decoding step of generative retrieval. The model uses information encoded in both the non-parametric space of contextualized token embeddings and the parametric space of the generative retrieval model. Our approach of generative retrieval with contextualized vocab embeddings shows higher performance than generative retrieval with only vanilla vocab embeddings in the document retrieval task, an average of 6 and 2X higher in NQ-320k, suggesting the benefits of using contextualized embedding in generative retrieval models.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
04/27/2022

Generative Multi-hop Retrieval

Multi-hop retrieval is the task of retrieving a series of multiple docum...
research
07/16/2021

More Robust Dense Retrieval with Contrastive Dual Learning

Dense retrieval conducts text retrieval in the embedding space and has s...
research
10/24/2018

Text Embeddings for Retrieval From a Large Knowledge Base

Text embedding representing natural language documents in a semantic vec...
research
10/18/2022

CPS-MEBR: Click Feedback-Aware Web Page Summarization for Multi-Embedding-Based Retrieval

Embedding-based retrieval (EBR) is a technique to use embeddings to repr...
research
12/20/2022

Precise Zero-Shot Dense Retrieval without Relevance Labels

While dense retrieval has been shown effective and efficient across task...
research
07/20/2020

Acoustic Neighbor Embeddings

This paper proposes a novel acoustic word embedding called Acoustic Neig...
research
05/21/2023

Retrieving Texts based on Abstract Descriptions

In this work, we aim to connect two research areas: instruction models a...

Please sign up or login with your details

Forgot password? Click here to reset