GripRank: Bridging the Gap between Retrieval and Generation via the Generative Knowledge Improved Passage Ranking

by Jiaqi Bai, et al.

Retrieval-enhanced text generation, which aims to leverage passages retrieved from a large passage corpus to deliver a proper answer to an input query, has shown remarkable progress on knowledge-intensive language tasks such as open-domain question answering and knowledge-enhanced dialogue generation. However, the retrieved passages are not ideal for guiding answer generation because of a discrepancy between retrieval and generation: the candidate passages are all treated equally during retrieval, without considering their potential to generate the proper answer. Because of this discrepancy, a passage retriever delivers a sub-optimal collection of candidate passages for answer generation. In this paper, we propose the GeneRative Knowledge Improved Passage Ranking (GripRank) approach, which addresses this challenge by distilling knowledge from a generative passage estimator (GPE) into a passage ranker, where the GPE is a generative language model used to measure how likely each candidate passage is to generate the proper answer. We realize the distillation procedure by training the passage ranker to rank the passages in the order given by the GPE. Furthermore, we improve the distillation quality by devising a curriculum knowledge distillation mechanism, which allows the knowledge provided by the GPE to be progressively distilled into the ranker through an easy-to-hard curriculum, enabling the passage ranker to correctly recognize the provenance of the answer among many plausible candidates. We conduct extensive experiments on four datasets across three knowledge-intensive language tasks. Experimental results show advantages over state-of-the-art methods for both passage ranking and answer generation on the KILT benchmark.
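
The distillation idea in the abstract can be illustrated with a small sketch. The abstract does not give the loss function or the curriculum schedule, so everything below is an assumption made for illustration only: a listwise KL-divergence loss between the GPE (teacher) and ranker (student) score distributions, and a hypothetical schedule that starts with the distractors the GPE scores lowest and gradually admits more plausible ones. The function names, the `temperature` parameter, and the linear schedule are not taken from the paper.

```python
import math


def softmax(scores, temperature=1.0):
    """Turn raw scores into a probability distribution over candidates."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp((s - m) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def listwise_distill_loss(teacher_scores, student_scores, temperature=1.0):
    """KL(teacher || student) over one list of candidate passages.

    Assumption: the paper may use a different ranking objective.
    """
    p = softmax(teacher_scores, temperature)
    q = softmax(student_scores, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def curriculum_candidates(teacher_scores, gold_index, step, total_steps):
    """Pick the candidates to train on at a given step (hypothetical schedule).

    Distractors are ordered from least to most plausible according to the
    teacher, and the number admitted grows linearly with training progress,
    so early steps are "easy" and late steps are "hard".
    """
    distractors = sorted(
        (i for i in range(len(teacher_scores)) if i != gold_index),
        key=lambda i: teacher_scores[i],
    )
    progress = step / max(1, total_steps - 1)
    k = max(1, math.ceil(progress * len(distractors)))
    return [gold_index] + distractors[:k]


if __name__ == "__main__":
    gpe_scores = [0.9, 0.1, 0.5, 0.7]     # made-up teacher scores
    ranker_scores = [0.2, 0.4, 0.1, 0.3]  # made-up student scores
    for step in (0, 9):
        idx = curriculum_candidates(gpe_scores, gold_index=0, step=step, total_steps=10)
        loss = listwise_distill_loss(
            [gpe_scores[i] for i in idx], [ranker_scores[i] for i in idx]
        )
        print(step, idx, round(loss, 4))
```

In this sketch the ranker sees only the gold passage and the least plausible distractor at the first step, and the full candidate list at the last step. A real system would compute the loss on model outputs and backpropagate through the ranker, which is omitted here.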


Generate rather than Retrieve: Large Language Models are Strong Context Generators

Knowledge-intensive tasks, such as open-domain question answering (QA), ...

KEPR: Knowledge Enhancement and Plausibility Ranking for Generative Commonsense Question Answering

Generative commonsense question answering (GenCQA) is a task of automati...

Metric-guided Distillation: Distilling Knowledge from the Metric to Ranker and Retriever for Generative Commonsense Reasoning

Commonsense generation aims to generate a realistic sentence describing ...

Curriculum Learning for Dense Retrieval Distillation

Recent work has shown that more effective dense retrieval models can be ...

Knowledge Transfer from Answer Ranking to Answer Generation

Recent studies show that Question Answering (QA) based on Answer Sentenc...

KEYword based Sampling (KEYS) for Large Language Models

Question answering (Q/A) can be formulated as a generative task (Mitra, ...

Knowledge-Driven Distractor Generation for Cloze-style Multiple Choice Questions

In this paper, we propose a novel configurable framework to automaticall...
