Efficient Nearest Neighbor Search for Cross-Encoder Models using Matrix Factorization

by Nishant Yadav, et al.
University of Massachusetts Amherst

Efficient k-nearest neighbor search is a fundamental task underlying many problems in NLP. When similarity is measured by the dot-product between dual-encoder vectors or by ℓ_2-distance, many scalable and efficient search methods already exist. This is not the case when similarity is measured by more accurate but more expensive black-box neural similarity models, such as cross-encoders, which jointly encode the query and the candidate neighbor. The high computational cost of cross-encoders typically limits their use to reranking candidates retrieved by a cheaper model, such as a dual encoder or TF-IDF. However, the accuracy of such a two-stage approach is upper-bounded by the recall of the initial candidate set, and the approach may require additional training to align the auxiliary retrieval model with the cross-encoder model. In this paper, we present an approach that avoids the use of a dual-encoder for retrieval, relying solely on the cross-encoder. Retrieval is made efficient with CUR decomposition, a matrix decomposition approach that approximates all pairwise cross-encoder distances from a small subset of rows and columns of the distance matrix. Indexing items using our approach is computationally cheaper than training an auxiliary dual-encoder model through distillation. Empirically, for k > 10, our approach provides test-time recall-vs-computational cost trade-offs superior to the current widely-used methods that re-rank items retrieved using a dual-encoder or TF-IDF.
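To make the CUR idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a synthetic, exactly low-rank matrix as a stand-in for real cross-encoder scores, and every name in it (`cur_approx_scores`, `anchor_rows`, `anchor_block`, the sizes, and the `rcond` cutoff) is hypothetical. Scores of a few anchor queries against all items are computed once; at test time the new query is scored exactly against only a few anchor items, and the remaining scores are approximated.

```python
import numpy as np

def cur_approx_scores(query_anchor_scores, anchor_block, anchor_rows):
    """Approximate one query's scores against every item.

    query_anchor_scores: (k_i,) exact scores of the query vs. anchor items
    anchor_block:        (k_q, k_i) scores of anchor queries vs. anchor items
    anchor_rows:         (k_q, n_items) scores of anchor queries vs. all items
    """
    # Pseudo-inverse of the anchor/anchor block; rcond drops noise-level
    # singular values (the cutoff value is an assumption of this sketch).
    u = np.linalg.pinv(anchor_block, rcond=1e-8)  # shape (k_i, k_q)
    return query_anchor_scores @ u @ anchor_rows  # shape (n_items,)

rng = np.random.default_rng(0)
n_queries, n_items, rank = 50, 200, 5

# Synthetic stand-in for the query-item score matrix (exactly rank 5).
scores = rng.normal(size=(n_queries, rank)) @ rng.normal(size=(rank, n_items))

# Hold out the last query as the test query; sample anchors from the rest.
test_q = n_queries - 1
anchor_q = rng.choice(n_queries - 1, size=10, replace=False)
anchor_i = rng.choice(n_items, size=10, replace=False)

# Offline indexing: anchor queries scored against all items.
anchor_rows = scores[anchor_q, :]
anchor_block = anchor_rows[:, anchor_i]

# Test time: exact scores only for the anchor items, the rest approximated.
query_anchor_scores = scores[test_q, anchor_i]
approx = cur_approx_scores(query_anchor_scores, anchor_block, anchor_rows)

# Candidates ranked by approximate score; a real pipeline could re-score
# these few items exactly with the cross-encoder.
top_k = np.argsort(-approx)[:10]
```

Because the synthetic matrix has exact rank 5 and the anchor block captures that rank, the approximation here matches the true scores up to floating-point error. Real cross-encoder score matrices are only approximately low-rank, so in practice the approximate ranking would typically be followed by exact re-scoring of the retrieved candidates.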



Adaptive Selection of Anchor Items for CUR-based k-NN search with Cross-Encoders

Cross-encoder models, which jointly encode and score a query-item pair, ...

Improving Dual-Encoder Training through Dynamic Indexes for Negative Mining

Dual encoder models are ubiquitous in modern classification and retrieva...

LoopITR: Combining Dual and Cross Encoder Architectures for Image-Text Retrieval

Dual encoders and cross encoders have been widely used for image-text re...

An Empirical Comparison of FAISS and FENSHSES for Nearest Neighbor Search in Hamming Space

In this paper, we compare the performances of FAISS and FENSHSES on near...

Inference-time Re-ranker Relevance Feedback for Neural Information Retrieval

Neural information retrieval often adopts a retrieve-and-rerank framewor...

Large Dual Encoders Are Generalizable Retrievers

It has been shown that dual encoders trained on one domain often fail to...

AdANNS: A Framework for Adaptive Semantic Search

Web-scale search systems learn an encoder to embed a given query which i...
