T2Ranking: A large-scale Chinese Benchmark for Passage Ranking

by Xiaohui Xie, et al.

Passage ranking involves two stages: passage retrieval and passage re-ranking, both of which are important and challenging topics for academia and industry in the area of Information Retrieval (IR). However, the commonly used datasets for passage ranking usually focus on the English language. For non-English scenarios, such as Chinese, the existing datasets are limited in data scale, lack fine-grained relevance annotation, and suffer from false negatives. To address these problems, we introduce T2Ranking, a large-scale Chinese benchmark for passage ranking. T2Ranking comprises more than 300K queries and over 2M unique passages from real-world search engines. Expert annotators are recruited to provide 4-level graded relevance scores (fine-grained) for query-passage pairs instead of binary relevance judgments (coarse-grained). To mitigate the false negative problem, more passages with greater diversity are considered when performing relevance annotations, especially in the test set, to ensure a more accurate evaluation. Apart from the textual query and passage data, other auxiliary resources are also provided, such as query types and XML files of the documents from which the passages are generated, to facilitate further studies. To evaluate the dataset, commonly used ranking models are implemented and tested on T2Ranking as baselines. The experimental results show that T2Ranking is challenging and that there is still room for improvement. The full data and all code are available at https://github.com/THUIR/T2Ranking/
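
To illustrate why 4-level graded relevance can separate rankings that binary judgments cannot, here is a minimal sketch in Python. It assumes nDCG with exponential gain as the metric and assumes grades of 2 or higher count as relevant when binarizing. Neither choice is stated in the abstract, and the grade lists are made-up toy data, not taken from T2Ranking.

```python
import math


def dcg(grades, k):
    """Discounted cumulative gain over the top-k grades (exponential gain)."""
    return sum((2 ** g - 1) / math.log2(i + 2) for i, g in enumerate(grades[:k]))


def ndcg(grades, k):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(grades, reverse=True), k)
    return dcg(grades, k) / ideal if ideal > 0 else 0.0


def binarize(grades, threshold=2):
    """Collapse graded labels to 0/1; the threshold is an assumption."""
    return [1 if g >= threshold else 0 for g in grades]


# Toy relevance grades (0-3) of the passages returned by two hypothetical
# rankers for the same query, listed in ranked order.
ranking_a = [3, 2, 0, 1]
ranking_b = [2, 3, 0, 1]

graded_a, graded_b = ndcg(ranking_a, 4), ndcg(ranking_b, 4)
binary_a, binary_b = ndcg(binarize(ranking_a), 4), ndcg(binarize(ranking_b), 4)

print(f"graded nDCG@4: A={graded_a:.4f}  B={graded_b:.4f}")
print(f"binary nDCG@4: A={binary_a:.4f}  B={binary_b:.4f}")
```

Under these assumptions the two rankings receive the same binary score, while the graded score prefers the ranking that places the most relevant passage first.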
