Online Learning to Rank with List-level Feedback for Image Filtering
Online learning to rank (OLTR) via implicit feedback has been extensively studied for document retrieval in cases where the feedback is available at the level of individual items. To learn from item-level feedback, existing algorithms require certain assumptions about user behavior. In this paper, we study a more general setup: OLTR with list-level feedback, where the feedback is provided only at the level of an entire ranked list. We propose two methods that enable online learning to rank in this setup. The first method, PGLearn, uses a ranking model to generate policies and optimizes that model online using policy gradients. The second method, RegLearn, learns to combine individual document relevance scores by directly predicting the observed list-level feedback through regression. We evaluate the proposed methods on the image filtering task, in which deep neural networks (DNNs) are used to rank images in response to a set of standing queries. We show that PGLearn does not perform well in OLTR with list-level feedback. RegLearn, in contrast, performs well on both online and offline metrics.
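To make the list-level setup concrete, below is a minimal sketch (not the paper's implementation) of the RegLearn idea: per-item relevance scores from a small scorer network are combined into a predicted list-level feedback value, and the scorer is updated online by regressing that prediction against the observed list-level feedback. All names and parameters (ItemScorer, FEATURE_DIM, the mean-based combination) are illustrative assumptions.

```python
import torch
import torch.nn as nn

FEATURE_DIM = 128   # assumed dimensionality of precomputed image features
LIST_SIZE = 10      # assumed number of images shown per ranked list


class ItemScorer(nn.Module):
    """Scores each image feature vector with a single relevance value."""

    def __init__(self, feature_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feature_dim, 64),
            nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # features: (list_size, feature_dim) -> scores: (list_size,)
        return self.net(features).squeeze(-1)


scorer = ItemScorer(FEATURE_DIM)
optimizer = torch.optim.Adam(scorer.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()


def reglearn_step(features: torch.Tensor, list_feedback: float) -> float:
    """One online update from a single ranked list and its list-level feedback.

    The predicted list-level feedback is taken here as the mean of the
    per-item scores; the paper's exact combination function may differ.
    """
    scores = scorer(features)           # per-image relevance scores
    predicted_feedback = scores.mean()  # combine into one list-level prediction
    loss = loss_fn(predicted_feedback, torch.tensor(list_feedback))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


# Simulated interaction: a ranked list of images arrives together with a
# single scalar feedback value for the whole list (e.g. the fraction of
# relevant images), and the scorer is updated from that signal alone.
features = torch.randn(LIST_SIZE, FEATURE_DIM)
observed_feedback = 0.3
reglearn_step(features, observed_feedback)
```

The key design point this sketch illustrates is that no per-item labels are ever needed: the gradient of the list-level regression loss flows back through the combination step to every individual item score.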