Novel Approaches to Accelerating the Convergence Rate of Markov Decision Process for Search Result Diversification

02/23/2018
by   Feng Liu, et al.
Recently, some studies have utilized the Markov Decision Process for search result diversification (MDP-DIV) in information retrieval. Though it delivers promising performance, MDP-DIV suffers from very slow convergence, which hinders its usability in real applications. In this paper, we aim to improve MDP-DIV by speeding up its convergence rate without sacrificing much accuracy. The slow convergence has two main causes: the large action space and data scarcity. On the one hand, the sequential decision at each ranking position must evaluate the query-document relevance for every candidate document, which yields a huge search space for the MDP; on the other hand, owing to data scarcity, the agent must perform more trial-and-error interactions with the environment. To tackle these problems, we propose the MDP-DIV-kNN and MDP-DIV-NTN methods. MDP-DIV-kNN adopts a k-nearest-neighbor strategy, i.e., discarding the k nearest neighbors of the most recently selected action (document), to shrink the diversification search space. MDP-DIV-NTN employs a pre-trained diversification neural tensor network (NTN-DIV) as the evaluation model and combines its results with the MDP to produce the final ranking. The experimental results demonstrate that both proposed methods indeed accelerate the convergence of MDP-DIV, training about 3x faster, while the resulting accuracy barely degrades and sometimes even improves.
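The kNN pruning step can be sketched as follows — a minimal illustration, assuming cosine distance over pre-computed document embeddings (the paper's exact similarity measure and document representation are not specified here; the function name and signature are hypothetical):

```python
import numpy as np

def prune_k_nearest(candidates, embeddings, selected, k):
    """Illustrative MDP-DIV-kNN pruning step: after the agent picks a
    document, discard its k nearest neighbors so near-duplicate
    candidates no longer enlarge the action space.

    candidates : list of candidate document ids
    embeddings : dict mapping doc id -> np.ndarray embedding
    selected   : id of the document just chosen by the agent
    k          : number of nearest neighbors to discard
    """
    remaining = [d for d in candidates if d != selected]
    sel = embeddings[selected]

    # Cosine distance of each remaining candidate to the selected doc.
    def cos_dist(d):
        v = embeddings[d]
        return 1.0 - np.dot(sel, v) / (np.linalg.norm(sel) * np.linalg.norm(v))

    nearest = sorted(remaining, key=cos_dist)[:k]
    return [d for d in remaining if d not in nearest]

# Toy example: "b" is nearly parallel to the selected "a", so it is pruned.
emb = {
    "a": np.array([1.0, 0.0]),
    "b": np.array([0.99, 0.1]),
    "c": np.array([0.0, 1.0]),
    "d": np.array([0.1, 0.99]),
}
print(prune_k_nearest(["a", "b", "c", "d"], emb, "a", k=1))  # ['c', 'd']
```

Each pruning call removes k + 1 documents (the selection plus its k neighbors) from the candidate set, so the number of relevance evaluations at subsequent positions shrinks accordingly.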
