Causal Decision Transformer for Recommender Systems via Offline Reinforcement Learning

04/17/2023
by Siyu Wang, et al.

Reinforcement learning-based recommender systems have recently gained popularity. However, designing the reward function, on which the agent relies to optimize its recommendation policy, is often not straightforward. Exploring the causality underlying users' behavior can take the place of the reward function in guiding the agent to capture users' dynamic interests. Moreover, because of the typical limitations of simulation environments (e.g., data inefficiency), most existing work cannot be broadly applied at large scale. Although some works attempt to convert an offline dataset into a simulator, data inefficiency makes the learning process even slower: reinforcement learning learns by interaction, and a single interaction does not yield enough data for training. Furthermore, unlike supervised learning methods, traditional reinforcement learning algorithms are not well suited to learning directly from offline datasets. In this paper, we propose a new model named the causal decision transformer for recommender systems (CDT4Rec). CDT4Rec is an offline reinforcement learning system that learns from a dataset rather than from online interaction. It employs the transformer architecture, which can process large offline datasets and capture both short-term and long-term dependencies within the data, to estimate the causal relationship between action, state, and reward. To demonstrate the feasibility and superiority of our model, we have conducted experiments on six real-world offline datasets and one online simulator.
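As a rough illustration of what learning from a logged dataset rather than from online interaction can look like, the sketch below turns one logged trajectory of (state, action, reward) steps into fixed-length context windows that a sequence model could consume. This is not the CDT4Rec implementation: the function name `context_windows`, the integer encodings of states and actions, and the window length `k` are all hypothetical choices made for this example.

```python
from typing import List, Tuple

# One logged step: (state_id, action_id, reward).
# The integer encodings are hypothetical placeholders for real features.
Step = Tuple[int, int, float]


def context_windows(trajectory: List[Step], k: int) -> List[Tuple[List[Step], int]]:
    """Pair each step's logged action with the (up to) k steps that precede it.

    No environment is queried: every training sample comes from the log.
    """
    samples = []
    for t in range(1, len(trajectory)):
        context = trajectory[max(0, t - k):t]  # short history before step t
        target_action = trajectory[t][1]       # action actually logged at step t
        samples.append((context, target_action))
    return samples


# A toy logged session with four steps (made-up values).
logged = [(0, 10, 1.0), (1, 11, 0.0), (2, 12, 1.0), (3, 13, 1.0)]
samples = context_windows(logged, k=2)
```

A larger `k` would expose longer-term dependencies to the model at the cost of longer input sequences; how CDT4Rec actually builds its inputs is described in the full paper, not here.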

research
08/22/2023

On the Opportunities and Challenges of Offline Reinforcement Learning for Recommender Systems

Reinforcement learning serves as a potent tool for modeling dynamic user...
research
06/15/2022

Rethinking Reinforcement Learning for Recommendation: A Prompt Perspective

Modern recommender systems aim to improve user experience. As reinforcem...
research
08/15/2017

Towards Learning Reward Functions from User Interactions

In the physical world, people have dynamic preferences, e.g., the same s...
research
10/29/2018

Deep Reinforcement Learning based Recommendation with Explicit User-Item Interactions Modeling

Recommendation is crucial in both academia and industry, and various tec...
research
07/26/2023

Integrating Offline Reinforcement Learning with Transformers for Sequential Recommendation

We consider the problem of sequential recommendation, where the current ...
research
11/10/2019

Model-Based Reinforcement Learning with Adversarial Training for Online Recommendation

Reinforcement learning is effective in optimizing policies for recommend...
research
12/22/2022

Local Policy Improvement for Recommender Systems

Recommender systems aim to answer the following question: given the item...
