User Tampering in Reinforcement Learning Recommender Systems

09/09/2021
by Charles Evans, et al.

This paper provides the first formalisation and empirical demonstration of a particular safety concern in reinforcement learning (RL)-based news and social media recommendation algorithms. We call this concern "user tampering": a phenomenon whereby an RL-based recommender system may manipulate a media user's opinions, preferences and beliefs via its recommendations as part of a policy to increase long-term user engagement. We provide a simulation study of a media recommendation problem constrained to the recommendation of political content, and demonstrate that a Q-learning algorithm consistently learns to exploit its opportunities to 'polarise' simulated 'users' with its early recommendations in order to have more consistent success with later recommendations catering to that polarisation. Finally, we argue that, given our findings, designing an RL-based recommender system which cannot learn to exploit user tampering requires making the metric for the recommender's success independent of observable signals of user engagement, and thus that a media recommendation system built solely with RL is necessarily either unsafe or almost certainly commercially unviable.
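
The mechanism described in the abstract can be illustrated with a minimal sketch. This is not the paper's simulation: the five-point opinion scale, the click probabilities, the deterministic opinion shift and all hyperparameters below are assumptions chosen only to make the effect visible. Under these assumptions, neutral content earns more clicks from a neutral user than partisan content does, yet tabular Q-learning rewarded purely on clicks may still prefer to show partisan content first, because a polarised user clicks more reliably later.

```python
import random

# Hypothetical toy environment, NOT the environment used in the paper.
OPINIONS = [-2, -1, 0, 1, 2]        # simulated user's leaning; 0 is neutral
LEFT, NEUTRAL, RIGHT = 0, 1, 2      # content the recommender can show
DIRECTION = {LEFT: -1, NEUTRAL: 0, RIGHT: +1}


def click_probability(opinion, action):
    """Assumed engagement model: partisan content is clicked more often
    the more strongly it agrees with the user's current opinion."""
    if action == NEUTRAL:
        return 0.3
    alignment = DIRECTION[action] * opinion  # ranges over -2..2
    return min(1.0, max(0.0, 0.2 + 0.35 * alignment))


def step(opinion, action, rng):
    """Sample a click (reward 1 or 0), then shift the opinion one step
    towards the content shown (an assumed, deterministic drift)."""
    reward = 1.0 if rng.random() < click_probability(opinion, action) else 0.0
    next_opinion = max(-2, min(2, opinion + DIRECTION[action]))
    return next_opinion, reward


def train(steps=300_000, episode_len=25, alpha=0.01, gamma=0.8,
          epsilon=0.3, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = {(o, a): 0.0 for o in OPINIONS for a in DIRECTION}
    opinion = 0
    for t in range(steps):
        if t % episode_len == 0:
            opinion = 0  # a new, initially neutral simulated user
        if rng.random() < epsilon:
            action = rng.choice([LEFT, NEUTRAL, RIGHT])
        else:
            action = max(DIRECTION, key=lambda a: q[(opinion, a)])
        next_opinion, reward = step(opinion, action, rng)
        best_next = max(q[(next_opinion, a)] for a in DIRECTION)
        target = reward + gamma * best_next
        q[(opinion, action)] += alpha * (target - q[(opinion, action)])
        opinion = next_opinion
    return q


q = train()
# Greedy policy: the highest-valued action for each opinion state.
policy = {o: max(DIRECTION, key=lambda a: q[(o, a)]) for o in OPINIONS}
names = {LEFT: "left", NEUTRAL: "neutral", RIGHT: "right"}
for o in OPINIONS:
    print(f"opinion {o:+d}: recommend {names[policy[o]]}")
```

In this sketch the reward is the observed click itself, which is the kind of engagement signal the abstract argues makes tampering learnable. Replacing it with a signal that does not depend on the user's shifted opinion would remove the incentive in the toy model, at the cost of no longer optimising engagement.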

research
03/20/2022

Explicit User Manipulation in Reinforcement Learning Based Recommender Systems

Recommender systems are highly prevalent in the modern world due to thei...
research
05/04/2020

Reward Constrained Interactive Recommendation with Natural Language Feedback

Text-based interactive recommendation provides richer user feedback and ...
research
12/13/2018

Comparison of Recommender Systems in an Ed-Tech Application

Smile and Learn is an Ed-Tech company that runs a smart library with mor...
research
09/12/2018

The closed loop between opinion formation and personalised recommendations

In social media, recommender systems are responsible for directing the u...
research
05/26/2022

Constrained Reinforcement Learning for Short Video Recommendation

The wide popularity of short videos on social media poses new opportunit...
research
09/10/2023

Representation Learning in Low-rank Slate-based Recommender Systems

Reinforcement learning (RL) in recommendation systems offers the potenti...
research
11/01/2022

Using coevolution and substitution of the fittest for health and well-being recommender systems

This research explores substitution of the fittest (SF), a technique des...
