Incentive-Aware Recommender Systems in Two-Sided Markets

by Xiaowu Dai, et al.
berkeley college

Online platforms in the Internet economy commonly incorporate recommender systems that recommend arms (e.g., products) to agents (e.g., users). On such platforms, a myopic agent has a natural incentive to exploit, choosing the best product given the current information, rather than to explore alternatives and gather information that would benefit other agents. We propose a novel recommender system that respects agents' incentives and achieves asymptotically optimal performance, as measured by regret in repeated games. We model this incentive-aware recommender system as a multi-agent bandit problem in a two-sided market, subject to an incentive constraint induced by agents' opportunity costs. If the opportunity costs are known to the principal, we show that there exists an incentive-compatible recommendation policy that pools recommendations across a genuinely good arm and an unknown arm via a randomized and adaptive approach. If the opportunity costs are unknown to the principal, we propose a policy that randomly pools recommendations across all arms and uses each arm's cumulative loss as feedback for exploration. We show that both policies also satisfy an ex-post fairness criterion, which protects agents from over-exploitation.
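
To make the unknown-cost case more concrete, here is a minimal sketch of a policy that randomizes over all arms and uses each arm's cumulative loss as feedback. The abstract does not state the paper's update rule, so this swaps in the standard Exp3 (exponential-weights) update as a stand-in. It does not model the incentive constraint or the fairness criterion. The names `pooled_recommendation`, `bernoulli_loss`, and `eta`, and the Bernoulli loss model, are all assumptions made for illustration.

```python
import math
import random


def bernoulli_loss(means):
    """Hypothetical loss model: arm i incurs loss 1 with probability means[i]."""
    def loss_fn(arm, rng):
        return 1.0 if rng.random() < means[arm] else 0.0
    return loss_fn


def pooled_recommendation(num_arms, loss_fn, rounds, eta=0.05, seed=0):
    """Exp3-style stand-in (an assumption, not the paper's policy).

    Each round, one arm is sampled from a distribution that down-weights
    arms with a large cumulative estimated loss.
    """
    rng = random.Random(seed)
    cum_loss = [0.0] * num_arms  # importance-weighted cumulative loss per arm
    pulls = [0] * num_arms       # how often each arm was recommended
    for _ in range(rounds):
        # Shift by the minimum so the largest weight is exp(0) = 1.
        base = min(cum_loss)
        weights = [math.exp(-eta * (l - base)) for l in cum_loss]
        total = sum(weights)
        probs = [w / total for w in weights]
        arm = rng.choices(range(num_arms), weights=probs)[0]
        # Only the recommended arm's loss is observed (bandit feedback).
        loss = loss_fn(arm, rng)
        # Dividing by the sampling probability keeps the estimate unbiased.
        cum_loss[arm] += loss / probs[arm]
        pulls[arm] += 1
    return pulls, cum_loss


if __name__ == "__main__":
    pulls, cum_loss = pooled_recommendation(
        num_arms=3, loss_fn=bernoulli_loss([0.8, 0.5, 0.2]), rounds=5000
    )
    print("pulls per arm:", pulls)
```

In this sketch, arms with a lower mean loss should receive most recommendations over time, while every arm keeps a nonzero chance of being explored as long as its weight does not underflow.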



Repeated Principal-Agent Games with Unobserved Agent Rewards and Perfect-Knowledge Agents

Motivated by a number of real-world applications from domains like healt...

Regret, stability, and fairness in matching markets with bandit learners

We consider the two-sided matching market with bandit learners. In the s...

Robust Multi-Agent Multi-Armed Bandits

There has been recent interest in collaborative multi-agent bandits, whe...

Who Pays? Personalization, Bossiness and the Cost of Fairness

Fairness-aware recommender systems that have a provider-side fairness co...

Incentivized Bandit Learning with Self-Reinforcing User Preferences

In this paper, we investigate a new multi-armed bandit (MAB) online lear...

Cost Sharing in Two-Sided Markets

Motivated by the emergence of popular service-based two-sided markets wh...

Recommender system as an exploration coordinator: a bounded O(1) regret algorithm for large platforms

On typical modern platforms, users are only able to try a small fraction...
