A Multi-Agent Reinforcement Learning Method for Impression Allocation in Online Display Advertising

by Di Wu, et al.

In online display advertising, guaranteed contracts and real-time bidding (RTB) are two major ways for a publisher to sell impressions. Despite the increasing popularity of RTB, half of online display advertising revenue is still generated from guaranteed contracts. Selling impressions through both guaranteed contracts and RTB simultaneously is therefore a natural choice for a publisher seeking to maximize its yield. However, deriving the optimal impression allocation strategy is not trivial, especially when the environment is unstable, as it is in real-world applications. In this paper, we formulate the impression allocation problem as an auction problem in which each contract can submit virtual bids for individual impressions. With this formulation, we derive the optimal impression allocation strategy by solving for the optimal bidding functions of the contracts. Since the bids from contracts are decided by the publisher, we propose a multi-agent reinforcement learning (MARL) approach to derive cooperative policies that maximize the publisher's yield in an unstable environment. The proposed approach also addresses common challenges in MARL such as input dimension explosion, reward credit assignment, and non-stationary environments. Experimental evaluations on large-scale real datasets demonstrate the effectiveness of our approach.
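
The virtual-bid formulation in the abstract can be pictured with a small Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the names `Contract` and `allocate`, the tracking of remaining demand, the tie-break in favor of contracts, and the rule that RTB revenue equals the highest external bid are all hypothetical choices made here for clarity. The virtual bids are fixed numbers in this sketch, whereas the paper derives them from learned bidding functions.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Contract:
    name: str
    demand: int          # impressions still owed to this contract (assumed bookkeeping)
    virtual_bid: float   # bid chosen by the publisher on the contract's behalf


def allocate(contracts: List[Contract], rtb_bid: float) -> Tuple[str, float]:
    """Allocate one impression by comparing virtual bids with the best RTB bid.

    Returns (winner, rtb_revenue). The winner is a contract name, or "RTB"
    when the external bid beats every eligible contract's virtual bid.
    """
    # Only contracts with unmet demand take part in the virtual auction.
    eligible = [c for c in contracts if c.demand > 0]
    best = max(eligible, key=lambda c: c.virtual_bid, default=None)

    # Assumption: ties go to the contract, and a contract win earns no RTB revenue.
    if best is not None and best.virtual_bid >= rtb_bid:
        best.demand -= 1
        return best.name, 0.0

    # Assumption: the publisher collects the highest external bid as revenue.
    return "RTB", rtb_bid
```

Raising a contract's virtual bid makes it win more impressions at the cost of forgone RTB revenue, which is the trade-off the publisher's bidding policy has to balance.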


Impression Allocation and Policy Search in Display Advertising

In online display advertising, guaranteed contracts and real-time biddin...

Combining guaranteed and spot markets in display advertising: selling guaranteed page views with stochastic demand

This paper proposes an optimal dynamic model for combining guaranteed an...

Real-Time Bidding with Multi-Agent Reinforcement Learning in Display Advertising

Real-time advertising allows advertisers to bid for each impression for ...

Budget Constrained Bidding by Model-free Reinforcement Learning in Display Advertising

Real-time bidding (RTB) is almost the most important mechanism in online...

LADDER: A Human-Level Bidding Agent for Large-Scale Real-Time Online Auctions

We present LADDER, the first deep reinforcement learning agent that can ...

Efficient Delivery Policy to Minimize User Traffic Consumption in Guaranteed Advertising

In this work, we study the guaranteed delivery model which is widely use...

Online Allocation and Display Ads Optimization with Surplus Supply

In this work, we study a scenario where a publisher seeks to maximize it...
