A Multi-Armed Bandit-based Approach to Mobile Network Provider Selection

by Thomas Sandholm, et al.

We argue for giving users the ability to lease bandwidth temporarily from any mobile network operator. We propose, prototype, and evaluate a spectrum market for mobile network access in which multiple network operators offer blocks of bandwidth at specified prices for short-term leases, and autonomous agents on user devices make purchase decisions by trading off price, performance, and budget constraints. We show that provider selection can be formulated as a bandit problem. For the case where providers change prices synchronously, we approach the problem through contextual multi-armed bandits and reinforcement learning methods such as Q-learning, applied either directly to the bandit maximization problem or indirectly to approximate the Gittins indices that are known to yield the optimal provider selection policy. In a simulated scenario corresponding to a practical use case, our agent shows a 20-41% QoE improvement over random provider selection under various demand, price, and mobility conditions. We implemented a prototype spectrum market using LTE networks and eSIM technology and deployed it on a testbed, using a blockchain to implement the ledger where bandwidth purchase transactions are recorded. Experiments showed that we can learn both user behavior and network performance efficiently, and recorded 25-74% improvements in QoE under various competing agent scenarios.
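To make the bandit formulation concrete, the following is a minimal sketch of provider selection as a plain (non-contextual) multi-armed bandit with an epsilon-greedy policy. It is not the paper's method, which uses contextual bandits, Q-learning, and Gittins index approximations; it only illustrates the explore/exploit loop those methods build on. The class name `EpsilonGreedySelector`, the `simulate` helper, the Gaussian reward model, and the example mean rewards are all hypothetical, with the reward standing in for a QoE-per-price utility.

```python
import random


class EpsilonGreedySelector:
    """Illustrative epsilon-greedy agent; each arm is one network provider."""

    def __init__(self, n_providers, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.counts = [0] * n_providers    # times each provider was chosen
        self.values = [0.0] * n_providers  # running mean reward per provider
        self.rng = random.Random(seed)

    def select(self):
        # Explore a random provider with probability epsilon,
        # otherwise exploit the provider with the best estimate so far.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))
        return max(range(len(self.values)), key=self.values.__getitem__)

    def update(self, provider, reward):
        # Incremental update of the running mean for the chosen provider.
        self.counts[provider] += 1
        n = self.counts[provider]
        self.values[provider] += (reward - self.values[provider]) / n


def simulate(mean_rewards, steps=5000, epsilon=0.1, seed=0):
    """Run the agent against providers with assumed Gaussian rewards."""
    env_rng = random.Random(seed + 1)
    agent = EpsilonGreedySelector(len(mean_rewards), epsilon, seed)
    total = 0.0
    for _ in range(steps):
        provider = agent.select()
        # Hypothetical reward: noisy utility (e.g. QoE per unit price).
        reward = env_rng.gauss(mean_rewards[provider], 0.1)
        agent.update(provider, reward)
        total += reward
    return agent, total / steps


if __name__ == "__main__":
    agent, avg = simulate([0.3, 0.5, 0.8])
    print("estimated utilities:", [round(v, 2) for v in agent.values])
    print("average reward:", round(avg, 3))
```

A random selector over the same three hypothetical providers would average the mean of their utilities, so the gap between that figure and the agent's average reward is the kind of improvement over random selection that the abstract reports, though the numbers here are purely illustrative.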




Practical Calculation of Gittins Indices for Multi-armed Bandits

Gittins indices provide an optimal solution to the classical multi-armed...

Shrewd Selection Speeds Surfing: Use Smart EXP3!

In this paper, we explore the use of multi-armed bandit online learning ...

Partial Bandit and Semi-Bandit: Making the Most Out of Scarce Users' Feedback

Recent works on Multi-Armed Bandits (MAB) and Combinatorial Multi-Armed ...

Top-K Ranking Deep Contextual Bandits for Information Selection Systems

In today's technology environment, information is abundant, dynamic, and...

Improving Fairness in Adaptive Social Exergames via Shapley Bandits

Algorithmic fairness is an essential requirement as AI becomes integrate...

Competing Bandits: The Perils of Exploration under Competition

We empirically study the interplay between exploration and competition. ...

Profit and Strategic Analysis for MNO-MVNO Partnership

We consider a mobile market driven by two Mobile Network Operators (MNOs...
