Adaptive Operator Selection Based on Dynamic Thompson Sampling for MOEA/D

04/22/2020
by Lei Sun, et al.

In evolutionary computation, different reproduction operators have different search dynamics. To strike a good balance between exploration and exploitation, it is attractive to have an adaptive operator selection (AOS) mechanism that automatically chooses the most appropriate operator on the fly according to the current search status. This paper proposes a new AOS mechanism for the multi-objective evolutionary algorithm based on decomposition (MOEA/D). More specifically, the AOS is formulated as a multi-armed bandit problem, where dynamic Thompson sampling (DYTS) adapts the bandit learning model, originally proposed under the assumption of a fixed reward distribution, to a non-stationary setup. In particular, each arm of our bandit learning model represents a reproduction operator and is assigned a prior reward distribution. The parameters of these reward distributions are progressively updated according to the performance of each operator collected during the evolutionary process. When generating an offspring, an operator is chosen by sampling from these reward distributions according to DYTS. Experimental results demonstrate the effectiveness and competitiveness of the proposed AOS mechanism compared with four other state-of-the-art MOEA/D variants.
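To make the mechanism concrete, below is a minimal Python sketch of DYTS-based operator selection under common assumptions: each reproduction operator (arm) keeps a Beta(alpha, beta) reward model, selection samples from each posterior and picks the largest draw, and once the evidence exceeds a threshold C the update is rescaled so that alpha + beta stays at C, discounting old observations to track a non-stationary reward. The class name, the cap value, and the binary success reward are illustrative choices, not the paper's exact implementation.

import random

class DYTSOperatorSelector:
    """Sketch of adaptive operator selection via dynamic Thompson
    sampling (DYTS). Each arm is a reproduction operator with a
    Beta(alpha, beta) reward distribution."""

    def __init__(self, operators, cap=100.0):
        self.operators = operators                   # e.g. a list of DE mutation strategies
        self.cap = cap                               # DYTS threshold C (assumed value)
        self.alpha = {op: 1.0 for op in operators}   # Beta prior parameters
        self.beta = {op: 1.0 for op in operators}

    def select(self):
        # Thompson sampling: draw one sample from each arm's Beta
        # posterior and choose the operator with the largest sample.
        samples = {op: random.betavariate(self.alpha[op], self.beta[op])
                   for op in self.operators}
        return max(samples, key=samples.get)

    def update(self, op, reward):
        # reward in [0, 1], e.g. 1 if the offspring replaced a solution
        # in the population (an assumed reward definition).
        a, b = self.alpha[op], self.beta[op]
        if a + b < self.cap:
            # Standard Bayesian update while the evidence is small.
            self.alpha[op] = a + reward
            self.beta[op] = b + (1.0 - reward)
        else:
            # DYTS step: rescale so alpha + beta stays at the cap,
            # exponentially forgetting old rewards so the model can
            # follow a non-stationary reward distribution.
            k = self.cap / (self.cap + 1.0)
            self.alpha[op] = (a + reward) * k
            self.beta[op] = (b + (1.0 - reward)) * k

Inside a MOEA/D generation loop, one would call select() before reproduction and then update() with a reward derived from whether the offspring improved its subproblem, a common reward choice in AOS studies.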


Related Research

05/30/2021
Kolmogorov-Smirnov Test-Based Actively-Adaptive Thompson Sampling for Non-Stationary Bandits
We consider the non-stationary multi-armed bandit (MAB) framework and pr...

02/22/2019
Multi-Armed Bandit Strategies for Non-Stationary Reward Distributions and Delayed Feedback Processes
A survey is performed of various Multi-Armed Bandit (MAB) strategies in ...

09/05/2014
Simulating Non Stationary Operators in Search Algorithms
In this paper, we propose a model for simulating search operators whose ...

12/08/2017
On Adaptive Estimation for Dynamic Bernoulli Bandits
The multi-armed bandit (MAB) problem is a classic example of the explora...

07/24/2012
VOI-aware MCTS
UCT, a state-of-the-art algorithm for Monte Carlo tree search (MCTS) in ...

01/31/2017
Learning the distribution with largest mean: two bandit frameworks
Over the past few years, the multi-armed bandit model has become increas...

10/24/2020
Adam with Bandit Sampling for Deep Learning
Adam is a widely used optimization method for training deep learning mod...
