Deep reinforcement learning applied to an assembly sequence planning problem with user preferences

by Miguel Neves, et al.

Deep reinforcement learning (DRL) has demonstrated its potential for solving complex manufacturing decision-making problems, especially in contexts where the system learns over time from actual operation, in the absence of training data. One interesting and challenging application for such methods is the assembly sequence planning (ASP) problem. In this paper, we propose an approach to implementing DRL methods in ASP. The proposed approach introduces parametric actions into the RL environment to improve training time and sample efficiency, and uses two different reward signals: (1) the user's preferences and (2) the total assembly time. The user-preference signal addresses the difficulties and non-ergonomic properties of the assembly faced by the human, and the total-assembly-time signal enforces the optimization of the assembly. Three of the most powerful deep RL methods were studied, Advantage Actor-Critic (A2C), Deep Q-Learning (DQN), and Rainbow, in two different scenarios: a stochastic one and a deterministic one. Finally, the performance of the DRL algorithms was compared to that of tabular Q-Learning. After 10,000 episodes, the system achieved near-optimal behaviour with tabular Q-Learning, A2C, and Rainbow. For more complex scenarios, however, tabular Q-Learning is expected to underperform the other two algorithms. The results support the potential of deep reinforcement learning for assembly sequence planning problems with human interaction.
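As an illustration only, the two ideas named in the abstract (restricting the agent to valid assembly actions, and combining a preference signal with a time signal) might be sketched as follows. The abstract does not specify how either is implemented, so everything here is an assumption: the four-part precedence table, the function names `valid_action_mask`, `masked_greedy_action`, and `combined_reward`, and the weighted-sum form with weights `w_pref` and `w_time` are all hypothetical.

```python
# Hypothetical 4-part product: PRECEDENCE[p] lists the parts that must
# already be placed before part p can be assembled (assumed, not from the paper).
PRECEDENCE = {0: [], 1: [0], 2: [0], 3: [1, 2]}


def valid_action_mask(placed):
    """Return one boolean per part: True if the part may be assembled next."""
    return [
        part not in placed and all(r in placed for r in required)
        for part, required in PRECEDENCE.items()
    ]


def masked_greedy_action(q_values, mask):
    """Pick the highest-valued action among the valid ones only."""
    candidates = [i for i, ok in enumerate(mask) if ok]
    if not candidates:
        raise ValueError("no valid action in this state")
    return max(candidates, key=lambda i: q_values[i])


def combined_reward(preference_score, step_duration, w_pref=1.0, w_time=0.1):
    """Assumed weighted sum: reward preferred steps, penalise elapsed time."""
    return w_pref * preference_score - w_time * step_duration


# Part 0 is already placed, so only parts 1 and 2 are valid next steps,
# even though the (made-up) value estimates favour parts 0 and 3.
placed = {0}
mask = valid_action_mask(placed)
q_values = [5.0, 0.2, 0.7, 9.0]
action = masked_greedy_action(q_values, mask)
```

Masking invalid actions before the greedy choice keeps the agent from spending samples on infeasible sequences, which is one plausible reading of how parametric actions could improve sample efficiency.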




