New Reinforcement Learning Using a Chaotic Neural Network for Emergence of "Thinking" - "Exploration" Grows into "Thinking" through Learning -

by   Katsunari Shibata, et al.

Expectation for the emergence of higher functions is getting larger in the framework of end-to-end reinforcement learning using a recurrent neural network. However, the emergence of "thinking" that is a typical higher function is difficult to realize because "thinking" needs non fixed-point, flow-type attractors with both convergence and transition dynamics. Furthermore, in order to introduce "inspiration" or "discovery" in "thinking", not completely random but unexpected transition should be also required. By analogy to "chaotic itinerancy", we have hypothesized that "exploration" grows into "thinking" through learning by forming flow-type attractors on chaotic random-like dynamics. It is expected that if rational dynamics are learned in a chaotic neural network (ChNN), coexistence of rational state transition, inspiration-like state transition and also random-like exploration for unknown situation can be realized. Based on the above idea, we have proposed new reinforcement learning using a ChNN as an actor. The positioning of exploration is completely different from the conventional one. The chaotic dynamics inside the ChNN produces exploration factors by itself. Since external random numbers for stochastic action selection are not used, exploration factors cannot be isolated from the output. Therefore, the learning method is also completely different from the conventional one. At each non-feedback connection, one variable named causality trace takes in and maintains the input through the connection according to the change in its output. Using the trace and TD error, the weight is updated. In this paper, as the result of a recent simple task to see whether the new learning works or not, it is shown that a robot with two wheels and two visual sensors reaches a target while avoiding an obstacle after learning though there are still many rooms for improvement.


page 2

page 3

page 4

page 5


Episodic Exploration for Deep Deterministic Policies: An Application to StarCraft Micromanagement Tasks

We consider scenarios from the real-time strategy game StarCraft as new ...

Exploration by Random Network Distillation

We introduce an exploration bonus for deep reinforcement learning method...

Safe Exploration of State and Action Spaces in Reinforcement Learning

In this paper, we consider the important problem of safe exploration in ...

Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models

Achieving efficient and scalable exploration in complex domains poses a ...

Continual Model-Based Reinforcement Learning with Hypernetworks

Effective planning in model-based reinforcement learning (MBRL) and mode...

Temporal Logic Guided Safe Reinforcement Learning Using Control Barrier Functions

Using reinforcement learning to learn control policies is a challenge wh...

Learning Robot Exploration Strategy with 4D Point-Clouds-like Information as Observations

Being able to explore unknown environments is a requirement for fully au...

Please sign up or login with your details

Forgot password? Click here to reset