We initiate the study of how to perturb the reward in a zero-sum Markov ...
In this paper, we develop an approximation scheme for solving bilevel
pr...
Multi-agent path finding (MAPF) is the problem of moving agents to the g...
With the attention mechanism, transformers achieve significant empirical...
The cooperative Multi-Agent Reinforcement Learning (MARL) with permuta...
A Stackelberg congestion game (SCG) is a bilevel program in which a lead...
Robots are integrating more large-scale models to enrich functions and imp...
The Robot Operating System (ROS) has brought excellent potential for
aut...
Recent years have witnessed a rapid growth of applying deep spatiotempor...
To regulate a social system comprised of self-interested agents, economi...
Federated learning plays an important role in the process of smart citie...
A technological revolution is occurring in the field of robotics with th...
In recent years, individuals, business organizations or the country have...
With the spread and development of new epidemics, it is of great referen...
The agricultural irrigation system is closely related to agricultural
pr...
Humans are capable of learning a new behavior by observing others to per...
Humans are capable of learning a new behavior by observing others perfor...
Traffic flow forecasting is a research hotspot in intelligent traffic sys...
Proximal policy optimization and trust region policy optimization (PPO a...
The Pyralidae pests, such as corn borer and rice leaf roller, are main p...
This paper was motivated by the problem of how to make robots fuse and
t...
When learning from a batch of logged bandit feedback, the discrepancy be...