Satisfiability Modulo Counting (SMC) encompasses problems that require b...
Symbolic regression, as one of the most crucial tasks in AI for science,...
Reinforcement Learning (RL) methods are typically sample-inefficient, ma...
Theoretical guarantees in reinforcement learning (RL) are known to suffe...
Accurately annotated ultrasonic images are vital components of a high-qu...
Software specifications are essential for ensuring the reliability of so...
Decompilation aims to recover the source code form of a binary executabl...
Security vulnerability repair is a difficult task that is in dire need o...
Learning symbolic expressions directly from experiment data is a vital s...
In recent years, large language models (LMs) have achieved remarkable pr...
Explanation is a key component for the adoption of reinforcement learnin...
Ultra-dense networks are widely regarded as a promising solution to expl...
We propose a novel model-based offline Reinforcement Learning (RL) frame...
Automated program repair (APR) aims to help developers improve software ...
We study offline multi-agent reinforcement learning (RL) in Markov games...
MDPs with low-rank transitions – that is, the transition matrix can be f...
Automated Program Repair (APR) improves software reliability by generati...
Type-based multiple access (TBMA) is a semantics-aware multiple access p...
Time-critical control applications typically pose stringent connectivity...
Generative models for learning combinatorial structures have transformat...
We propose a new model-based offline RL framework, called Adversarial Mo...
Off-policy evaluation often refers to two related tasks: estimating the ...
Coverage conditions – which assert that the data logging distribution ad...
We propose two unconditionally stable, linear ensemble algorithms with p...
We consider text retrieval within dense representational space in real-w...
We study off-policy evaluation (OPE) for partially observable MDPs (POMD...
The current paper studies sample-efficient Reinforcement Learning (RL) i...
We study reward-free reinforcement learning (RL) under general non-linea...
Consider the problem setting of Interaction-Grounded Learning (IGL), in ...
We propose a new learning framework that captures the tiered structure o...
We consider a challenging theoretical problem in offline reinforcement l...
We present a GAN Transformer framework for general action-conditioned 3D...
Deployment efficiency is an important criterion for many real-world appl...
Sample-efficiency guarantees for offline reinforcement learning (RL) oft...
We propose Adversarially Trained Actor Critic (ATAC), a new model-free a...
We consider off-policy evaluation (OPE) in Partially Observable Markov D...
How to select between policies and value functions produced by different...
Many popular machine learning techniques in natural language processing ...
We consider off-policy evaluation (OPE) in Partially Observable Markov D...
Various legacy and emerging industrial control applications create the r...
Hydrodynamics coupled phase field models have intricate difficulties to ...
Recent applications in large-scale wireless mesh networks (WSN), e.g., A...
Unsupervised person re-identification (re-ID) remains a challenging task...
The use of pessimism, when reasoning about datasets lacking exhaustive e...
Recent theoretical work studies sample-efficient reinforcement learning ...
In this paper, we study the convergence properties of off-policy policy ...
The IEEE 802.1 time-sensitive networking (TSN) standards aim at improvin...
We present a second order ensemble method based on a blended three-step ...
We present a novel off-policy loss function for learning a transition mo...
Automatic program repair (APR) is crucial to improve software reliabilit...