The COVID-19 pandemic precipitated an abrupt shift in the educational
la...
Fighting misinformation is a challenging, yet crucial, task. Despite the...
Recommender systems are a ubiquitous feature of online platforms.
Increa...
In this paper, the adoption patterns of Generative Artificial Intelligen...
Inventory management offers unique opportunities for reliably evaluating...
In nonstationary bandit learning problems, the decision-maker must
conti...
We study the problem of optimizing a recommender system for outcomes tha...
Given a dataset on actions and resulting long-term rewards, a direct
est...
Team diversity can be seen as a double-edged sword. It brings additional...
Recruiting participants for software engineering research has been a pri...
We explore a new model of bandit experiments where a potentially
nonstat...
There is considerable anecdotal evidence suggesting that software engine...
COVID-19 has likely been the most disruptive event at a global scale the...
Following the onset of the COVID-19 pandemic and subsequent lockdowns,
s...
The number of companies opting for remote working has been increasing ov...
Scrum teams are the most important drivers to lead an Agile project to i...
We consider a discounted infinite horizon optimal stopping problem. If t...
Following the onset of the COVID-19 pandemic and subsequent lockdowns,
s...
Empirical Standards are natural-language models of a scientific communit...
The COVID-19 pandemic has forced governments worldwide to impose movemen...
Folklore suggests that policy gradient can be more robust to misspecific...
We revisit the finite time analysis of policy gradient methods in the
si...
Quality, architecture, and process are considered the keystones of softw...
This paper studies a recent proposal to use randomized value functions t...
Policy gradient methods are perhaps the most widely used class of
reinf...
This note gives a short, self-contained proof of a sharp connection bet...
Temporal difference learning (TD) is a simple iterative algorithm used t...
Much of the recent literature on bandit learning focuses on algorithms t...
The expected improvement (EI) algorithm is a popular strategy for inform...
We study the use of randomized value functions to guide deep exploration...
Modern data is messy and high-dimensional, and it is often not clear a p...
Most provably-efficient learning algorithms introduce optimism about
poo...