Large language models have an exceptional capability to incorporate new...
We examine how transformers cope with two challenges: learning basic int...
The standard methodology of evaluating large language models (LLMs) base...
Token embeddings, a mapping from discrete lexical symbols to continuous...
Premise selection is a fundamental problem of automated theorem proving....
Designing networks capable of attaining better performance with an incre...
Language models (LMs) are becoming the foundation for almost all major l...
The formalization of existing mathematical proofs is a notoriously diffi...
Prompted models have demonstrated impressive few-shot learning abilities...
The ability to extrapolate from short problem instances to longer ones i...
Language models have achieved remarkable performance on a wide range of ...
Pre-training produces representations that are effective for a wide rang...
Complex reasoning problems contain states that vary in the computational...
Autoformalization is the process of automatically translating from natur...
In theorem proving, the task of selecting useful premises from a large l...
Generating step-by-step "chain-of-thought" rationales improves language ...
Language models typically need to be trained or finetuned in order to ac...
We introduce the Block-Recurrent Transformer, which applies a transforme...
Transformer models yield impressive results on many NLP and sequence mod...
Our ability to know when to trust the decisions made by machine learning...
Humans excel in solving complex reasoning tasks through a mental process...
Due to spurious correlations, machine learning systems often fail to generalize...
Labeled data for imitation learning of theorem proving in large librarie...
While designing inductive bias in neural architectures has been widely s...
In this work, we focus on an analogical reasoning task that contains ric...
Propositional model counting or #SAT is the problem of computing the num...
In learning-assisted theorem proving, one of the most critical challenge...
Mathematical proofs can be mechanised using proof assistants to eliminat...
We propose a novel hierarchical agent architecture for multi-agent reinforcement...
State-of-the-art meta reinforcement learning algorithms typically assume...
Sparse reward is one of the most challenging problems in reinforcement learning...
Careful tuning of the learning rate, or even schedules thereof, can be c...
We consider the problem of exploration in meta reinforcement learning. T...
In this technical report, we consider an approach that combines the PPO...
In this work, we propose to apply trust region optimization to deep reinforcement...
We propose a simple and general variant of the standard reparameterized...
The past several years have seen remarkable progress in generative model...
We investigate the parameter-space geometry of recurrent neural networks...
In this paper, we systematically analyze the connecting architectures of...
We introduce a weight update formula that is expressed only in terms of...