"Does generative AI infringe copyright?" is an urgent question. It is al...
Neural language models are increasingly deployed into APIs and websites ...
We introduce MADLAD-400, a manually audited, general domain 3T token monolingual...
Large language models are now tuned to align with the goals of their creators...
Pretraining is the preliminary and fundamental step in developing capable...
Model distillation is frequently proposed as a technique to reduce the privacy...
Studying data memorization in neural language models helps us understand...
Large language models have been shown to achieve remarkable performance...
Large language models (LMs) have been shown to memorize parts of their training data...
Natural language reflects our private lives and identities, making its privacy...
Modern neural language models widely used in tasks across NLP risk memorizing...
We find that existing language modeling datasets contain many near-duplicate...
It has become common to publish large (billion parameter) language models...
Neural networks have recently achieved human-level performance on various...
Transfer learning, where a model is first pre-trained on a data-rich task...