research ∙ 06/12/2023
NF4 Isn't Information Theoretically Optimal (and that's Good)
This note shares some simple calculations and experiments related to abs...
research ∙ 12/16/2021
Reconsidering the Past: Optimizing Hidden States in Language Models
We present Hidden-State Optimization (HSO), a gradient-based method for ...
research ∙ 08/16/2020