Shaping representations through communication: community size effect in artificial learning systems

by Olivier Tieleman et al.

Motivated by theories of language and communication that explain why communities with large numbers of speakers have, on average, simpler languages with more regularity, we cast the representation learning problem in terms of learning to communicate. Our starting point sees the traditional autoencoder setup as a single encoder with a fixed decoder partner that must learn to communicate. Generalizing from there, we introduce community-based autoencoders in which multiple encoders and decoders collectively learn representations by being randomly paired up on successive training iterations. We find that increasing community sizes reduce idiosyncrasies in the learned codes, resulting in representations that better encode concept categories and correlate with human feature norms.
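The training scheme described above — a community of encoders and decoders, with a random encoder/decoder pair drawn on each iteration and trained on reconstruction loss — can be sketched minimally. The snippet below is an illustrative toy, not the paper's implementation: it assumes linear encoders/decoders, a random Gaussian dataset, and arbitrary sizes and hyperparameters, whereas the paper uses neural networks on perceptual data.

```python
import numpy as np

rng = np.random.default_rng(0)

D_IN, D_CODE = 8, 3    # input and code dimensions (illustrative)
N_ENC, N_DEC = 4, 4    # community size (illustrative)
LR, STEPS = 0.02, 3000

# Each agent is a single linear map, a stand-in for a neural network.
encoders = [rng.normal(0.0, 0.1, (D_CODE, D_IN)) for _ in range(N_ENC)]
decoders = [rng.normal(0.0, 0.1, (D_IN, D_CODE)) for _ in range(N_DEC)]
X = rng.normal(size=(256, D_IN))  # toy dataset


def mean_loss(encs, decs, data):
    """Average reconstruction loss over all encoder/decoder pairings."""
    total = 0.0
    for E in encs:
        for Dm in decs:
            err = data @ E.T @ Dm.T - data
            total += 0.5 * np.mean(np.sum(err ** 2, axis=1))
    return total / (len(encs) * len(decs))


loss_before = mean_loss(encoders, decoders, X)

for _ in range(STEPS):
    # Random pairing: each step, a random encoder meets a random decoder,
    # so no single pair can settle into a private, idiosyncratic code.
    E = encoders[rng.integers(N_ENC)]
    Dm = decoders[rng.integers(N_DEC)]
    x = X[rng.integers(len(X))]

    z = E @ x          # encode ("speak")
    err = Dm @ z - x   # reconstruction error ("listen" and compare)

    # SGD on 0.5 * ||err||^2; compute both gradients before updating.
    g_D = np.outer(err, z)
    g_E = np.outer(Dm.T @ err, x)
    Dm -= LR * g_D
    E -= LR * g_E

loss_after = mean_loss(encoders, decoders, X)
```

Because every encoder must be decodable by every decoder, the community converges on a shared code convention; the average cross-pair reconstruction loss drops over training even though pairings change at every step.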


