Data Compression and Inference in Cosmology with Self-Supervised Machine Learning

08/18/2023
by Aizhan Akhmetzhanova et al.

The influx of massive amounts of data from current and upcoming cosmological surveys necessitates compression schemes that can efficiently summarize the data with minimal loss of information. We introduce a method that leverages the paradigm of self-supervised machine learning in a novel manner to construct representative summaries of massive datasets using simulation-based augmentations. Deploying the method on hydrodynamical cosmological simulations, we show that it can deliver highly informative summaries, which can be used for a variety of downstream tasks, including precise and accurate parameter inference. We demonstrate how this paradigm can be used to construct summary representations that are insensitive to prescribed systematic effects, such as the influence of baryonic physics. Our results indicate that self-supervised machine learning techniques offer a promising new approach for the compression of cosmological data as well as its analysis.
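To make the idea concrete, the sketch below illustrates one way such a scheme could be wired up: two simulated views that share the same underlying cosmology but differ in a prescribed systematic (e.g., a dark-matter-only map and its matched hydrodynamical counterpart) are compressed by a shared encoder, and a self-supervised objective pulls the two summaries together, encouraging insensitivity to that systematic. This is a minimal, hypothetical illustration using a SimCLR-style contrastive (NT-Xent) loss in PyTorch; the encoder architecture, loss choice, dimensions, and hyperparameters are assumptions for exposition, not the paper's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Small CNN that compresses a 2D field (e.g., a projected matter
    density map) into a low-dimensional summary vector."""
    def __init__(self, summary_dim: int = 16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, summary_dim),
        )

    def forward(self, x):
        return self.net(x)

def nt_xent(z1, z2, temperature: float = 0.1):
    """SimCLR-style contrastive loss: the two views of the same simulation
    form a positive pair; every other pairing in the batch is a negative."""
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # (2B, D) unit vectors
    sim = z @ z.t() / temperature                        # cosine-similarity logits
    n = z1.shape[0]
    # Exclude self-similarity from the softmax denominator.
    sim = sim.masked_fill(torch.eye(2 * n, dtype=torch.bool), float("-inf"))
    # Row i's positive is its augmented twin at index i +/- n.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(n)])
    return F.cross_entropy(sim, targets)

# Toy training step on random stand-ins for matched simulation pairs.
encoder = Encoder()
opt = torch.optim.Adam(encoder.parameters(), lr=1e-3)
view1 = torch.randn(8, 1, 64, 64)  # e.g., dark-matter-only realization
view2 = torch.randn(8, 1, 64, 64)  # e.g., matched hydrodynamical realization
loss = nt_xent(encoder(view1), encoder(view2))
loss.backward()
opt.step()
```

Once trained, the encoder could be frozen and its summaries used for downstream tasks such as parameter inference, for instance by fitting a regressor or a simulation-based-inference density estimator on (summary, parameter) pairs.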
