On Inductive Biases for Machine Learning in Data Constrained Settings

by Grégoire Mialon et al.

Learning with limited data is one of the biggest problems of machine learning. The currently dominant approach to this issue consists in learning general representations from huge amounts of data before fine-tuning the model on a small dataset of interest. While this technique, known as transfer learning, is very effective in domains such as computer vision or natural language processing, it does not yet solve common problems of deep learning such as model interpretability or the overall need for data. This thesis explores a different answer to the problem of learning expressive models in data-constrained settings: instead of relying on big datasets to learn neural networks, we replace some of their modules with known functions reflecting the structure of the data. Very often, these functions are drawn from the rich literature on kernel methods: many kernels can encode the underlying structure of the data, sparing parameters that would otherwise have to be learned. Our approach falls under the umbrella of "inductive biases", which can be defined as hypotheses about the data at hand that restrict the space of models explored during learning. We demonstrate the effectiveness of this approach in the context of sequences, such as sentences in natural language or protein sequences, and of graphs, such as molecules. We also highlight the relationship between our work and recent advances in deep learning.

Additionally, we study convex machine learning models. Here, rather than proposing new models, we ask what proportion of the samples in a dataset is really needed to learn a "good" model. More precisely, we study the problem of safe sample screening, i.e., running simple tests to discard uninformative samples from a dataset before even fitting a machine learning model, without affecting the optimal model. Such techniques can be used to prune datasets or to mine for rare samples.
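The intuition behind sample screening can be illustrated with a toy sketch. The thesis's actual screening rules run *before* training, via rigorous tests; the code below (all names and the dataset are illustrative, not from the thesis) only demonstrates why discarding is safe for an L2-regularized linear SVM: samples lying strictly outside the margin contribute zero loss and zero subgradient at the optimum, so removing them leaves the unique minimizer of this strongly convex objective unchanged.

```python
def svm_subgradient_fit(X, y, lam=0.1, iters=5000):
    """Full-batch subgradient descent on
    (lam/2)*||w||^2 + sum_i max(0, 1 - y_i <w, x_i>)."""
    d = len(X[0])
    w = [0.0] * d
    for t in range(1, iters + 1):
        g = [lam * wj for wj in w]  # gradient of the ridge term
        for xi, yi in zip(X, y):
            if yi * sum(wj * xj for wj, xj in zip(w, xi)) < 1.0:
                for j in range(d):  # active hinge: add its subgradient
                    g[j] -= yi * xi[j]
        step = 1.0 / (lam * t)  # Pegasos-style decreasing step size
        w = [wj - step * gj for wj, gj in zip(w, g)]
    return w

# Tiny 1-D dataset: the points at x=1 and x=-1 sit on the margin,
# the points at x=2 and x=-3 are uninformative.
X = [[1.0], [2.0], [-1.0], [-3.0]]
y = [1, 1, -1, -1]

w_full = svm_subgradient_fit(X, y)

# Discard samples strictly outside the margin (with a tolerance that
# accounts for optimization error; true safe rules bound this rigorously).
kept = [(xi, yi) for xi, yi in zip(X, y)
        if yi * sum(wj * xj for wj, xj in zip(w_full, xi)) <= 1.05]
X_s, y_s = [p[0] for p in kept], [p[1] for p in kept]

w_screened = svm_subgradient_fit(X_s, y_s)
print(len(X_s), w_full, w_screened)  # both weights close to the optimum w* = 1
```

Refitting on the two kept samples recovers (up to optimization error) the same weight vector as fitting on all four, which is exactly the guarantee safe screening provides, except that genuine safe rules certify the discard a priori, without first solving the full problem.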


A Review of Deep Transfer Learning and Recent Advancements

A successful deep learning model is dependent on extensive training data...

Fast Adaptation with Linearized Neural Networks

The inductive biases of trained neural networks are difficult to underst...

Lorentz Group Equivariant Autoencoders

There has been significant work recently in developing machine learning ...

Syntactic Inductive Biases for Deep Learning Methods

In this thesis, we try to build a connection between the two schools by ...

The No Free Lunch Theorem, Kolmogorov Complexity, and the Role of Inductive Biases in Machine Learning

No free lunch theorems for supervised learning state that no learner can...

Parameterized Neural Networks for Finance

We discuss and analyze a neural network architecture, that enables learn...
