Multi-Objective Optimization for Sparse Deep Neural Network Training

by S. S. Hotegni et al.

Conflicting optimization criteria arise naturally in many Deep Learning scenarios. These can involve several main tasks (as in Multi-Task Learning), but also a main and a secondary task, such as loss minimization versus sparsity. The usual approach is a simple weighted sum of the criteria, which is formally guaranteed to recover all optimal trade-offs only in the convex setting. In this paper, we present a Multi-Objective Optimization algorithm based on a modified Weighted Chebyshev scalarization for training Deep Neural Networks (DNNs) with respect to several tasks. This scalarization can identify all Pareto-optimal solutions of the original problem, even in nonconvex settings, while reducing it to a sequence of single-objective problems. The resulting problems are solved with an Augmented Lagrangian method, which handles the constraints effectively and allows the use of popular optimizers such as Adam and Stochastic Gradient Descent. Our work addresses the (economic and ecological) sustainability of DNN models, with a particular focus on Deep Multi-Task models, which are typically designed with a very large number of weights so as to perform equally well on multiple tasks. Through experiments on two Machine Learning datasets, we demonstrate that the model can be adaptively sparsified during training without significantly degrading its performance, provided one is willing to apply task-specific adaptations to the network weights. Code is available at
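As a rough illustration of the core idea (not the paper's implementation), weighted Chebyshev scalarization replaces a vector of objectives f_1, ..., f_k with the single min-max problem min_x max_i w_i (f_i(x) - z_i*), where z* is the ideal point and the weights w select a point on the Pareto front. The sketch below, on two toy one-dimensional objectives, minimizes the nonsmooth max via subgradient descent on the currently active objective; the objective functions, learning rate, and step counts are illustrative choices, not values from the paper.

```python
import numpy as np

# Two toy convex objectives in one variable:
#   f1(x) = (x - 1)^2  (minimized at x = 1)
#   f2(x) = (x + 1)^2  (minimized at x = -1)
def f(x):
    return np.array([(x - 1.0) ** 2, (x + 1.0) ** 2])

def grad_f(x):
    return np.array([2.0 * (x - 1.0), 2.0 * (x + 1.0)])

def chebyshev_minimize(w, z_star, x0=0.0, lr=0.05, steps=2000):
    """Minimize max_i w_i * (f_i(x) - z_i*) by subgradient descent.

    The max is nonsmooth, so each step follows the gradient of the
    currently active (largest) weighted objective.
    """
    x = x0
    for _ in range(steps):
        vals = w * (f(x) - z_star)
        i = int(np.argmax(vals))       # active objective
        x -= lr * w[i] * grad_f(x)[i]  # subgradient step
    return x

# Ideal point: minimizing each f_i alone gives 0.
z_star = np.zeros(2)

# Sweeping the weights traces different Pareto-optimal trade-offs:
# equal weights balance the two minima; skewed weights move the
# solution toward the more heavily weighted objective's minimizer.
x_balanced = chebyshev_minimize(np.array([0.5, 0.5]), z_star)
x_skewed = chebyshev_minimize(np.array([0.9, 0.1]), z_star)
print(x_balanced, x_skewed)
```

In the paper's setting, the min-max problem is instead reformulated with an epigraph variable and inequality constraints, which the Augmented Lagrangian method handles while Adam or SGD performs the inner unconstrained updates.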

