Balanced Multimodal Learning via On-the-fly Gradient Modulation

03/29/2022
by Xiaokang Peng, et al.

Multimodal learning helps to comprehensively understand the world by integrating different senses. Accordingly, multiple input modalities are expected to boost model performance, but we find that they are not fully exploited even when the multimodal model outperforms its uni-modal counterpart. Specifically, in this paper we point out that existing multimodal discriminative models, in which a uniform objective is designed for all modalities, can leave uni-modal representations under-optimized because another modality dominates in some scenarios, e.g., sound in a blowing-wind event or vision in a drawing-picture event. To alleviate this optimization imbalance, we propose on-the-fly gradient modulation to adaptively control the optimization of each modality by monitoring the discrepancy between their contributions to the learning objective. Further, extra Gaussian noise that changes dynamically is introduced to avoid a possible generalization drop caused by gradient modulation. As a result, we achieve considerable improvement over common fusion methods on different multimodal tasks, and this simple strategy can also boost existing multimodal methods, which illustrates its efficacy and versatility. The source code is available at <https://github.com/GeWu-Lab/OGM-GE_CVPR2022>.
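The abstract does not give the modulation rule itself, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' implementation. It assumes two modalities (audio and visual), measures each modality's contribution as a positive per-batch confidence score for the correct class, damps the gradient of whichever modality currently dominates, and adds zero-mean Gaussian noise whose scale follows the current gradient. The function names, the `tanh`-based damping, the hyperparameter `alpha`, and the noise scale are all hypothetical choices made for illustration.

```python
import math
import random


def modulation_coefficients(score_a, score_v, alpha=0.5):
    """Return (k_a, k_v), the gradient scaling factors for two modalities.

    score_a, score_v: positive per-batch contribution scores (assumed here to
    be summed correct-class confidences). alpha is a hypothetical
    hyperparameter controlling how strongly the dominant modality is damped.
    """
    ratio_a = score_a / score_v  # > 1 means audio currently dominates
    ratio_v = score_v / score_a  # > 1 means visual currently dominates
    # Only the dominant modality is slowed down; the other keeps k = 1.
    k_a = 1.0 - math.tanh(alpha * ratio_a) if ratio_a > 1.0 else 1.0
    k_v = 1.0 - math.tanh(alpha * ratio_v) if ratio_v > 1.0 else 1.0
    return k_a, k_v


def modulate(grad, k, rng):
    """Scale a flat gradient by k and add zero-mean Gaussian noise.

    The noise standard deviation is set to the standard deviation of the
    current gradient (an assumption), so it changes from step to step.
    """
    mean = sum(grad) / len(grad)
    std = math.sqrt(sum((g - mean) ** 2 for g in grad) / len(grad))
    return [k * g + rng.gauss(0.0, std) for g in grad]


if __name__ == "__main__":
    rng = random.Random(0)
    # Audio is twice as confident as visual, so only audio is damped.
    k_a, k_v = modulation_coefficients(score_a=2.0, score_v=1.0)
    audio_grad = modulate([0.4, -0.2, 0.1], k_a, rng)
    visual_grad = modulate([0.05, -0.01, 0.02], k_v, rng)
    print(k_a, k_v, audio_grad, visual_grad)
```

In a training loop, a step like this would sit between the backward pass and the optimizer update, applied to the parameters of each modality's encoder. Recomputing the coefficients on every batch is what makes the modulation on-the-fly.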

research
11/14/2022

PMR: Prototypical Modal Rebalance for Multimodal Learning

Multimodal learning (MML) aims to jointly exploit the common priors of d...
research
08/21/2023

Deep Metric Loss for Multimodal Learning

Multimodal learning often outperforms its unimodal counterparts by explo...
research
05/10/2023

MMoT: Mixture-of-Modality-Tokens Transformer for Composed Multimodal Conditional Image Synthesis

Existing multimodal conditional image synthesis (MCIS) methods generate ...
research
09/07/2023

Multimodal Transformer for Material Segmentation

Leveraging information across diverse modalities is known to enhance per...
research
02/14/2023

Balanced Audiovisual Dataset for Imbalance Analysis

The imbalance problem is widespread in the field of machine learning, wh...
research
02/02/2023

MMRec: Simplifying Multimodal Recommendation

This paper presents an open-source toolbox, MMRec for multimodal recomme...
research
11/10/2020

Deep Multimodal Fusion by Channel Exchanging

Deep multimodal fusion by using multiple sources of data for classificat...
