Factorizing Knowledge in Neural Networks

07/04/2022
by Xingyi Yang, et al.

In this paper, we explore a novel and ambitious knowledge-transfer task, termed Knowledge Factorization (KF). The core idea of KF lies in the modularization and assemblability of knowledge: given a pretrained network model as input, KF aims to decompose it into several factor networks, each of which handles only a dedicated task and maintains the task-specific knowledge factorized from the source network. Such factor networks are task-wise disentangled and can be directly assembled, without any fine-tuning, to produce more competent combined-task networks. In other words, the factor networks serve as Lego-brick-like building blocks, allowing us to construct customized networks in a plug-and-play manner. Specifically, each factor network comprises two modules: a common-knowledge module that is task-agnostic and shared by all factor networks, along with a task-specific module dedicated to the factor network itself. We introduce an information-theoretic objective, InfoMax-Bottleneck (IMB), to carry out KF by optimizing the mutual information between the learned representations and the input. Experiments across various benchmarks demonstrate that the derived factor networks yield gratifying performances not only on the dedicated tasks but also on disentanglement, while enjoying much better interpretability and modularity. Moreover, the learned common-knowledge representations give rise to impressive results on transfer learning.
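The abstract describes a concrete structure: every factor network pairs one shared, task-agnostic common-knowledge module with its own task-specific module, and factor networks can be assembled into combined-task networks without fine-tuning. Below is a minimal PyTorch sketch of that structure. All class names, layer sizes, and the assembly logic are illustrative assumptions rather than the authors' implementation, and the IMB training objective itself is not shown here.

```python
# Hypothetical sketch of the factor-network idea, not the paper's code.
# Each factor network = shared common-knowledge module + task-specific head;
# assembly reuses the shared module and stacks the heads, with no retraining.
import torch
import torch.nn as nn


class FactorNetwork(nn.Module):
    """One factor network: shared common module plus a task-specific head."""

    def __init__(self, common: nn.Module, feat_dim: int, num_classes: int):
        super().__init__()
        self.common = common                 # task-agnostic, shared by all factors
        self.task_specific = nn.Sequential(  # dedicated to this task only
            nn.Linear(feat_dim, feat_dim), nn.ReLU(),
            nn.Linear(feat_dim, num_classes),
        )

    def forward(self, x):
        z_common = self.common(x)            # common-knowledge representation
        return self.task_specific(z_common)


class AssembledNetwork(nn.Module):
    """Plug-and-play combination: compute the shared representation once,
    then run every task-specific head on top of it."""

    def __init__(self, factors: list):
        super().__init__()
        # All factor networks are assumed to share the exact same common module.
        assert all(f.common is factors[0].common for f in factors)
        self.common = factors[0].common
        self.heads = nn.ModuleList([f.task_specific for f in factors])

    def forward(self, x):
        z_common = self.common(x)
        return [head(z_common) for head in self.heads]


# Usage sketch: two single-task factor networks sharing one common module,
# assembled into a two-task network without any fine-tuning.
common = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 256), nn.ReLU())
f_a = FactorNetwork(common, feat_dim=256, num_classes=10)
f_b = FactorNetwork(common, feat_dim=256, num_classes=5)
combined = AssembledNetwork([f_a, f_b])
logits_a, logits_b = combined(torch.randn(4, 3, 32, 32))
```

Because the common module is the same object inside every factor network, assembling them amounts to computing the shared representation once and attaching the task-specific heads, which is what makes Lego-style composition possible without retraining.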


Related research

03/22/2022 · Task-guided Disentangled Tuning for Pretrained Language Models
Pretrained language models (PLMs) trained on large-scale unlabeled corpu...

05/24/2023 · Lightweight Learner for Shared Knowledge Lifelong Learning
In Lifelong Learning (LL), agents continually learn as they encounter ne...

05/31/2022 · Compressed Hierarchical Representations for Multi-Task Learning and Task Clustering
In this paper, we frame homogeneous-feature multi-task learning (MTL) as...

10/18/2022 · Hidden State Variability of Pretrained Language Models Can Guide Computation Reduction for Transfer Learning
While transferring a pretrained language model, common approaches conven...

05/18/2021 · Exploring Driving-aware Salient Object Detection via Knowledge Transfer
Recently, general salient object detection (SOD) has made great progress...

06/13/2023 · TART: A plug-and-play Transformer module for task-agnostic reasoning
Large language models (LLMs) exhibit in-context learning abilities which...
