Explaining the Effectiveness of Multi-Task Learning for Efficient Knowledge Extraction from Spine MRI Reports

05/06/2022
by Arijit Sehanobish, et al.

Pretrained Transformer-based models fine-tuned on domain-specific corpora have changed the landscape of NLP. However, training or fine-tuning these models for individual tasks can be time-consuming and resource-intensive. Much current research therefore focuses on using transformers for multi-task learning (Raffel et al., 2020) and on how to group tasks so that a multi-task model learns effective representations that can be shared across tasks (Standley et al., 2020; Fifty et al., 2021). In this work, we show that a single multi-tasking model can match the performance of task-specific models when the task-specific models show similar representations across all of their hidden layers and their gradients are aligned, i.e., their gradients point in the same direction. We hypothesize that these two observations explain the effectiveness of multi-task learning. We validate our observations on our internal radiologist-annotated datasets on the cervical and lumbar spine. Our method is simple and intuitive, and can be used in a wide range of NLP problems.
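To make the notion of "aligned gradients" concrete, here is a minimal sketch of one common way to check it: the cosine similarity between flattened per-task gradient vectors. The abstract does not state which measure the authors use, so cosine similarity, the threshold of 0, the function names, and the toy task names and values below are all assumptions made for illustration only.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two flattened gradient vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here: a zero gradient is treated as unaligned
    return dot / (norm_u * norm_v)


def gradients_aligned(task_grads, threshold=0.0):
    """Return True if every pair of task gradients has cosine similarity above threshold.

    task_grads maps a (hypothetical) task name to its gradient, flattened
    into a list of floats taken with respect to the shared parameters.
    """
    names = list(task_grads)
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            sim = cosine_similarity(task_grads[names[i]], task_grads[names[j]])
            if sim <= threshold:
                return False
    return True


# Toy gradients (made-up numbers, not results from the paper).
aligned = {"task_a": [1.0, 2.0, 0.5], "task_b": [0.8, 1.5, 0.7]}
conflicting = {"task_a": [1.0, 0.0], "task_b": [-1.0, 0.2]}

print(gradients_aligned(aligned))
print(gradients_aligned(conflicting))
```

A positive cosine similarity means that a step which reduces one task's loss also tends to reduce the other's, which matches the intuition in the abstract for when a shared model can serve several tasks at once. In practice the gradients would come from backpropagation through the shared encoder rather than from hand-written lists.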
