Automatic Audio Captioning using Attention weighted Event based Embeddings

01/28/2022
by   Swapnil Bhosale, et al.

Automatic Audio Captioning (AAC) refers to the task of translating audio into natural language that describes the audio events, the sources of the events, and their relationships. The limited number of samples in current AAC datasets has set a trend of incorporating transfer learning with Audio Event Detection (AED) as a parent task. In this direction, we propose an encoder-decoder architecture with light-weight (i.e., with fewer learnable parameters) Bi-LSTM recurrent layers for AAC and compare the performance of two state-of-the-art pre-trained AED models as embedding extractors. Our results show that an efficient AED-based embedding extractor combined with temporal attention and augmentation techniques is able to outperform existing approaches that rely on computationally intensive architectures. Further, we provide evidence that the non-uniform attention-weighted encoding generated as part of our model helps the decoder glance over specific sections of the audio while generating each token.
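To make the idea of a non-uniform attention-weighted encoding concrete, here is a minimal sketch of temporal attention over per-frame audio embeddings. It is illustrative only: the abstract does not state how attention scores are computed, so the dot-product scoring, the function names `softmax` and `temporal_attention`, and the toy vectors are all assumptions rather than the authors' implementation.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def temporal_attention(frame_embeddings, decoder_state):
    # Assumption: dot-product scoring between each frame embedding and the
    # current decoder state; the paper may use a different scoring function.
    scores = [
        sum(f * d for f, d in zip(frame, decoder_state))
        for frame in frame_embeddings
    ]
    weights = softmax(scores)  # one non-uniform weight per time frame
    dim = len(frame_embeddings[0])
    # Context vector: attention-weighted sum of the frame embeddings.
    context = [
        sum(w * frame[i] for w, frame in zip(weights, frame_embeddings))
        for i in range(dim)
    ]
    return weights, context

# Toy example: three time frames of 2-dimensional embeddings (hypothetical values).
frames = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
state = [2.0, 0.0]
weights, context = temporal_attention(frames, state)
```

In a captioning decoder, a step like this would be repeated for every generated token, so the weights can shift toward different sections of the audio as the caption progresses.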


research
04/18/2022

Automated Audio Captioning using Audio Event Clues

Audio captioning is an important research area that aims to generate mea...
research
07/21/2021

Audio Captioning Transformer

Audio captioning aims to automatically generate a natural language descr...
research
09/01/2023

CoNeTTE: An efficient Audio Captioning system leveraging multiple datasets with Task Embedding

Automated Audio Captioning (AAC) involves generating natural language de...
research
06/02/2023

Enhance Temporal Relations in Audio Captioning with Sound Event Detection

Automated audio captioning aims at generating natural language descripti...
research
06/05/2020

Audio Captioning using Gated Recurrent Units

Audio captioning is a recently proposed task for automatically generatin...
research
10/12/2021

Improving the Performance of Automated Audio Captioning via Integrating the Acoustic and Semantic Information

Automated audio captioning (AAC) has developed rapidly in recent years, ...
research
02/23/2021

Investigating Local and Global Information for Automated Audio Captioning with Transfer Learning

Automated audio captioning (AAC) aims at generating summarizing descript...
