Recent advancements in audio generation have been spurred by the evoluti...
Automated audio captioning (AAC), which generates textual descriptions of...
The advancement of audio-language (AL) multimodal learning tasks has bee...
Automated audio captioning is a cross-modal translation task for describ...
This study defines a new evaluation metric for audio tagging tasks to ov...
Audio captioning is the task of generating captions that describe the co...
Automated audio captioning (AAC) aims to describe the content of an audi...
Recently, there has been increasing interest in building efficient audio...
Few-shot audio event detection is a task that detects the occurrence tim...
Few-shot bioacoustic event detection is a task that detects the occurren...
Automated audio captioning is a cross-modal translation task that aims t...
Audio-text retrieval aims at retrieving a target audio clip or caption f...
In this paper, we introduce the task of language-queried audio source se...
Acoustic scene classification (ASC) aims to classify an audio clip based...
Audio captioning aims at using natural language to describe the content ...
Audio captioning aims at generating natural language descriptions for au...
Automated audio captioning aims to use natural language to describe the ...
Automated audio captioning (AAC) is a cross-modal translation task that ...
Audio captioning aims to automatically generate a natural language descr...