We proposed Audio Difference Captioning (ADC) as a new extension task of...
Self-supervised learning general-purpose audio representations have demo...
This paper provides a baseline system for First-shot-compliant unsupervi...
Masked Autoencoders is a simple yet powerful self-supervised learning me...
We propose a novel framework for target speech extraction based on seman...
The amount of audio data available on public websites is growing rapidly...
Many application studies rely on audio DNN models pre-trained on a large...
Recent general-purpose audio representations show state-of-the-art perfo...
Pre-trained models are essential as feature extractors in modern machine...
In many situations, we would like to hear desired sound events (SEs) whi...
We tackle a challenging task: multi-view and multi-modal event detection...
Our goal is to develop a sound event localization and detection (SELD) s...
Inspired by the recent progress in self-supervised learning for computer...
The goal of audio captioning is to translate input audio into its descri...
The system we used for Task 6 (Automated Audio Captioning) of the Detecti...
This technical report describes the system participating in the Detectio...
Humans are able to imagine a person's voice from the person's appearance...