Xinhao Mei

research

∙ 09/19/2023

FoleyGen: Visually-Guided Audio Generation

Recent advancements in audio generation have been spurred by the evoluti...

0 Xinhao Mei, et al. ∙

research

∙ 05/30/2023

Dual Transformer Decoder based Features Fusion Network for Automated Audio Captioning

Automated audio captioning (AAC) which generates textual descriptions of...

0 Jianyuan Sun, et al. ∙

research

∙ 03/30/2023

WavCaps: A ChatGPT-Assisted Weakly-Labelled Audio Captioning Dataset for Audio-Language Multimodal Research

The advancement of audio-language (AL) multimodal learning tasks has bee...

0 Xinhao Mei, et al. ∙

research

∙ 12/05/2022

Towards Generating Diverse Audio Captions via Adversarial Training

Automated audio captioning is a cross-modal translation task for describ...

0 Xinhao Mei, et al. ∙

research

∙ 11/22/2022

Ontology-aware Learning and Evaluation for Audio Tagging

This study defines a new evaluation metric for audio tagging tasks to ov...

0 Haohe Liu, et al. ∙

research

∙ 10/28/2022

Visually-Aware Audio Captioning With Adaptive Audio-Visual Attention

Audio captioning is the task of generating captions that describe the co...

0 Xubo Liu, et al. ∙

research

∙ 10/10/2022

Automated Audio Captioning via Fusion of Low- and High- Dimensional Features

Automated audio captioning (AAC) aims to describe the content of an audi...

0 Jianyuan Sun, et al. ∙

research

∙ 10/03/2022

Simple Pooling Front-ends For Efficient Audio Classification

Recently, there has been increasing interest in building efficient audio...

19 Xubo Liu, et al. ∙

research

∙ 07/21/2022

Surrey System for DCASE 2022 Task 5: Few-shot Bioacoustic Event Detection with Segment-level Metric Learning

Few-shot audio event detection is a task that detects the occurrence tim...

0 Haohe Liu, et al. ∙

research

∙ 07/15/2022

Segment-level Metric Learning for Few-shot Bioacoustic Event Detection

Few-shot bioacoustic event detection is a task that detects the occurren...

10 Haohe Liu, et al. ∙

research

∙ 05/12/2022

Automated Audio Captioning: an Overview of Recent Progress and New Challenges

Automated audio captioning is a cross-modal translation task that aims t...

28 Xinhao Mei, et al. ∙

research

∙ 03/29/2022

On Metric Learning for Audio-Text Cross-Modal Retrieval

Audio-text retrieval aims at retrieving a target audio clip or caption f...

0 Xinhao Mei, et al. ∙

research

∙ 03/28/2022

Separate What You Describe: Language-Queried Audio Source Separation

In this paper, we introduce the task of language-queried audio source se...

4 Xubo Liu, et al. ∙

research

∙ 03/07/2022

Deep Neural Decision Forest for Acoustic Scene Classification

Acoustic scene classification (ASC) aims to classify an audio clip based...

7 Jianyuan Sun, et al. ∙

research

∙ 03/06/2022

Leveraging Pre-trained BERT for Audio Captioning

Audio captioning aims at using natural language to describe the content ...

13 Xubo Liu, et al. ∙

research

∙ 10/13/2021

Diverse Audio Captioning via Adversarial Training

Audio captioning aims at generating natural language descriptions for au...

0 Xinhao Mei, et al. ∙

research

∙ 08/05/2021

An Encoder-Decoder Based Audio Captioning System With Transfer and Reinforcement Learning

Automated audio captioning aims to use natural language to describe the ...

0 Xinhao Mei, et al. ∙

research

∙ 07/21/2021

CL4AC: A Contrastive Loss for Audio Captioning

Automated Audio captioning (AAC) is a cross-modal translation task that ...

0 Xubo Liu, et al. ∙

research

∙ 07/21/2021

Audio Captioning Transformer

Audio captioning aims to automatically generate a natural language descr...

0 Xinhao Mei, et al. ∙
