Accurate acquisition of crowd flow at Points of Interest (POIs) is pivot...
Augmentation and knowledge distillation (KD) are well-established techni...
Previously, Target Speaker Extraction (TSE) has yielded outstanding perf...
Visual information can serve as an effective cue for target speaker extr...
The currently most prominent algorithm to train keyword spotting (KWS) m...
Transformers have emerged as a prominent model framework for audio taggi...
Spatio-temporal prediction plays a critical role in smart city construct...
With the development of sophisticated sensors and large database technol...
Training a 3D scene understanding model requires complicated human annot...
Keyword spotting (KWS) is a core human-machine-interaction front-end tas...
The success of deep learning heavily relies on large-scale data with com...
Robust prediction of citywide traffic flows at different time periods pl...
Air pollution is a crucial issue affecting human health and livelihoods,...
Learning descriptive 3D features is crucial for understanding 3D scenes ...
We study the usability of pre-trained weakly supervised audio tagging (A...
Within the audio research community and the industry, keyword spotting (...
The success of deep learning is usually accompanied by the growth in neu...
A data augmentation module is utilized in contrastive learning to transf...
The data insufficiency problem (i.e., missing data and label scarcity issues...
Large-scale audio tagging datasets inevitably contain imperfect labels, ...
Urban metro flow prediction is of great value for metro operation schedu...
Weather forecasting is an attractive yet challenging task due to its influ...
Accurate forecasting of citywide traffic flow has been playing critical ...
This paper introduces GigaSpeech, an evolving, multi-domain English spee...
This paper introduces a new open-source speech corpus named "speechocean...
In the federated learning setting, multiple clients jointly train a mode...
This paper presents the "Ethiopian" system for the SLT 2021 Children Spe...
Convolutional Neural Networks (CNNs) have been widely adopted in raster-...
It is commonly observed that the data are scattered everywhere and diffi...
Urban spatial-temporal flow prediction is of great importance to traffi...
This paper focuses on two related subtasks of aspect-based sentiment ana...
Most real-world data are scattered across different companies or governm...
Being able to predict the crowd flows in each and every part of a city, ...
Urban flow monitoring systems play important roles in smart city efforts...
This letter proposes a predictor-corrector method to strike a balance be...
In this paper, we propose an attention-based end-to-end model for multi-...
In this paper, we propose a sequence-to-sequence model for keyword spott...
Spatio-temporal (ST) data, which represent multiple time series data cor...
In this paper, we propose an attention-based end-to-end neural approach ...
Speaker adaptation aims to estimate a speaker specific acoustic model fr...
We investigate the use of generative adversarial networks (GANs) in spee...
Recently, there has been an increasing interest in end-to-end speech rec...
Forecasting the flow of crowds is of great importance to traffic managem...
The rapid growth of emerging information technologies and application pa...
Forecasting the flow of crowds is of great importance to traffic managem...