Hisashi Kawai

Featured Co-authors

Yu Tsao
127 publications
Sheng Li
85 publications
Hiroshi Saruwatari
76 publications
Tomoki Toda
66 publications
Shinnosuke Takamichi
50 publications
Tomoki Hayashi
38 publications
Yi-Chiao Wu
29 publications
Szu-Wei Fu
26 publications
Xugang Lu
24 publications
Takayoshi Yamashita
23 publications
Komei Sugiura
21 publications

research

∙ 07/29/2022

Pronunciation-aware unique character encoding for RNN Transducer-based Mandarin speech recognition

For Mandarin end-to-end (E2E) automatic speech recognition (ASR) tasks, ...

0 Peng Shen, et al. ∙

research

∙ 04/22/2022

Speaking-Rate-Controllable HiFi-GAN Using Feature Interpolation

This paper presents a speaking-rate-controllable HiFi-GAN neural vocoder...

0 Detai Xin, et al. ∙

research

∙ 04/08/2022

Transducer-based language embedding for spoken language identification

The acoustic and linguistic features are important cues for the spoken l...

0 Peng Shen, et al. ∙

research

∙ 03/31/2022

Partial Coupling of Optimal Transport for Spoken Language Identification

In order to reduce domain discrepancy to improve the performance of cros...

0 Xugang Lu, et al. ∙

research

∙ 04/07/2021

Siamese Neural Network with Joint Bayesian Model Structure for Speaker Verification

Generative probability models are widely used for speaker verification (...

0 Xugang Lu, et al. ∙

research

∙ 03/01/2021

CrossMap Transformer: A Crossmodal Masked Path Transformer Using Double Back-Translation for Vision-and-Language Navigation

Navigation guided by natural language instructions is particularly suita...

0 Aly Magassouba, et al. ∙

research

∙ 02/12/2021

Predicting and Attending to Damaging Collisions for Placing Everyday Objects in Photo-Realistic Simulations

Placing objects is a fundamental task for domestic service robots (DSRs)...

11 Aly Magassouba, et al. ∙

research

∙ 01/09/2021

Integrating a joint Bayesian generative model in a discriminative learning framework for speaker verification

The task for speaker verification (SV) is to decide an utterance is spok...

0 Xugang Lu, et al. ∙

research

∙ 12/24/2020

Unsupervised neural adaptation model based on optimal transport for spoken language identification

Due to the mismatch of statistical distributions of acoustic speech betw...

0 Xugang Lu, et al. ∙

research

∙ 07/25/2020

Quasi-Periodic Parallel WaveGAN: A Non-autoregressive Raw Waveform Generative Model with Pitch-dependent Dilated Convolution Neural Network

In this paper, we propose a quasi-periodic parallel WaveGAN (QPPWG) wave...

0 Yi-Chiao Wu, et al. ∙

research

∙ 07/09/2020

Alleviating the Burden of Labeling: Sentence Generation by Attention Branch Encoder-Decoder Network

Domestic service robots (DSRs) are a promising solution to the shortage ...

0 Tadashi Ogura, et al. ∙

research

∙ 05/18/2020

Quasi-Periodic Parallel WaveGAN Vocoder: A Non-autoregressive Pitch-dependent Dilated Convolution Model for Parametric Speech Generation

In this paper, we propose a parallel WaveGAN (PWG)-like neural vocoder w...

0 Yi-Chiao Wu, et al. ∙

research

∙ 12/27/2019

Deep progressive multi-scale attention for acoustic event classification

Convolutional neural network (CNN) is an indispensable building block fo...

0 Xugang Lu, et al. ∙

research

∙ 12/23/2019

A Multimodal Target-Source Classifier with Attention Branches to Understand Ambiguous Instructions for Fetching Daily Objects

In this study, we focus on multimodal language understanding for fetchin...

2 Aly Magassouba, et al. ∙

research

∙ 09/10/2019

Multimodal Attention Branch Network for Perspective-Free Sentence Generation

In this paper, we address the automatic sentence generation of fetching ...

0 Aly Magassouba, et al. ∙

research

∙ 06/17/2019

Understanding Natural Language Instructions for Fetching Daily Objects Using GAN-Based Multimodal Target-Source Classification

In this paper, we address multimodal language understanding for unconstr...

0 Aly Magassouba, et al. ∙

research

∙ 04/30/2019

Incorporating Symbolic Sequential Modeling for Speech Enhancement

In a noisy environment, a lossy speech signal can be automatically resto...

0 Chien-Feng Liao, et al. ∙

research

∙ 06/11/2018

A Multimodal Classifier Generative Adversarial Network for Carry and Place Tasks from Ambiguous Language Instructions

This paper focuses on a multimodal language understanding method for car...

0 Aly Magassouba, et al. ∙

research

∙ 01/16/2018

Grounded Language Understanding for Manipulation Instructions Using GAN-Based Classification

The target task of this study is grounded language understanding for dom...

0 Komei Sugiura, et al. ∙

research

∙ 09/12/2017

End-to-End Waveform Utterance Enhancement for Direct Evaluation Metrics Optimization by Fully Convolutional Neural Networks

Speech enhancement model is used to map a noisy speech to a clean speech...

0 Szu-Wei Fu, et al. ∙

research

∙ 03/07/2017

Raw Waveform-based Speech Enhancement by Fully Convolutional Networks

This study proposes a fully convolutional network (FCN) model for raw wa...

0 Szu-Wei Fu, et al. ∙
