Speech recognition for medical conversations

11/20/2017
by Chung-Cheng Chiu, et al.

In this paper we document our experiences with developing speech recognition for medical transcription - a system that automatically transcribes doctor-patient conversations. Towards this goal, we built a system along two different methodological lines - a Connectionist Temporal Classification (CTC) phoneme-based model and a Listen Attend and Spell (LAS) grapheme-based model. To train these models we used a corpus of anonymized conversations representing approximately 14,000 hours of speech. Because of noisy transcripts and alignments in the corpus, a significant amount of effort was invested in data cleaning. We describe a two-stage strategy we followed for segmenting the data. The data cleanup and the development of a matched language model were essential to the success of the CTC-based models. The LAS-based models, however, were found to be resilient to alignment and transcript noise and did not require the use of language models. The CTC models were able to achieve a word error rate of 20.1%. Our analysis shows that both models perform well on important medical utterances and can therefore be practical for transcribing medical conversations.
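To make the CTC-based approach concrete, below is a minimal, hypothetical sketch of a phoneme-level acoustic model trained with a CTC loss in PyTorch. It is not the paper's actual architecture; the layer sizes, feature dimension, and phoneme inventory are illustrative assumptions only.

```python
# Minimal sketch of a CTC-trained acoustic model (illustrative, not the paper's model).
import torch
import torch.nn as nn

NUM_PHONEMES = 42   # assumed phoneme inventory size (blank symbol added separately)
FEAT_DIM = 80       # assumed log-mel filterbank dimension per frame

class CTCAcousticModel(nn.Module):
    def __init__(self, hidden=256):
        super().__init__()
        # Bidirectional LSTM encoder over acoustic frames.
        self.encoder = nn.LSTM(FEAT_DIM, hidden, num_layers=3,
                               bidirectional=True, batch_first=True)
        # Per-frame projection to phoneme (+ blank) logits.
        self.output = nn.Linear(2 * hidden, NUM_PHONEMES + 1)

    def forward(self, feats):                     # feats: (batch, time, FEAT_DIM)
        enc, _ = self.encoder(feats)
        return self.output(enc).log_softmax(-1)   # (batch, time, NUM_PHONEMES + 1)

model = CTCAcousticModel()
ctc_loss = nn.CTCLoss(blank=NUM_PHONEMES)         # last index reserved for the blank

feats = torch.randn(4, 300, FEAT_DIM)             # dummy batch of 4 utterances
targets = torch.randint(0, NUM_PHONEMES, (4, 50)) # dummy phoneme label sequences
input_lengths = torch.full((4,), 300, dtype=torch.long)
target_lengths = torch.full((4,), 50, dtype=torch.long)

log_probs = model(feats).transpose(0, 1)          # CTCLoss expects (time, batch, classes)
loss = ctc_loss(log_probs, targets, input_lengths, target_lengths)
loss.backward()
```

Because CTC emits frame-level phoneme posteriors, decoding such a model into words typically relies on a pronunciation lexicon and language model, which is consistent with the paper's observation that a matched language model was essential for the CTC-based system, whereas the grapheme-based LAS model decodes directly to text.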


