RTTD-ID: Tracked Captions with Multiple Speakers for Deaf Students

by Raja Kushalnagar, et al.

Students who are deaf or hard of hearing cannot hear in class and do not have full access to spoken information. They can use accommodations such as captions that display speech as text. However, captions do not give these students access equal to that of their hearing peers: while they are focused on reading captions on their tablet, they cannot see who is talking. This viewing isolation contributes to student frustration and raises the risk of doing poorly in, or withdrawing from, introductory engineering courses with lab components. It also contributes to their lack of inclusion and sense of belonging. We report on the evaluation of a Real-Time Text Display with Speaker-Identification (RTTD-ID), which displays the location of a speaker in a group. RTTD-ID aims to reduce frustration in identifying and following the active speaker when there are multiple speakers, e.g., in a lab. It has three different display schemes to identify the location of the active speaker, which help deaf students view both the speaker's words and the speaker's expressions and actions. We evaluated three RTTD speaker identification methods: 1) traditional: captions stay in one place and viewers search for the speaker, 2) pointer: captions stay in one place and a pointer to the speaker is displayed, and 3) pop-up: captions "pop up" next to the speaker. We gathered both quantitative and qualitative information through evaluations with deaf and hard of hearing users. The users preferred the pointer identification method over the traditional and pop-up methods.




Speaker-independent machine lip-reading with speaker-dependent viseme classifiers

In machine lip-reading, which is identification of speech from visual-on...

A Real-time Speaker Diarization System Based on Spatial Spectrum

In this paper we describe a speaker diarization system that enables loca...

Speaker Diarization and Identification from Single-Channel Classroom Audio Recording Using Virtual Microphones

Speaker identification in noisy audio recordings, specifically those fro...

Identify Speakers in Cocktail Parties with End-to-End Attention

In scenarios where multiple speakers talk at the same time, it is import...

Speaker-specific Thresholding for Robust Imposter Identification in Unseen Speaker Recognition

Speaker identification systems are deployed in diverse environments, oft...

See-Through Captions: Real-Time Captioning on Transparent Display for Deaf and Hard-of-Hearing People

Real-time captioning is a useful technique for deaf and hard-of-hearing ...

Trainable Referring Expression Generation using Overspecification Preferences

Referring expression generation (REG) models that use speaker-dependent ...
