Audiovisual Database with 360 Video and Higher-Order Ambisonics Audio for Perception, Cognition, Behavior, and QoE Evaluation Research

12/27/2022
by Thomas Robotham, et al.

Research into multi-modal perception, human cognition, behavior, and attention can benefit from high-fidelity content that recreates lifelike scenes when rendered on head-mounted displays. Moreover, measures of audiovisual perception, cognitive processes, and behavior may complement questionnaire-based Quality of Experience (QoE) evaluation of interactive virtual environments. Currently, there is a lack of high-quality open-source audiovisual databases that can be used to evaluate such aspects, or to evaluate systems capable of reproducing high-quality content. With this paper, we provide a publicly available audiovisual database consisting of twelve scenes capturing real-life nature and urban environments, with a video resolution of 7680×3840 at 60 frames per second and with 4th-order Ambisonics audio. These 360 video sequences, with an average duration of 60 seconds, represent real-life settings for systematically evaluating various dimensions of uni-modal and multi-modal perception, cognition, behavior, and QoE. The paper details the scene requirements, the recording approach, and the individual scenes. The database provides high-quality reference material with a balanced focus on auditory and visual sensory information. It will be continuously updated with additional scenes and further metadata such as human ratings and saliency information.
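To give a sense of scale for the figures quoted in the abstract, the following minimal sketch works through the arithmetic. It relies on the standard relation that a full-sphere Ambisonics signal of order N carries (N + 1)^2 channels. The function and constant names are illustrative and do not come from the paper or the database, and the frame count assumes a scene of exactly the stated average duration.

```python
# Back-of-the-envelope figures derived from the abstract's stated specs.
# All names here are hypothetical helpers, not part of the published database.

def ambisonics_channels(order: int) -> int:
    """Channel count of a full-sphere (3D) Ambisonics signal: (order + 1)^2."""
    if order < 0:
        raise ValueError("Ambisonics order must be non-negative")
    return (order + 1) ** 2


def pixels_per_second(width: int, height: int, fps: int) -> int:
    """Uncompressed pixel throughput of a video stream."""
    return width * height * fps


# Values taken from the abstract.
WIDTH, HEIGHT = 7680, 3840   # video resolution in pixels
FPS = 60                     # frames per second
AMBISONICS_ORDER = 4         # 4th-order Ambisonics audio
AVG_DURATION_S = 60          # average scene duration in seconds

audio_channels = ambisonics_channels(AMBISONICS_ORDER)
pixels_per_frame = WIDTH * HEIGHT
pixel_rate = pixels_per_second(WIDTH, HEIGHT, FPS)
frames_per_scene = FPS * AVG_DURATION_S  # assumes a scene of average length

print(f"Audio channels at order {AMBISONICS_ORDER}: {audio_channels}")
print(f"Pixels per frame: {pixels_per_frame:,}")
print(f"Pixels per second: {pixel_rate:,}")
print(f"Frames in an average-length scene: {frames_per_scene:,}")
```

Under these assumptions, each scene pairs 25 audio channels with roughly 1.77 billion pixels per second of video before compression, which illustrates why playback systems for such material are themselves a subject of evaluation.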


research
09/04/2021

Multi-modal Representation Learning for Video Advertisement Content Structuring

Video advertisement content structuring aims to segment a given video ad...
research
07/09/2020

ThreeDWorld: A Platform for Interactive Multi-Modal Physical Simulation

We introduce ThreeDWorld (TDW), a platform for interactive multi-modal p...
research
03/22/2023

Dense-Localizing Audio-Visual Events in Untrimmed Videos: A Large-Scale Benchmark and Baseline

Existing audio-visual event localization (AVE) handles manually trimmed ...
research
01/25/2021

Using Angle of Arrival for Improving Indoor Localization

In this paper, we primarily explore the improvement of single stream aud...
research
07/27/2022

Learning to Assess Danger from Movies for Cooperative Escape Planning in Hazardous Environments

There has been a plethora of work towards improving robot perception and...
research
06/15/2021

Is this Harmful? Learning to Predict Harmfulness Ratings from Video

Automatically identifying harmful content in video is an important task ...
research
09/03/2022

Synthesizing Photorealistic Virtual Humans Through Cross-modal Disentanglement

Over the last few decades, many aspects of human life have been enhanced...
