Toward Automated Classroom Observation: Multimodal Machine Learning to Estimate CLASS Positive Climate and Negative Climate

by Anand Ramakrishnan, et al.

In this work we present ACORN, a multimodal machine learning system that analyzes videos of school classrooms for the Positive Climate (PC) and Negative Climate (NC) dimensions of the CLASS observation protocol, which is widely used in educational research. ACORN uses convolutional neural networks to analyze spectral audio features, the faces of teachers and students, and the pixels of each image frame, and then integrates this information over time using Temporal Convolutional Networks. The PC and NC predictions of the audiovisual ACORN have Pearson correlations of 0.55 and 0.63 with ground-truth scores provided by expert CLASS coders on the UVA Toddler dataset (cross-validation on n=300 15-min video segments), and a purely auditory ACORN predicts PC and NC with correlations of 0.36 and 0.41 on the MET dataset (test set of n=2000 video segments). These numbers are similar to the inter-coder reliability of human coders. Finally, using Graph Convolutional Networks, we make early strides (AUC=0.70) toward predicting the specific moments (45-90 sec clips) when the PC is particularly weak or strong. Our findings inform the design of automatic classroom observation systems and also of more general video activity recognition and summary recognition systems.
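The Pearson correlations quoted above compare per-segment model predictions with expert coder scores. As a minimal sketch of that metric only (not the authors' evaluation code), the function below computes Pearson's r in pure Python. The name `pearson_r` and the example scores are hypothetical and chosen purely for illustration.

```python
import math


def pearson_r(predicted, expert):
    """Pearson correlation between two equal-length sequences of scores."""
    if len(predicted) != len(expert) or len(predicted) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_e = sum(expert) / n
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((p - mean_p) * (e - mean_e) for p, e in zip(predicted, expert))
    # Sums of squared deviations (unnormalized variances).
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_e = sum((e - mean_e) ** 2 for e in expert)
    if var_p == 0 or var_e == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_p * var_e)


# Hypothetical per-segment scores, for illustration only (not data from the paper).
model_scores = [1, 2, 3, 4]
coder_scores = [1, 3, 2, 4]
r = pearson_r(model_scores, coder_scores)
```

Because r is invariant to shifting and positive rescaling of either sequence, it measures how well predictions track the ordering and relative spacing of the expert scores rather than whether they match in absolute value.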
