Quantified Sleep: Machine learning techniques for observational n-of-1 studies

05/14/2021
by Gianluca Truda, et al.

This paper applies statistical learning techniques to an observational Quantified-Self (QS) study to build a descriptive model of sleep quality. A total of 472 days of my sleep data was collected with an Oura ring and combined with lifestyle, environmental, and psychological data. Such n-of-1 QS projects pose a number of challenges: heterogeneous data sources, missing values, high dimensionality, dynamic feedback loops, and human biases. This paper directly addresses these challenges with an end-to-end QS pipeline that produces robust descriptive models. Sleep quality is one of the most difficult modelling targets in QS research, due to high noise and a large number of weakly-contributing factors. Sleep quality was selected so that approaches from this paper would generalise to most other n-of-1 QS projects. Techniques are presented for combining and engineering features for the different classes of data types, sample frequencies, and schemas, including event logs, weather, and geo-spatial data. Statistical analyses for outliers, normality, (auto)correlation, stationarity, and missing data are detailed, along with a proposed method for hierarchical clustering to identify correlated groups of features. Missing data were handled using a combination of knowledge-based and statistical techniques, including several multivariate imputation algorithms. "Markov unfolding" is presented for collapsing the time series into a collection of independent observations, whilst incorporating historical information. The final model was interpreted in two ways: by inspecting the internal β-parameters, and by using the SHAP framework. These two interpretation techniques were combined to produce a list of the 16 most-predictive features, demonstrating that an observational study can greatly narrow down the number of features that need to be considered when designing interventional QS studies.
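The abstract does not spell out how "Markov unfolding" works, so the following is only a minimal sketch of one plausible reading: each day's row is extended with copies of the previous few days' values, so that every row can be treated as a standalone observation that still carries recent history. The function name `markov_unfold`, the `lags` parameter, the `_lag{k}` suffix, and the example columns (`sleep_score`, `caffeine_mg`) are all hypothetical and not taken from the paper.

```python
from typing import Dict, List

Row = Dict[str, float]


def markov_unfold(rows: List[Row], lags: int) -> List[Row]:
    """Hypothetical sketch: flatten a daily series into rows with lagged copies.

    Each output row holds day t's values plus the values from the previous
    `lags` days, stored under keys suffixed with `_lag{k}`. The first `lags`
    days are dropped because they have no complete history.
    """
    if lags < 1:
        raise ValueError("lags must be >= 1")
    unfolded: List[Row] = []
    for t in range(lags, len(rows)):
        out = dict(rows[t])  # values for the current day
        for k in range(1, lags + 1):
            # copy every feature from k days earlier under a lagged name
            for name, value in rows[t - k].items():
                out[f"{name}_lag{k}"] = value
        unfolded.append(out)
    return unfolded


# Illustrative values only, not data from the study.
days = [
    {"sleep_score": 70.0, "caffeine_mg": 200.0},
    {"sleep_score": 82.0, "caffeine_mg": 0.0},
    {"sleep_score": 75.0, "caffeine_mg": 100.0},
    {"sleep_score": 68.0, "caffeine_mg": 300.0},
]
table = markov_unfold(days, lags=2)
```

With four days and two lags, the sketch yields two rows, each with six columns. The cost of this design is that the feature count grows linearly with the number of lags, which worsens the high-dimensionality problem the abstract already lists as a challenge.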

research
03/30/2020

Imputation of missing sub-hourly precipitation data in a large sensor network: a machine learning approach

Precipitation data from rain gauges is fundamental across many lines of ...
research
10/15/2019

Collection of Historical Weather Data: Issues with Missing Values

Weather data collected from automated weather stations have become a cru...
research
04/29/2020

Framework for the Treatment And Reporting of Missing data in Observational Studies: The TARMOS framework

Missing data are ubiquitous in medical research. Although there is incre...
research
11/29/2021

Validating CircaCP: a Generic Sleep-Wake Cycle Detection Algorithm

Sleep-wake cycle detection is a key step when extrapolating sleep patter...
research
04/17/2023

Signal Processing Grand Challenge 2023 – e-Prevention: Sleep Behavior as an Indicator of Relapses in Psychotic Patients

This paper presents the approach and results of USC SAIL's submission to...
research
04/09/2019

Time-Series Analysis via Low-Rank Matrix Factorization: Applied to Infant-Sleep Data

We propose a nonparametric model for time series with missing data based...
research
09/30/2022

Modelling and classifying joint trajectories of self-reported mood and pain in a large cohort study

It is well-known that mood and pain interact with each other, however in...
