Data-driven discovery of core sleep biomarkers for predicting early cardiometabolic risk in a healthy population using machine learning


Abstract

Background

Identifying robust biomarkers for future cardiometabolic risk within the crucial “preventive window” in healthy individuals remains a major challenge. While numerous sleep metrics are linked to health, their hierarchical importance is unknown. This study aimed to leverage a data-driven machine learning paradigm to move beyond conventional metrics and objectively identify the core sleep-related physiological drivers for predicting the transition to early-stage cardiometabolic risk.

Methods

We conducted a longitudinal analysis of 447 initially healthy participants from the Sleep Heart Health Study (SHHS). Following a rigorous data quality audit in which high-missingness variables (e.g., heart rate variability) were excluded, a LASSO (L1-regularized) logistic regression model was trained on 16 high-quality clinical and polysomnographic features to perform data-driven biomarker selection. The final models were evaluated using 10 repeats of 10-fold cross-validation and compared using paired t-tests.
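The selection step described above relies on the L1 penalty shrinking uninformative coefficients to exactly zero. A minimal sketch of that mechanism follows, assuming a plain proximal gradient solver written with the Python standard library only. The function names, the hyperparameters (`lam`, `lr`, `n_iter`) and the toy data are hypothetical illustrations, not the study's implementation or the SHHS data.

```python
import math

def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink value toward zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_l1_logistic(X, y, lam, lr=0.1, n_iter=500):
    """L1-penalized logistic regression via proximal gradient descent.

    Returns (weights, intercept). The intercept is not penalized.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    b = 0.0
    for _ in range(n_iter):
        # Residuals: predicted probability minus observed label
        residuals = [
            sigmoid(sum(wj * xj for wj, xj in zip(w, row)) + b) - yi
            for row, yi in zip(X, y)
        ]
        grad_b = sum(residuals) / n
        grad_w = [
            sum(r * row[j] for r, row in zip(residuals, X)) / n
            for j in range(p)
        ]
        # Gradient step on the log-loss, then soft-thresholding for the L1 term
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

# Toy data (hypothetical, not SHHS): column 0 separates the two classes,
# column 1 is uncorrelated with the outcome.
X_toy = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
y_toy = [1, 1, 0, 0]

weights, intercept = fit_l1_logistic(X_toy, y_toy, lam=0.1)
selected = [j for j, wj in enumerate(weights) if wj != 0.0]
print("weights:", weights)
print("selected feature indices:", selected)
```

Features whose coefficient is exactly zero after fitting are dropped, which is how a 16-feature input can reduce to a smaller predictor set. In practice the penalty strength would be tuned by cross-validation and an established library solver would normally replace this hand-written loop.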

Findings

LASSO regression identified a parsimonious set of six core predictors. Notably, respiratory disturbance index (RDI) and minimum nocturnal oxygen saturation (min_spo2) emerged as the key biomarkers, superseding traditional sleep fragmentation metrics like the arousal index. In the primary cross-validation analysis, the lean LASSO model demonstrated the strongest predictive performance (mean AUC = 0.698), statistically outperforming a complex model with all 16 features (mean AUC = 0.669, p < 0.0001). This advantage remained robust in high-risk subgroups.
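The comparison above rests on two computations: an AUC for each cross-validation fold and a paired t-test across the folds. The sketch below shows both using only the Python standard library. The helper names and the per-fold AUC values are hypothetical and are not the study's results. If every fold of a 10 x 10 repeated cross-validation is treated as one pair (an assumption about the analysis), the test would use 100 paired values.

```python
import math
from statistics import mean, stdev

def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

def paired_t_statistic(a, b):
    """Return (t statistic, degrees of freedom) for paired samples a and b."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1

# AUC for one hypothetical fold: 3 of 4 positive/negative pairs are ordered correctly
fold_auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
print("fold AUC:", fold_auc)

# Hypothetical per-fold AUCs for two models evaluated on the same folds
auc_sparse = [0.70, 0.72, 0.68, 0.71]
auc_full = [0.66, 0.69, 0.66, 0.67]
t_stat, dof = paired_t_statistic(auc_sparse, auc_full)
print("t =", round(t_stat, 2), "df =", dof)
```

The p-value is then read from a t distribution with the returned degrees of freedom, for which a statistics library would typically be used.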

Interpretation

Our data-driven approach reveals that physiological stress directly linked to sleep-disordered breathing and nocturnal hypoxemia, rather than general sleep fragmentation, is the primary driver of the transition towards early cardiometabolic risk in healthy individuals. This finding provides specific, translatable targets for precision preventive medicine, points towards novel mechanisms for early risk development, and offers a blueprint for developing next-generation screening tools, potentially integrated into wearable technology.
