Validation of a Bayesian Learning Model to Predict the Risk for Cannabis Use Disorder
Background: Cannabis use disorder (CUD) is a growing public health problem. Early identification of adolescents and young adults at risk of developing CUD may help stem this trend. A logistic regression model fitted using a Bayesian learning approach was recently developed to predict the risk of future CUD from seven risk factors in adolescence and youth. A nationally representative longitudinal dataset, Add Health, was used to train the model (henceforth referred to as the Add Health model). Methods: We validated the Add Health model on two cohorts, the Michigan Longitudinal Study (MLS) and the Christchurch Health and Development Study (CHDS), using longitudinal data from participants until they were approximately 30 years old (to be consistent with the training data from Add Health). A participant diagnosed with CUD at any age during this period was considered a case. We calculated the area under the curve (AUC) and the ratio of the expected to the observed number of cases (E/O). We also explored recalibrating the model to account for differences in population prevalence. Results: The cohort sizes used for validation were 424 (53 cases) for MLS and 637 (105 cases) for CHDS. The AUCs for the two cohorts were 0.66 (MLS) and 0.73 (CHDS), and the corresponding E/O ratios (after recalibration) were 0.995 and 0.999. Conclusion: The external validation of the Add Health model on two different cohorts lends confidence to the model's ability to identify adolescent or young adult cannabis users at high risk of developing CUD in later life.
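The two validation metrics named in the Methods can be illustrated with a minimal Python sketch. It is not the authors' code: the toy outcomes and predicted risks are invented for illustration, the function names are hypothetical, and the recalibration shown (a single intercept shift on the logit scale, often called calibration-in-the-large) is an assumption, since the abstract does not state which recalibration method was used.

```python
import math

def auc(y_true, y_prob):
    """Chance that a random case gets a higher predicted risk than a
    random non-case; ties count as one half."""
    cases = [p for y, p in zip(y_true, y_prob) if y == 1]
    controls = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for n in controls:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(controls))

def expected_over_observed(y_true, y_prob):
    """E/O: sum of predicted risks divided by the observed number of cases."""
    return sum(y_prob) / sum(y_true)

def recalibrate_intercept(y_true, y_prob, tol=1e-10):
    """Assumed method: shift every prediction by one constant on the logit
    scale so that the expected number of cases matches the observed number.
    The shift is found by bisection; risks must lie strictly in (0, 1)."""
    observed = sum(y_true)
    logits = [math.log(p / (1.0 - p)) for p in y_prob]

    def expit(x):
        return 1.0 / (1.0 + math.exp(-x))

    low, high = -20.0, 20.0
    while high - low > tol:
        mid = (low + high) / 2.0
        expected = sum(expit(l + mid) for l in logits)
        if expected > observed:
            high = mid
        else:
            low = mid
    shift = (low + high) / 2.0
    return [expit(l + shift) for l in logits]

# Toy data (invented): 1 = diagnosed with CUD during follow-up, 0 = not.
outcomes = [0, 0, 1, 0, 1, 1, 0, 0]
risks = [0.10, 0.30, 0.35, 0.40, 0.60, 0.80, 0.20, 0.50]

auc_before = auc(outcomes, risks)
eo_before = expected_over_observed(outcomes, risks)
recalibrated = recalibrate_intercept(outcomes, risks)
auc_after = auc(outcomes, recalibrated)
eo_after = expected_over_observed(outcomes, recalibrated)
```

Because an intercept shift preserves the ranking of participants, it changes the E/O ratio but leaves the AUC unchanged. Under this assumed method, that would be consistent with E/O ratios close to 1 after recalibration alongside AUCs that still reflect the original model's discrimination.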