A Bayesian framework for case-cohort Cox regression: application to dietary epidemiology
The case-cohort study design bypasses resource constraints by collecting certain expensive covariates for only a small subset of the full cohort. Weighted Cox regression is the most widely used approach for analyzing case-cohort data within the Cox model, but is inefficient. Alternative approaches based on multiple imputation and nonparametric maximum likelihood suffer from incompatibility and computational issues respectively. We introduce a novel Bayesian framework for case-cohort Cox regression that avoids the aforementioned problems. Users can include auxiliary variables to help predict the unmeasured expensive covariates with a prediction model of their choice, while the models for the nuisance parameters are nonparametrically specified and integrated out. Posterior sampling can be carried out using procedures based on the pseudo-marginal MCMC algorithm. The method scales effectively to large, complex datasets, as demonstrated in our application: investigating the associations between saturated fatty acids and type 2 diabetes using the EPIC-Norfolk study. As part of our analysis, we also develop a new approach for handling compositional data in the Cox model, leading to more reliable and interpretable results compared to previous studies. The performance of our method is illustrated with extensive simulations, including the use of synthetic data generated by resampling from the application dataset. The code used to produce the results in this paper can be found at https://github.com/andrewyiu/bayes_cc .
READ FULL TEXT