Explainable Censored Learning: Finding Critical Features with Long Term Prognostic Values for Survival Prediction
Interpreting the critical variables involved in complex biological processes related to survival time can help us understand predictions from survival models, evaluate treatment efficacy, and develop new therapies for patients. Although the predictive results of deep learning (DL)-based models are currently better than or as good as those of standard survival methods, DL models are often disregarded because of their lack of transparency and limited interpretability, which are crucial to their adoption in clinical applications. In this paper, we introduce a novel, easily deployable approach, called EXplainable CEnsored Learning (EXCEL), that iteratively identifies critical variables and simultaneously trains a DL model based on these variables. First, on a toy dataset, we illustrate the principle of EXCEL; then, we mathematically analyze the proposed method and derive and prove tight generalization error bounds; next, on two semi-synthetic datasets, we show that EXCEL has good noise robustness and stability; finally, we apply EXCEL to a variety of real-world survival datasets, including clinical and genetic data, demonstrating that it can effectively identify critical features and achieve performance on par with or better than the original models. It is worth pointing out that EXCEL can be flexibly deployed in existing or emerging models for explainable modeling of survival data in the presence of right censoring.
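The abstract does not spell out the training loop, so the following is only a rough illustrative sketch of the general idea (alternating between fitting a censored survival model and pruning low-importance features), not the authors' EXCEL implementation. It stands in a simple linear Cox partial-likelihood model for the DL model, uses coefficient magnitude as a proxy importance score, and simulates toy right-censored data; the function names (`cox_neg_log_likelihood_grad`, `iterative_feature_selection`), the pruning fraction `keep_frac`, and all hyperparameters are hypothetical choices for this sketch.

```python
import numpy as np

def cox_neg_log_likelihood_grad(X, time, event, beta):
    """Gradient of the negative Cox partial log-likelihood for right-censored data."""
    risk = X @ beta                        # linear risk scores
    exp_risk = np.exp(risk - risk.max())   # numerically stabilised exponentials
    grad = np.zeros_like(beta)
    for i in np.where(event == 1)[0]:      # only uncensored samples contribute an event term
        at_risk = time >= time[i]          # risk set: still under observation at t_i
        w = exp_risk[at_risk]
        grad += (w @ X[at_risk]) / w.sum() - X[i]
    return grad / max(event.sum(), 1)

def iterative_feature_selection(X, time, event, keep_frac=0.5, rounds=3,
                                lr=0.5, steps=200):
    """Alternate between fitting a censored model and pruning low-importance features."""
    active = np.arange(X.shape[1])         # indices of currently retained features
    for _ in range(rounds):
        beta = np.zeros(active.size)
        for _ in range(steps):             # gradient descent on the partial likelihood
            beta -= lr * cox_neg_log_likelihood_grad(X[:, active], time, event, beta)
        importance = np.abs(beta)          # proxy importance: coefficient magnitude
        k = max(1, int(keep_frac * active.size))
        active = active[np.argsort(importance)[::-1][:k]]
    return active                          # indices of the surviving "critical" features

# Toy right-censored data: features 0 and 3 drive the hazard, the rest are noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 20))
true_time = rng.exponential(np.exp(-(X[:, 0] - 0.8 * X[:, 3])))
censor_time = rng.exponential(2.0, size=300)
time = np.minimum(true_time, censor_time)
event = (true_time <= censor_time).astype(int)
print(iterative_feature_selection(X, time, event))
```

In this toy setup the loop should retain the two informative features; the paper's actual method replaces the linear model with a DL survival model and derives its own importance criterion.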