Simultaneous Modeling of Disease Screening and Severity Prediction: A Multi-task and Sparse Regularization Approach
Disease prediction is one of the central problems in biostatistical research. Some biomarkers are not only helpful in diagnosing and screening diseases but also associated with the severity of the diseases. A prediction model that can estimate severity at the diagnosis or screening stage would therefore be useful, for example for treatment prioritization. We focus on solving the combined tasks of screening and severity prediction, considering a combined response variable such as {healthy, mild, intermediate, severe}. This type of response variable is ordinal, but since the two tasks do not necessarily share the same statistical structure, the conventional cumulative logit model (CLM) may not be suitable. To handle the composite ordinal response, we propose the Multi-task Cumulative Logit Model (MtCLM) with structural sparse regularization. This model is flexible enough to fit the different structures of the two tasks while capturing the structure they share. In addition, MtCLM is valid as a stochastic model in the entire predictor space, unlike another conventional and flexible model, the non-parallel cumulative logit model (NPCLM). We conduct simulation experiments and real data analysis to illustrate its prediction performance and interpretability.
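The validity remark about the NPCLM can be illustrated with a small numerical sketch. The code below is not the MtCLM and is not taken from the paper; it only contrasts a standard CLM (one shared slope) with an NPCLM (one slope per threshold) for a single predictor, using the common parameterization P(Y <= k | x) = sigmoid(a_k - b_k * x). The function name `category_probs` and all threshold and slope values are hypothetical, chosen only for illustration.

```python
import math

def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))

def category_probs(x, thresholds, slopes):
    """Category probabilities implied by cumulative logits.

    Assumes P(Y <= k | x) = sigmoid(a_k - b_k * x) for each threshold k,
    and P(Y <= last category | x) = 1. Hypothetical helper for illustration.
    """
    cum = [sigmoid(a - b * x) for a, b in zip(thresholds, slopes)] + [1.0]
    # Differences of consecutive cumulative probabilities give P(Y = k | x).
    return [cum[0]] + [cum[k] - cum[k - 1] for k in range(1, len(cum))]

labels = ["healthy", "mild", "intermediate", "severe"]
thresholds = [-1.0, 0.0, 1.0]     # illustrative values, increasing in k
clm_slopes = [1.0, 1.0, 1.0]      # CLM: the same slope at every threshold
npclm_slopes = [0.5, 1.0, 2.0]    # NPCLM: threshold-specific slopes (illustrative)

for x in (0.0, 4.0):
    clm = category_probs(x, thresholds, clm_slopes)
    npclm = category_probs(x, thresholds, npclm_slopes)
    print(f"x = {x}")
    print("  CLM:  ", dict(zip(labels, (round(p, 4) for p in clm))))
    print("  NPCLM:", dict(zip(labels, (round(p, 4) for p in npclm))))
```

With these illustrative values, the NPCLM cumulative curves are correctly ordered at x = 0 but cross by x = 4, so at least one implied category probability becomes negative there even though the probabilities still sum to one. The shared-slope CLM cannot cross in this way, which is the sense in which a model can be valid on only part of the predictor space.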