Predicting Survival Outcomes in the Presence of Unlabeled Data

by Fateme Nateghi Haredasht, et al.

Many clinical studies require the follow-up of patients over time. This is challenging: apart from frequently observed drop-out, there are often organizational and financial obstacles, which can reduce data collection and, in turn, complicate subsequent analyses. In contrast, plenty of baseline data is often available for patients with similar characteristics and background information, e.g., patients who fall outside the study time window. In this article, we investigate whether including such unlabeled data instances helps to predict survival times accurately. In other words, we introduce a third level of supervision in the context of survival analysis: apart from fully observed and censored instances, we also include unlabeled instances. We propose three approaches to deal with this novel setting and provide an empirical comparison over fifteen real-life clinical and gene expression survival datasets. Our results demonstrate that all approaches are able to improve predictive performance on independent test data. We also show that integrating the partial supervision provided by censored data in a semi-supervised wrapper approach generally provides the best results, often achieving substantial improvements compared to not using unlabeled data.
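To make the three levels of supervision concrete, the following is a minimal sketch of how fully observed, censored, and unlabeled instances could be represented, together with a generic self-training wrapper around a base learner. It is an illustration under stated assumptions and not the authors' method: the names `Instance`, `supervision_level`, `fit_nearest_neighbour`, and `self_training` are hypothetical, and the base learner is a toy nearest-neighbour predictor that ignores censored instances, whereas a real survival model would exploit them.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Instance:
    x: Sequence[float]            # baseline covariates
    time: Optional[float] = None  # follow-up time; None if unlabeled
    event: Optional[bool] = None  # True = event observed, False = censored


def supervision_level(inst: Instance) -> str:
    """Return 'observed', 'censored', or 'unlabeled' for an instance."""
    if inst.time is None:
        return "unlabeled"
    return "observed" if inst.event else "censored"


def fit_nearest_neighbour(train):
    """Toy base learner (assumption): predict the time of the closest
    fully observed instance. Censored instances are ignored here."""
    observed = [i for i in train if supervision_level(i) == "observed"]

    def predict(x):
        def dist(i):
            return sum((a - b) ** 2 for a, b in zip(i.x, x))
        nearest = min(observed, key=dist)
        # Squared distance doubles as a crude confidence score (lower = better).
        return nearest.time, dist(nearest)

    return predict


def self_training(instances, rounds=3, batch=1):
    """Generic self-training wrapper: repeatedly fit the base learner,
    pseudo-label the most confident unlabeled instances, and refit."""
    labeled = [i for i in instances if supervision_level(i) != "unlabeled"]
    pool = [i for i in instances if supervision_level(i) == "unlabeled"]
    for _ in range(rounds):
        if not pool:
            break
        predict = fit_nearest_neighbour(labeled)
        ranked = sorted(pool, key=lambda i: predict(i.x)[1])
        for inst in ranked[:batch]:
            pseudo_time, _ = predict(inst.x)
            labeled.append(Instance(inst.x, pseudo_time, True))
            pool = [p for p in pool if p is not inst]
    return fit_nearest_neighbour(labeled), labeled


# Hypothetical toy data: two observed, one censored, two unlabeled patients.
data = [
    Instance([0.0], 10.0, True),
    Instance([10.0], 2.0, True),
    Instance([5.0], 4.0, False),
    Instance([1.0]),
    Instance([9.0]),
]
model, augmented = self_training(data, rounds=2, batch=1)
```

The wrapper is agnostic to the base learner, so the toy predictor could in principle be swapped for any survival model that returns a prediction and a confidence score.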




