Optimal survival trees ensemble

by Naz Gul, et al.

Recent studies have selected accurate and diverse trees, based on their individual or collective performance within an ensemble, for classification and regression problems. This work follows those investigations and considers the possibility of growing a forest of optimal survival trees. Initially, a large set of survival trees is grown using the random survival forest method. The grown trees are then ranked from lowest to highest prediction error, with each tree's error estimated on its own out-of-bag observations. The top-ranked survival trees are then assessed for their collective performance as an ensemble. The ensemble is initiated with the first-ranked survival tree, and the remaining trees are tested one by one by adding them to the ensemble in order of rank. A tree is retained in the final ensemble only if adding it improves performance, as assessed on independent training data. The resulting ensemble is called an optimal survival trees ensemble (OSTE). The proposed method is assessed on 17 benchmark datasets, and the results are compared with those of random survival forest, conditional inference forest, bagging and a non-tree-based method, the Cox proportional hazards model. In addition to improving predictive performance, the proposed method reduces the number of survival trees in the ensemble compared with the other tree-based methods. The method is implemented in an R package called "OSTE".
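The rank-then-add selection described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the interface of the "OSTE" R package: the function name `select_trees`, the `top_fraction` parameter and the toy data are all hypothetical, each tree is represented simply by its predictions on the independent assessment data, and a plain squared error stands in for whatever survival-specific error measure is actually used.

```python
from typing import Callable, List, Sequence


def select_trees(
    oob_errors: Sequence[float],
    val_predictions: Sequence[Sequence[float]],
    ensemble_error: Callable[[List[float]], float],
    top_fraction: float = 0.2,
) -> List[int]:
    """Greedy forward selection of trees (hypothetical helper, for illustration)."""
    # Step 1: rank trees from lowest to highest out-of-bag error.
    ranked = sorted(range(len(oob_errors)), key=lambda i: oob_errors[i])

    # Step 2: keep only the top-ranked trees as candidates (at least one).
    n_top = max(1, int(len(ranked) * top_fraction))
    candidates = ranked[:n_top]

    def averaged(indices: List[int]) -> List[float]:
        # Ensemble prediction = mean of the member trees' predictions.
        n_obs = len(val_predictions[indices[0]])
        return [
            sum(val_predictions[i][j] for i in indices) / len(indices)
            for j in range(n_obs)
        ]

    # Step 3: start with the best-ranked tree, then try the rest in rank order.
    selected = [candidates[0]]
    best = ensemble_error(averaged(selected))
    for idx in candidates[1:]:
        trial = selected + [idx]
        err = ensemble_error(averaged(trial))
        if err < best:  # keep the tree only if the ensemble improves
            selected, best = trial, err
    return selected


# Toy data (assumed, not from the paper): four "trees", two assessment cases.
target = [1.0, 0.0]


def squared_error(pred: List[float]) -> float:
    # Stand-in error measure; a survival analysis would use a censoring-aware one.
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


toy_oob_errors = [0.3, 0.1, 0.2, 0.9]
toy_predictions = [[0.5, 0.5], [0.8, 0.2], [1.2, -0.2], [0.0, 1.0]]
chosen = select_trees(toy_oob_errors, toy_predictions, squared_error, top_fraction=1.0)
print(chosen)
```

In this toy run the two best-ranked trees err in opposite directions, so averaging them lowers the error and both are kept, while the remaining trees are rejected because adding them makes the ensemble worse. That is the mechanism by which the final ensemble ends up smaller than the initial forest.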




