Variable selection with false discovery rate control in deep neural networks

09/17/2019
by Zixuan Song, et al.

Deep neural networks (DNNs) are known for their high prediction accuracy, but also for their black-box nature and poor interpretability. We consider the problem of variable selection in DNNs, that is, selecting the input variables that have significant predictive power for the output. We propose a backward elimination procedure called SurvNet, which is based on a new measure of variable importance that applies to a wide variety of networks. More importantly, SurvNet can estimate and control the false discovery rate of the selected variables, a form of quality control that no existing method provides. Further, SurvNet adaptively determines how many variables to eliminate at each step in order to maximize selection efficiency. To study its validity, we apply SurvNet to image data and gene expression data, as well as to various simulation datasets.
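For readers unfamiliar with the quantity being controlled, the following is a minimal sketch of the false discovery proportion (FDP), the fraction of selected variables that are not truly relevant. The false discovery rate is the expected value of this proportion. The sketch is not SurvNet and does not reproduce its importance measure or its estimator. It only shows how the proportion could be computed on a simulation dataset where the truly relevant variables are known. The function name and the example indices are hypothetical.

```python
def false_discovery_proportion(selected, truly_relevant):
    """Return the fraction of selected variables that are not truly relevant."""
    selected = set(selected)
    if not selected:
        # Common convention: with no selections there are no false discoveries.
        return 0.0
    false_discoveries = selected - set(truly_relevant)
    return len(false_discoveries) / len(selected)


# Hypothetical simulation: variables 0-4 carry signal; a procedure selects six.
truly_relevant = {0, 1, 2, 3, 4}
selected = [0, 1, 2, 3, 7, 9]
fdp = false_discovery_proportion(selected, truly_relevant)
print(f"FDP = {fdp:.3f}")
```

Averaging this proportion over many independent simulated datasets gives an empirical estimate of the false discovery rate, which can then be compared with the target level of a selection procedure.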


