Empirical Study of Overfitting in Deep FNN Prediction Models for Breast Cancer Metastasis

08/03/2022
by Chuhan Xu, et al.

Overfitting occurs when a model fits a specific data set too closely, resulting in weakened generalization and ultimately reduced accuracy in predicting future data. In this research we used an EHR dataset concerning breast cancer metastasis to study overfitting in deep feedforward neural network (FNN) prediction models. We considered 11 hyperparameters of the deep FNN models and took an empirical approach to study how each of these hyperparameters affects both prediction performance and overfitting when varied over a large range of values. We also studied how selected pairs of hyperparameters interact to influence model performance and overfitting. The 11 hyperparameters we studied are: activation function, weight initializer, number of hidden layers, learning rate, momentum, decay, dropout rate, batch size, epochs, L1, and L2. Our results show that most of the individual hyperparameters are either negatively or positively correlated with model prediction performance and overfitting. In particular, we found that overfitting overall tends to correlate negatively with learning rate, decay, batch size, and L2, but tends to correlate positively with momentum, epochs, and L1. According to our results, learning rate, decay, and batch size may have a more significant impact on both overfitting and prediction performance than most of the other hyperparameters, including L1, L2, and dropout rate, which were designed specifically to minimize overfitting. We also found some interesting interacting pairs of hyperparameters, such as learning rate and momentum, learning rate and decay, and batch size and epochs.

Keywords: deep learning, overfitting, prediction, grid search, feedforward neural networks, breast cancer metastasis.
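To make the setup concrete, here is a minimal sketch, assuming a Keras/TensorFlow implementation, of a deep FNN that exposes the 11 hyperparameters listed above. The function name build_fnn, the layer width, and all default values are illustrative assumptions, not the authors' actual configuration or grid.

```python
# A minimal sketch (not the authors' released code) of a deep FNN built
# around the 11 hyperparameters named in the abstract, using the Keras API.
import tensorflow as tf

def build_fnn(n_features,
              activation="relu",             # activation function
              initializer="glorot_uniform",  # weight initializer
              n_hidden_layers=3,             # number of hidden layers
              units=32,                      # layer width: assumed, not in the abstract
              learning_rate=0.01,
              momentum=0.9,
              decay=1e-4,                    # learning-rate decay per step
              dropout_rate=0.2,
              l1=0.0,                        # L1 penalty on weights
              l2=0.0):                       # L2 penalty on weights
    """Deep FNN for binary prediction (metastasis vs. no metastasis)."""
    model = tf.keras.Sequential()
    model.add(tf.keras.layers.Input(shape=(n_features,)))
    for _ in range(n_hidden_layers):
        model.add(tf.keras.layers.Dense(
            units,
            activation=activation,
            kernel_initializer=initializer,
            kernel_regularizer=tf.keras.regularizers.L1L2(l1=l1, l2=l2)))
        model.add(tf.keras.layers.Dropout(dropout_rate))
    model.add(tf.keras.layers.Dense(1, activation="sigmoid"))

    # One common reading of "decay": the classic Keras inverse-time schedule
    # lr_t = lr_0 / (1 + decay * t), expressed here as an explicit schedule.
    schedule = tf.keras.optimizers.schedules.InverseTimeDecay(
        initial_learning_rate=learning_rate, decay_steps=1, decay_rate=decay)
    model.compile(
        optimizer=tf.keras.optimizers.SGD(learning_rate=schedule,
                                          momentum=momentum),
        loss="binary_crossentropy",
        metrics=["AUC"])
    return model

# Batch size and epochs, the remaining two hyperparameters, enter via fit(),
# and the train/validation AUC gap is a simple overfitting signal, e.g.:
#
#   model = build_fnn(n_features=X_train.shape[1])
#   hist = model.fit(X_train, y_train, batch_size=64, epochs=100,
#                    validation_data=(X_val, y_val), verbose=0)
#   gap = hist.history["auc"][-1] - hist.history["val_auc"][-1]
#
# An interacting pair (learning rate x momentum) can be probed with a grid:
#
#   for lr in (1e-3, 1e-2, 1e-1):
#       for m in (0.0, 0.5, 0.9):
#           model = build_fnn(n_features=X_train.shape[1],
#                             learning_rate=lr, momentum=m)
```

Batch size and epochs are deliberately left to fit() rather than the builder, since that is where Keras takes them; the grid loop in the comments shows how a pairwise study like the paper's learning rate and momentum interaction could be run on top of this sketch.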


