An Automatic Interaction Detection Hybrid Model for Bankcard Response Classification

01/02/2019
by Yan Wang, et al.

In this paper, we propose a hybrid bankcard response model that integrates decision-tree-based chi-square automatic interaction detection (CHAID) into logistic regression. In the first stage of the hybrid model, CHAID analysis is used to detect potential variable interactions. In the second stage, these potential interactions serve as additional input variables in logistic regression. The motivation for the proposed hybrid model is that adding variable interactions may improve the performance of logistic regression. To demonstrate its effectiveness, the hybrid model is evaluated on a real credit customer response data set. The results show that, by identifying potential interactions among independent variables, the proposed hybrid approach outperforms logistic regression without an interaction search in terms of classification accuracy, the area under the receiver operating characteristic (ROC) curve, and the Kolmogorov-Smirnov (KS) statistic. Furthermore, CHAID analysis for interaction detection is much more computationally efficient than a stepwise search for interactions, and some of the identified interactions are shown to have statistically significant predictive power on the target variable. Last but not least, the customer profile created from the CHAID tree provides a reasonable interpretation of the interactions, which is required by regulations in the credit industry. Hence, this study provides an alternative for handling bankcard classification tasks.
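
The two-stage idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes binary predictors, a simplified two-level CHAID-style tree that picks splits by the raw Pearson chi-square statistic (real CHAID also merges categories and uses adjusted p-values), and synthetic data with a planted interaction between the first two predictors. All function names (`chi2_stat`, `detect_interactions`, `fit_logistic`, and so on) are hypothetical, and the metrics are computed in-sample for brevity.

```python
import numpy as np

def chi2_stat(x, y):
    """Pearson chi-square statistic for two binary (0/1) arrays."""
    obs = np.array([[np.sum((x == a) & (y == b)) for b in (0, 1)]
                    for a in (0, 1)], dtype=float)
    exp = obs.sum(1, keepdims=True) * obs.sum(0, keepdims=True) / obs.sum()
    return float(np.sum((obs - exp) ** 2 / exp))

def best_split(X, y, candidates):
    """Index of the candidate predictor most strongly associated with y."""
    return max(candidates, key=lambda j: chi2_stat(X[:, j], y))

def detect_interactions(X, y):
    """Stage 1 (simplified): parent/child split pairs of a two-level tree
    are treated as candidate interactions."""
    cols = list(range(X.shape[1]))
    root = best_split(X, y, cols)
    pairs = set()
    for level in (0, 1):
        mask = X[:, root] == level
        child = best_split(X[mask], y[mask], [j for j in cols if j != root])
        pairs.add((root, child))
    return sorted(pairs)

def fit_logistic(X, y, n_iter=25):
    """Stage 2: maximum-likelihood logistic regression via Newton's method."""
    A = np.column_stack([np.ones(len(X)), X])  # prepend intercept column
    beta = np.zeros(A.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-A @ beta))
        w = p * (1.0 - p)
        beta += np.linalg.solve(A.T @ (A * w[:, None]), A.T @ (y - p))
    return beta

def predict_proba(beta, X):
    A = np.column_stack([np.ones(len(X)), X])
    return 1.0 / (1.0 + np.exp(-A @ beta))

def auc(y, score):
    """Area under the ROC curve as the pairwise win rate (ties count 0.5)."""
    diff = score[y == 1][:, None] - score[y == 0][None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))

def ks_statistic(y, score):
    """Largest gap between the score CDFs of responders and non-responders."""
    t = np.unique(score)
    cdf_pos = np.searchsorted(np.sort(score[y == 1]), t, side="right") / np.sum(y == 1)
    cdf_neg = np.searchsorted(np.sort(score[y == 0]), t, side="right") / np.sum(y == 0)
    return float(np.max(np.abs(cdf_pos - cdf_neg)))

# Synthetic data (an assumption for illustration, not the paper's data set):
# the effect of x1 on the response flips sign depending on x0.
rng = np.random.default_rng(0)
n = 3000
X = rng.integers(0, 2, size=(n, 4))
logit = -0.5 + 1.0 * X[:, 0] - 1.0 * X[:, 1] + 2.0 * X[:, 0] * X[:, 1]
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(int)

pairs = detect_interactions(X, y)
X_hybrid = np.column_stack([X] + [X[:, a] * X[:, b] for a, b in pairs])

p_base = predict_proba(fit_logistic(X, y), X)
p_hybrid = predict_proba(fit_logistic(X_hybrid, y), X_hybrid)
auc_base, auc_hybrid = auc(y, p_base), auc(y, p_hybrid)
ks_base, ks_hybrid = ks_statistic(y, p_base), ks_statistic(y, p_hybrid)

print("candidate interactions:", pairs)
print("AUC  baseline / hybrid:", round(auc_base, 3), round(auc_hybrid, 3))
print("KS   baseline / hybrid:", round(ks_base, 3), round(ks_hybrid, 3))
```

Treating each parent/child split pair as a candidate interaction keeps the search to a handful of chi-square tests per tree level, rather than refitting a regression for every possible pair as a stepwise search would. In practice the candidates would still be checked for statistical significance in the fitted regression and evaluated on held-out data.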


