An Automatic Interaction Detection Hybrid Model for Bankcard Response Classification

by Yan Wang, et al.

In this paper, we propose a hybrid bankcard response model that integrates decision-tree-based chi-square automatic interaction detection (CHAID) into logistic regression. In the first stage of the hybrid model, CHAID analysis is used to detect potential variable interactions. In the second stage, these potential interactions serve as additional input variables in logistic regression. The motivation for the proposed hybrid model is that adding variable interactions may improve the performance of logistic regression. To demonstrate its effectiveness, the hybrid model is evaluated on a real credit customer response data set. The results show that, by identifying potential interactions among independent variables, the proposed hybrid approach outperforms logistic regression without an interaction search in terms of classification accuracy, the area under the receiver operating characteristic (ROC) curve, and the Kolmogorov-Smirnov (KS) statistic. Furthermore, CHAID analysis for interaction detection is much more computationally efficient than a stepwise search for interactions, and some of the identified interactions are shown to have statistically significant predictive power for the target variable. Last but not least, the customer profile created from the CHAID tree provides a reasonable interpretation of the interactions, which is required by regulations in the credit industry. Hence, this study provides an alternative for handling bankcard classification tasks.
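Two of the evaluation measures named above, the KS statistic and the area under the ROC curve, can be illustrated with a minimal sketch. This is not the authors' code: the function names `ks_statistic` and `auc` and the toy labels and scores are assumptions made for illustration, and the sketch assumes binary labels coded 0/1 with at least one example of each class.

```python
def ks_statistic(y_true, scores):
    """Largest gap between the cumulative score distributions
    of responders (label 1) and non-responders (label 0)."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    pairs = sorted(zip(scores, y_true))  # ascending by score
    cum_pos = cum_neg = 0
    best = 0.0
    i = 0
    while i < len(pairs):
        # Consume all observations tied at the same score before
        # measuring the gap, so ties do not create artificial gaps.
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1] == 1:
                cum_pos += 1
            else:
                cum_neg += 1
            j += 1
        best = max(best, abs(cum_pos / n_pos - cum_neg / n_neg))
        i = j
    return best


def auc(y_true, scores):
    """Area under the ROC curve, computed as the fraction of
    (responder, non-responder) pairs that the score ranks correctly,
    counting ties as half a correct ranking."""
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = responded to the bankcard offer.
labels = [0, 0, 1, 0, 1, 1]
model_scores = [0.1, 0.2, 0.3, 0.4, 0.8, 0.9]
ks_value = ks_statistic(labels, model_scores)
auc_value = auc(labels, model_scores)
```

The pairwise AUC loop is quadratic in the number of observations, which is fine for a small illustration; on a real response data set a rank-based computation would be the more practical choice.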



