Model diagnostics of discrete data regression: a unifying framework using functional residuals
Model diagnostics is an indispensable component of regression analysis, yet it is not well addressed in standard textbooks on generalized linear models. The lack of exposition is attributed to the fact that when outcome data are discrete, classical methods (e.g., Pearson/deviance residual analysis and goodness-of-fit tests) have limited utility in model diagnostics and treatment. This paper establishes a novel framework for model diagnostics of discrete data regression. Unlike the literature defining a single-valued quantity as the residual, we propose to use a function as a vehicle to retain the residual information. In the presence of discreteness, we show that such a functional residual is appropriate for summarizing the residual randomness that cannot be captured by the structural part of the model. We establish its theoretical properties, which leads to the innovation of new diagnostic tools including the functional-residual-vs covariate plot and Function-to-Function (Fn-Fn) plot. Our numerical studies demonstrate that the use of these tools can reveal a variety of model misspecifications, such as not properly including a higher-order term, an explanatory variable, an interaction effect, a dispersion parameter, or a zero-inflation component. The functional residual yields, as a byproduct, Liu-Zhang's surrogate residual mainly developed for cumulative link models for ordinal data (Liu and Zhang, 2018, JASA). As a general notion, it considerably broadens the diagnostic scope as it applies to virtually all parametric models for binary, ordinal and count data, all in a unified diagnostic scheme.
READ FULL TEXT