Quadratic Discriminant Analysis under Moderate Dimension
Quadratic discriminant analysis (QDA) is a simple method for classifying a subject into one of two populations, and it is known to perform as well as the Bayes rule when the data dimension p is fixed. The main purpose of this paper is to examine the empirical and theoretical behavior of QDA when p grows proportionally to the sample sizes, without imposing any structural assumptions on the parameters. The first finding in this moderate-dimension regime is that QDA can perform as poorly as random guessing even when the two populations deviate significantly. This motivates a generalized version of QDA that automatically adapts to dimensionality. Under a finite fourth-moment condition, we derive misclassification rates for both the generalized QDA and the optimal one. A direct comparison reveals one "easy" case, where the difference between the two rates converges to zero, and one "hard" case, where the difference converges to a strictly positive constant. For the latter, a divide-and-conquer approach over dimension (rather than over samples), followed by a screening procedure, is proposed to narrow the gap. Various numerical studies are conducted to support the proposed methodology.
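As background for the abstract, the following is a minimal sketch of classical QDA in the fixed-p setting it mentions: each class is fit with its own mean and covariance, and a new point is assigned to the class maximizing the quadratic discriminant score. The dimension, sample sizes, and Gaussian parameters below are illustrative choices, not taken from the paper, and this plain plug-in rule is not the generalized, dimension-adaptive QDA the paper proposes.

```python
import numpy as np

rng = np.random.default_rng(0)
p = 5  # a fixed, small dimension; the paper studies p growing with the sample sizes

# Hypothetical two-class Gaussian training data with different covariances
n0, n1 = 200, 200
X0 = rng.multivariate_normal(np.zeros(p), np.eye(p), n0)
X1 = rng.multivariate_normal(0.8 * np.ones(p), 2.0 * np.eye(p), n1)

def qda_fit(X0, X1):
    """Plug-in estimates: per-class mean, covariance, and prior proportion."""
    n = len(X0) + len(X1)
    params = []
    for X in (X0, X1):
        mu = X.mean(axis=0)
        Sigma = np.cov(X, rowvar=False)
        params.append((mu, Sigma, len(X) / n))
    return params

def qda_predict(x, params):
    """Assign x to the class with the larger quadratic discriminant score:
    -0.5*log|Sigma_k| - 0.5*(x-mu_k)' Sigma_k^{-1} (x-mu_k) + log(pi_k)."""
    scores = []
    for mu, Sigma, prior in params:
        d = x - mu
        _, logdet = np.linalg.slogdet(Sigma)
        scores.append(-0.5 * logdet - 0.5 * d @ np.linalg.solve(Sigma, d)
                      + np.log(prior))
    return int(np.argmax(scores))

# Empirical misclassification rate on fresh test draws from the same populations
params = qda_fit(X0, X1)
Xte0 = rng.multivariate_normal(np.zeros(p), np.eye(p), 500)
Xte1 = rng.multivariate_normal(0.8 * np.ones(p), 2.0 * np.eye(p), 500)
errors = sum(qda_predict(x, params) != 0 for x in Xte0)
errors += sum(qda_predict(x, params) != 1 for x in Xte1)
rate = errors / 1000
print(f"misclassification rate: {rate:.3f}")
```

With p fixed and the populations well separated as here, the plug-in rule classifies far better than random guessing; the abstract's point is that this can break down when p grows in proportion to n0 and n1.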