Model Selection for Mixture Models - Perspectives and Strategies

12/24/2018
by   Gilles Celeux, et al.
0

Determining the number G of components in a finite mixture distribution is an important and difficult inference issue. This is a most important question, because statistical inference about the resulting model is highly sensitive to the value of G. Selecting an erroneous value of G may produce a poor density estimate. This is also a most difficult question from a theoretical perspective as it relates to unidentifiability issues of the mixture model. This is further a most relevant question from a practical viewpoint since the meaning of the number of components G is strongly related to the modelling purpose of a mixture distribution. We distinguish in this chapter between selecting G as a density estimation problem in Section 2 and selecting G in a model-based clustering framework in Section 3. Both sections discuss frequentist as well as Bayesian approaches. We present here some of the Bayesian solutions to the different interpretations of picking the "right" number of components in a mixture, before concluding on the ill-posed nature of the question.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset