A Conway-Maxwell-Multinomial Distribution for Flexible Modeling of Clustered Categorical Data
Categorical data are often observed as counts resulting from a fixed number of trials in which each trial consists of making one selection from a prespecified set of categories. The multinomial distribution serves as a standard model for such clustered data but assumes that trials are independent and identically distributed. Extensions such as Dirichlet-multinomial and random-clumped multinomial can express positive association, where trials are more likely to result in a common category due to membership in a common cluster. This work considers a Conway-Maxwell-multinomial (CMM) distribution for modeling clustered categorical data exhibiting positively or negatively associated trials. The CMM distribution features a dispersion parameter which allows it to adapt to a range of association levels and includes several recognizable distributions as special cases. We explore properties of CMM, illustrate its flexible characteristics, identify a method to efficiently compute maximum likelihood (ML) estimates, present simulations of small sample properties under ML estimation, and demonstrate the model via several data analysis examples.
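The role of the dispersion parameter can be illustrated with a small sketch. The abstract does not state the probability mass function, so the code below assumes the form commonly given for CMM: the probability of a count vector x is proportional to the multinomial coefficient raised to the power nu, times the product of p_j^(x_j). The names `compositions`, `log_multinomial_coef`, and `cmm_pmf` are hypothetical, and the normalizing constant is computed by brute-force enumeration, which is only practical for small numbers of trials and categories. This is not the efficient ML computation the abstract refers to.

```python
from math import exp, lgamma, log


def compositions(m, k):
    """Yield every k-tuple of non-negative integers that sums to m."""
    if k == 1:
        yield (m,)
        return
    for first in range(m + 1):
        for rest in compositions(m - first, k - 1):
            yield (first,) + rest


def log_multinomial_coef(x):
    """Log of m! / (x_1! * ... * x_k!), where m = sum(x)."""
    return lgamma(sum(x) + 1) - sum(lgamma(xi + 1) for xi in x)


def cmm_pmf(m, p, nu):
    """Return {count vector: probability} under the ASSUMED CMM form.

    Assumption: P(X = x) is proportional to
    multinomial_coef(x) ** nu * prod(p_j ** x_j),
    with every p_j strictly positive.
    """
    log_w = {
        x: nu * log_multinomial_coef(x)
        + sum(xi * log(pi) for xi, pi in zip(x, p))
        for x in compositions(m, len(p))
    }
    shift = max(log_w.values())  # subtract the max for numerical stability
    weights = {x: exp(lw - shift) for x, lw in log_w.items()}
    total = sum(weights.values())  # brute-force normalizing constant
    return {x: w / total for x, w in weights.items()}


if __name__ == "__main__":
    p = [0.2, 0.3, 0.5]
    for nu in (0.0, 1.0, 3.0):
        pmf = cmm_pmf(3, p, nu)
        print(nu, round(pmf[(1, 1, 1)], 4), round(pmf[(0, 0, 3)], 4))
```

Under this assumed form, nu = 1 recovers the ordinary multinomial distribution. Larger values of nu shift mass toward balanced count vectors (negatively associated trials), while smaller values shift mass toward vectors concentrated in a single category (positively associated trials).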