Small area estimation for grouped data
This paper proposes a new model-based approach to small area estimation for grouped (frequency) data, which are often available from sample surveys. Grouped data record the frequencies of pre-specified groups in each area, for example the number of households in each income class, and thus provide more detailed insight into small areas than area-level aggregated data. A direct application of the widely used small area methods, such as the Fay-Herriot model for area-level data and the nested error regression model for unit-level data, is not appropriate, since these methods are not designed for grouped data. The proposed method assumes that, after some transformation, the unobserved unit-level quantity of interest follows a linear mixed model with random intercepts and dispersions. The probabilities that a unit belongs to each group can then be derived and used to construct the likelihood function for the grouped data given the random effects, which takes the form of a multinomial likelihood. The unknown model parameters (hyperparameters) are estimated by a newly developed Monte Carlo EM algorithm using efficient importance sampling. The empirical best predictors (empirical Bayes estimates) of small area parameters can be calculated by a simple Gibbs sampling algorithm. The numerical performance of the proposed method is illustrated through model-based and design-based simulations. In an application to city-level grouped income data from Japan, we complete the patchy maps of the Gini coefficient and of mean income across the country.
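To make the link between the unit-level model and the multinomial likelihood concrete, the following is a minimal sketch for a single area, not the authors' implementation. It assumes a log transformation and normally distributed errors, and it treats the random intercept and the area-specific standard deviation as given values. The function names, group boundaries, counts, and parameter values are all hypothetical and chosen only for illustration.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def group_probabilities(thresholds, mean, sd, transform=math.log):
    """Probability that a unit falls into each of len(thresholds) + 1 groups.

    Assumes transform(y) ~ Normal(mean, sd^2) for the unobserved unit value y,
    where `mean` already includes the area's random intercept.
    The log transform is an assumption made for this sketch.
    """
    cdf = [normal_cdf((transform(c) - mean) / sd) for c in thresholds]
    edges = [0.0] + cdf + [1.0]
    return [hi - lo for lo, hi in zip(edges[:-1], edges[1:])]


def multinomial_loglik(counts, probs):
    """Multinomial log-likelihood of observed group counts in one area."""
    n = sum(counts)
    ll = math.lgamma(n + 1)
    for y, p in zip(counts, probs):
        ll -= math.lgamma(y + 1)
        if y > 0:
            ll += y * math.log(p)
    return ll


# Hypothetical inputs for one area: three boundaries define four income classes.
thresholds = [200.0, 400.0, 700.0]
counts = [30, 45, 20, 5]

fixed_part = 5.7        # hypothetical regression term on the log scale
random_intercept = 0.1  # hypothetical area-level random intercept
sd = 0.6                # hypothetical area-specific standard deviation

probs = group_probabilities(thresholds, fixed_part + random_intercept, sd)
loglik = multinomial_loglik(counts, probs)
print(probs, loglik)
```

In a full fit, this conditional likelihood would be integrated over the distribution of the random effects, which is the step the Monte Carlo EM algorithm with importance sampling addresses.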