Bayesian Spatial Homogeneity Pursuit of Functional Data: an Application to the U.S. Income Distribution
An income distribution describes how an entity's total wealth is distributed among its population. In economics, the Lorenz curve is a well-known functional representation of an income distribution. Examining state-level household incomes from the American Community Survey Public Use Microdata Sample (PUMS) data motivates clustering Lorenz curves based on both their similarities and their spatial adjacencies. We propose a mixture of finite mixtures (MFM) model and a Markov random field constrained mixture of finite mixtures (MRFC-MFM) model in the context of spatial functional data analysis to capture the spatial homogeneity of Lorenz curves. We design efficient Markov chain Monte Carlo (MCMC) algorithms to simultaneously infer the posterior distributions of the number of clusters and the clustering configuration of spatial functional data. Extensive simulation studies demonstrate the effectiveness of the proposed methods. Finally, we apply the proposed algorithms to the state-level income distributions from the PUMS data. Supplementary materials for this article, including a standardized description of the materials available for reproducing the work, are available as an online supplement.
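To make the functional object concrete, the following is a minimal sketch of an empirical Lorenz curve computed from a sample of household incomes. It is an illustration only, not the authors' code: the function name `lorenz_curve` is hypothetical, incomes are assumed to be non-negative, and the survey weights that accompany PUMS records are ignored for simplicity.

```python
import numpy as np


def lorenz_curve(incomes):
    """Empirical Lorenz curve of a sample of non-negative incomes.

    Returns (p, L), where p[k] is the cumulative share of households
    (poorest first) and L[k] is the share of total income they hold.
    Hypothetical helper; assumes unweighted observations.
    """
    x = np.sort(np.asarray(incomes, dtype=float))
    if x.size == 0:
        raise ValueError("incomes must not be empty")
    if np.any(x < 0):
        raise ValueError("incomes must be non-negative")
    total = x.sum()
    if total == 0:
        raise ValueError("total income must be positive")
    # Cumulative income share, with the point (0, 0) prepended.
    L = np.concatenate(([0.0], np.cumsum(x) / total))
    # Matching cumulative population share on an even grid.
    p = np.linspace(0.0, 1.0, x.size + 1)
    return p, L


if __name__ == "__main__":
    # Toy sample: the curve lies on or below the 45-degree line.
    p, L = lorenz_curve([10_000, 25_000, 40_000, 125_000])
    for pk, Lk in zip(p, L):
        print(f"{pk:.2f} {Lk:.3f}")
```

Each state would contribute one such curve, and those curves are the functional observations that a clustering model of the kind described above would group.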