Zero-inflated Beta distribution regression modeling
A frequent challenge with ecological data is how to interpret, analyze, or model data with a high proportion of zeros. Much attention has been given to zero-inflated count data, whereas models for non-negative continuous data with an abundance of 0s are lacking. We consider zero-inflated data on the unit interval and develop models that capture two types of 0s in the context of the Beta regression model. We model 0s due to missingness by chance through left censoring of a latent regression, and 0s due to unsuitability through an independent Bernoulli specification that creates a point mass at 0. We first develop the model as a spatial regression on environmental features and then extend it to include spatial random effects. We specify the models hierarchically, employing latent variables, fit them within a Bayesian framework, and present new model comparison tools. Our motivating dataset consists of percent cover abundance of two plant species at a collection of sites in the Cape Floristic Region of South Africa. We find that environmental features enable learning about the incidence of both types of 0s as well as the positive percent covers. We also show that the spatial random effects model improves predictive performance. The proposed modeling enables ecologists, using environmental regressors, to better understand the presence/absence of species in terms of absence due to unsuitability vs. missingness by chance, as well as abundance when present.
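The two sources of 0s described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' specification: the logit links, the mean-precision parameterization of the Beta distribution, the censoring threshold `c`, and the names `simulate_cover`, `beta`, `gamma`, and `phi` are all hypothetical choices made for illustration, and the spatial random effects are omitted.

```python
import math
import random


def logistic(t):
    """Inverse logit, mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-t))


def simulate_cover(x, beta, gamma, phi=5.0, c=0.05, rng=random):
    """Draw one cover value in [0, 1] for a site with covariates x.

    Assumed (hypothetical) structure:
      - unsuitability ~ Bernoulli(logistic(x . gamma)) gives a point mass at 0
      - otherwise a latent Beta(mu * phi, (1 - mu) * phi) value is drawn,
        with mu = logistic(x . beta), and left-censored to 0 below c
    Returns (y, kind), where kind is "unsuitable", "chance", or "positive".
    """
    # Type 1 zero: the site is unsuitable for the species.
    p_unsuitable = logistic(sum(g * xi for g, xi in zip(gamma, x)))
    if rng.random() < p_unsuitable:
        return 0.0, "unsuitable"

    # Latent abundance at a suitable site.
    mu = logistic(sum(b * xi for b, xi in zip(beta, x)))
    latent = rng.betavariate(mu * phi, (1.0 - mu) * phi)

    # Type 2 zero: suitable site, but the latent value falls below the
    # censoring threshold, so the species is recorded as absent.
    if latent < c:
        return 0.0, "chance"
    return latent, "positive"


if __name__ == "__main__":
    rng = random.Random(0)
    # Each site: an intercept plus one made-up environmental covariate.
    sites = [(1.0, rng.gauss(0.0, 1.0)) for _ in range(1000)]
    draws = [simulate_cover(x, beta=(-1.0, 0.8), gamma=(-0.5, -1.0), rng=rng)
             for x in sites]
    for kind in ("unsuitable", "chance", "positive"):
        print(kind, sum(1 for _, k in draws if k == kind))
```

In observed data the two kinds of 0 are indistinguishable; the simulation labels them only to make the mechanism visible. In a fitted model, the regressors are what allow the two sources to be separated probabilistically.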