Statistics for Spatially Stratified Heterogeneous Data
Spatial statistics is dominated by two families of methods: Kriging and Bayesian hierarchical models (BHM), which are based on spatial autocorrelation (SAC), and hotspot and geographical regression methods, which are based on spatial local heterogeneity. These correspond to what have been called the first and second laws of geography (Tobler 1970; Goodchild 2004), respectively. Spatial stratified heterogeneity (SSH) is the phenomenon in which observations within the strata of a partition are more similar than observations between strata; examples include climate zones, land-use classes, and remote sensing classifications. Although SSH is prevalent in geography and has been recognized since ancient Greece, it is surprisingly neglected in spatial statistics, probably because hundreds of classification algorithms already exist. In this article, we go beyond classification and show that SSH is a source of sample bias, statistic bias, model confounding, and misleading confidence intervals (CI), and we recommend robust solutions to overcome these problems. We also elaborate four benefits of SSH: identical probability density functions (PDF), or the equivalent of random sampling, within a stratum; the spatial pattern formed by the strata; the borders between strata as specific information for nonlinear causation; and general interaction revealed by overlaying two spatial patterns. We develop the equation of SSH and discuss its context. This comprehensive investigation formulates the statistics for SSH, presenting a new principle and toolbox for spatial statistics.
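To make the definition of SSH concrete, the following is a minimal sketch of one common way to quantify "more similar within strata than between strata": a variance-ratio measure, 1 minus the within-strata sum of squares divided by the total sum of squares. This is an assumption chosen for illustration and is not necessarily the equation of SSH developed in the article; the function name `stratified_heterogeneity` and its arguments are hypothetical.

```python
from collections import defaultdict


def stratified_heterogeneity(values, strata):
    """Return 1 - SSW/SST for observations grouped into strata.

    Illustrative variance-ratio measure (an assumption, not the article's
    stated equation): values near 1 mean the strata explain almost all of
    the variance; values near 0 mean the strata explain almost none.
    """
    if len(values) != len(strata):
        raise ValueError("values and strata must have the same length")
    if not values:
        raise ValueError("at least one observation is required")

    n = len(values)
    grand_mean = sum(values) / n
    # Total sum of squares around the global mean.
    sst = sum((v - grand_mean) ** 2 for v in values)
    if sst == 0:
        # All observations are identical, so there is no variance to explain.
        return 0.0

    # Group observations by their stratum label.
    groups = defaultdict(list)
    for value, label in zip(values, strata):
        groups[label].append(value)

    # Within-strata sum of squares around each stratum's own mean.
    ssw = 0.0
    for members in groups.values():
        stratum_mean = sum(members) / len(members)
        ssw += sum((v - stratum_mean) ** 2 for v in members)

    return 1.0 - ssw / sst


# Hypothetical data: two strata with clearly different levels.
q_strong = stratified_heterogeneity([1, 1, 1, 5, 5, 5], ["a", "a", "a", "b", "b", "b"])
# Hypothetical data: a partition that does not separate the values at all.
q_none = stratified_heterogeneity([1, 5, 1, 5], ["a", "a", "b", "b"])
```

In this sketch a partition such as climate zones or land-use classes would supply the `strata` labels, and a larger result would suggest stronger stratified heterogeneity of the measured variable under that partition.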