Discovering Effect Modification and Randomization Inference in Air Pollution Studies
Studies have shown that exposure to air pollution, even at low levels, significantly increases mortality. As regulatory actions are becoming prohibitively expensive, robust evidence is needed to guide the development of targeted interventions that reduce air pollution exposure. In this paper, we introduce a novel statistical method that splits the data into two subsamples: (a) using the first subsample, we conduct a data-driven search for de novo discovery of subgroups whose exposure effects could differ from the population mean; and then (b) using the second subsample, we quantify evidence of effect modification among the subgroups with nonparametric randomization-based tests. We also develop a sensitivity analysis method to assess the robustness of the conclusions to unmeasured confounding bias. Via simulation studies and theoretical arguments, we demonstrate that because the subgroups are discovered in the first subsample, hypothesis testing on the second subsample can focus on these subgroups only, thus substantially increasing the statistical power of the test. We apply our method to data on 1,612,414 Medicare beneficiaries in the New England region of the United States for the period 2000 to 2006. We find that seniors aged 81 to 85 with low income and seniors aged above 85 have statistically significantly higher causal effects of exposure to PM_2.5 on the 5-year mortality rate compared to the population mean.
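The two-step idea (discover on one subsample, test on the other) can be illustrated with a minimal sketch. This is not the paper's procedure: the discovery rule, the plug-in value `tau0`, the assumption of a binary exposure assigned as if completely at random, and all function names below are hypothetical simplifications chosen for illustration. The test is a Fisher-style randomization test of the sharp null that every unit in the discovered subgroup has additive effect `tau0`; because `tau0` is estimated from the data here, the p-value should be read as approximate.

```python
import numpy as np


def diff_in_means(y, z):
    """Mean outcome of exposed units (z == 1) minus that of unexposed units (z == 0)."""
    return y[z == 1].mean() - y[z == 0].mean()


def discover_subgroup(y, z, group):
    """Discovery step (hypothetical rule): return the candidate subgroup label
    whose estimated effect deviates most from the overall estimate.
    Assumes every subgroup contains both exposed and unexposed units."""
    overall = diff_in_means(y, z)
    labels = np.unique(group)
    deviations = [
        abs(diff_in_means(y[group == g], z[group == g]) - overall) for g in labels
    ]
    return labels[int(np.argmax(deviations))]


def randomization_pvalue(y, z, tau0, n_perm=999, rng=None):
    """Randomization test of the sharp null 'additive effect equals tau0 for all units'.
    Under that null, y - tau0 * z is unaffected by exposure, so exposure labels
    can be permuted to build the reference distribution."""
    rng = np.random.default_rng(rng)
    y0 = y - tau0 * z  # outcomes with the hypothesized effect removed
    observed = abs(diff_in_means(y0, z))
    exceed = 0
    for _ in range(n_perm):
        z_perm = rng.permutation(z)
        if abs(diff_in_means(y0, z_perm)) >= observed:
            exceed += 1
    return (1 + exceed) / (n_perm + 1)


def split_discover_test(y, z, group, n_perm=999, seed=0):
    """Split the sample in two, discover a subgroup on the first half,
    and test it on the second half only."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    first, second = idx[: len(y) // 2], idx[len(y) // 2:]
    g_star = discover_subgroup(y[first], z[first], group[first])
    # Plug-in for the population mean effect (an assumption of this sketch).
    tau0 = diff_in_means(y[second], z[second])
    mask = group[second] == g_star
    p = randomization_pvalue(y[second][mask], z[second][mask], tau0, n_perm, rng)
    return g_star, p


if __name__ == "__main__":
    # Simulated data: subgroup 2 has a larger exposure effect than the others.
    rng = np.random.default_rng(0)
    n = 4000
    group = rng.integers(0, 3, size=n)
    z = rng.integers(0, 2, size=n)
    y = 1.0 * z + 3.0 * z * (group == 2) + rng.normal(size=n)
    print(split_discover_test(y, z, group))
```

Because only the subgroup selected in the first half is tested in the second half, the test is not spread across every candidate subgroup, which is the source of the power gain described above.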