Addressing Detection Limits with Semiparametric Cumulative Probability Models
Detection limits (DLs), where a variable cannot be measured outside of a certain range, are common in research. Most approaches to handling DLs in the response variable implicitly make parametric assumptions about the distribution of data outside the DLs. We propose a new approach to dealing with DLs based on a widely used ordinal regression model, the cumulative probability model (CPM). The CPM is a type of semiparametric linear transformation model. CPMs are rank-based and can handle mixed distributions of continuous and discrete outcome variables. These features are key for analyzing data with DLs because, while observations inside DLs are typically continuous, those outside DLs are censored and generally put into discrete categories. With a single lower DL, the CPM treats values below the DL as having the lowest rank. When there are multiple DLs, the CPM likelihood can be modified to appropriately distribute probability mass. We demonstrate the use of CPMs with simulations and two HIV data examples. The first example models a biomarker in which 15% of observations are below a DL. The second uses multi-cohort data to model viral load, where approximately 55% of observations are outside DLs that vary across sites and over time.
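The single-lower-DL case described above can be illustrated with a minimal sketch. This is not the authors' implementation and it does not fit a CPM; it only shows the rank coding that a rank-based model would see. The function name `rank_with_lower_dl`, the use of `None` to mark a censored value, and the example numbers are all hypothetical.

```python
import math


def rank_with_lower_dl(values, dl):
    """Return dense ranks (1 = lowest) for `values`.

    Any entry that is None (reported as "below DL") or numerically below
    `dl` is placed in a single tied category with the lowest rank.
    """
    # Code censored observations with a placeholder below every observed value.
    coded = [float("-inf") if v is None or v < dl else v for v in values]
    # Map each distinct level to its position in sorted order.
    levels = sorted(set(coded))
    rank_of = {level: i + 1 for i, level in enumerate(levels)}
    return [rank_of[c] for c in coded]


# Hypothetical measurements; None marks a value reported as below the DL.
measurements = [None, 3.2, 0.7, None, 1.5, 3.2]
ranks = rank_with_lower_dl(measurements, dl=0.5)
print(ranks)  # [1, 4, 2, 1, 3, 4]

# Ranks are unchanged by a monotone transformation such as the logarithm.
logged = [None if v is None else math.log(v) for v in measurements]
log_ranks = rank_with_lower_dl(logged, dl=math.log(0.5))
print(log_ranks == ranks)  # True
```

Because only the ordering enters, the value substituted for censored observations is irrelevant as long as it lies below every observed value. That is why no parametric assumption about the distribution below the DL is needed in this case.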