Missing Values and the Dimensionality of Expected Returns

07/21/2022
by   Andrew Y. Chen, et al.

Combining 100+ cross-sectional predictors requires either dropping 90% of the data or imputing missing values. We compare imputation using the expectation-maximization (EM) algorithm with simple ad-hoc methods. Surprisingly, EM and ad-hoc methods lead to similar results. This similarity happens because predictors are largely independent: correlations cluster near zero, and more than 10 principal components (PCs) are required to span 50% of total variance. Independence implies that observed predictors are largely uninformative about missing predictors, making ad-hoc methods valid. In an out-of-sample PC regression test, 50 PCs are required to capture equal-weighted long-short expected returns (30 PCs value-weighted), regardless of the imputation method.
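For intuition, the sketch below contrasts the two imputation approaches the abstract compares: an ad-hoc fill with column means as the starting point, refined by an EM procedure under a multivariate normal model for the predictor panel. This is a minimal illustration under those assumptions, not the authors' implementation; the function name em_impute and all settings are hypothetical.

    import numpy as np

    def em_impute(X, n_iter=50, tol=1e-6):
        """EM imputation assuming multivariate normal predictors (illustrative sketch).
        X is an (n_obs, n_predictors) array with NaNs marking missing values."""
        X = np.asarray(X, dtype=float)
        n, p = X.shape
        miss = np.isnan(X)

        # Ad-hoc starting point: fill missing entries with column means
        mu = np.nanmean(X, axis=0)
        X_fill = np.where(miss, mu, X)
        Sigma = np.cov(X_fill, rowvar=False)

        for _ in range(n_iter):
            Sigma_add = np.zeros((p, p))  # accumulates conditional covariances for the M-step
            for i in range(n):
                m = miss[i]
                if not m.any():
                    continue
                o = ~m
                Soo = Sigma[np.ix_(o, o)]
                Smo = Sigma[np.ix_(m, o)]
                Smm = Sigma[np.ix_(m, m)]
                # E-step: conditional mean of missing given observed entries
                coef = Smo @ np.linalg.pinv(Soo)
                X_fill[i, m] = mu[m] + coef @ (X[i, o] - mu[o])
                # Conditional covariance of the missing block
                Sigma_add[np.ix_(m, m)] += Smm - coef @ Smo.T
            # M-step: update mean and covariance
            mu_new = X_fill.mean(axis=0)
            Sigma_new = np.cov(X_fill, rowvar=False, bias=True) + Sigma_add / n
            converged = np.max(np.abs(mu_new - mu)) < tol
            mu, Sigma = mu_new, Sigma_new
            if converged:
                break
        return X_fill

Mean imputation is the special case in which the conditional-mean adjustment (coef) is skipped. When predictor correlations cluster near zero, that adjustment is close to zero anyway, so EM and the ad-hoc fill produce nearly the same imputed panel, which is the mechanism the abstract highlights.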
