Consistently Estimating Markov Chains with Noisy Aggregate Data
We address the problem of estimating the parameters of a time-homogeneous Markov chain given only noisy, aggregate data. This arises when a population of individuals behave independently according to a Markov chain, but individual sample paths cannot be observed due to limitations of the observation process or the need to protect privacy. Instead, only population-level counts of the number of individuals in each state at each time step are available. When these counts are exact, a conditional least squares (CLS) estimator is known to be consistent and asymptotically normal. We initiate the study of method of moments estimators for this problem to handle the more realistic case when observations are additionally corrupted by noise. We show that CLS can be interpreted as a simple "plug-in" method of moments estimator. However, when observations are noisy, it is not consistent because it fails to account for additional variance introduced by the noise. We develop a new, simpler method of moments estimator that bypasses this problem and is consistent under noisy observations.
READ FULL TEXT