Adaptive preferential sampling in phylodynamics

09/04/2020
by Lorenzo Cappello, et al.

Longitudinal molecular data of rapidly evolving viruses and pathogens provide information about disease spread and complement traditional surveillance approaches based on case count data. The coalescent is used to model the genealogy that represents the ancestral relationships of the sample. The basic assumption is that coalescent events occur at a rate inversely proportional to the effective population size N_e(t), a time-varying measure of genetic diversity. When the sampling process (collection of samples over time) depends on N_e(t), the coalescent and the sampling processes can be jointly modeled to improve estimation of N_e(t). Failing to do so can lead to bias due to model misspecification. However, the way that the sampling process depends on the effective population size may vary over time. We introduce an approach where the sampling process is modeled as an inhomogeneous Poisson process with rate equal to the product of N_e(t) and a time-varying coefficient, making minimal assumptions on their functional shapes via Markov random field priors. We provide scalable algorithms for inference, compare the model's performance with alternative methods in a simulation study, and apply our model to SARS-CoV-2 sequences from Los Angeles and Santa Clara counties. The methodology is implemented and available in the R package adapref.
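
To make the sampling model concrete, the following is a minimal sketch of how sampling times could be simulated from an inhomogeneous Poisson process whose rate is the product of N_e(t) and a time-varying coefficient. It uses the standard thinning (Lewis-Shedler) algorithm. The trajectory `effective_pop_size`, the coefficient `sampling_coefficient`, and all numeric values are hypothetical choices for illustration only; they are not taken from the paper, and this is not the interface of the adapref package.

```python
import math
import random


def effective_pop_size(t):
    """Hypothetical N_e(t): an oscillating trajectory, for illustration only."""
    return 50.0 + 30.0 * math.sin(t / 5.0)


def sampling_coefficient(t):
    """Hypothetical time-varying coefficient: sampling effort changes at t = 20."""
    return 0.02 if t < 20.0 else 0.05


def sampling_rate(t):
    """Sampling intensity: the coefficient multiplied by N_e(t)."""
    return sampling_coefficient(t) * effective_pop_size(t)


def simulate_sampling_times(rate, t_end, rate_upper_bound, rng):
    """Draw event times on (0, t_end] by thinning a homogeneous Poisson process.

    rate_upper_bound must satisfy rate(t) <= rate_upper_bound on the interval.
    """
    times = []
    t = 0.0
    while True:
        # Candidate event from a homogeneous process with the bounding rate.
        t += rng.expovariate(rate_upper_bound)
        if t > t_end:
            return times
        # Keep the candidate with probability rate(t) / rate_upper_bound.
        if rng.random() * rate_upper_bound <= rate(t):
            times.append(t)


rng = random.Random(1)
upper = 0.05 * 80.0  # assumed bound: largest coefficient times largest N_e(t)
times = simulate_sampling_times(sampling_rate, 40.0, upper, rng)
print(len(times), "sampling times simulated")
```

In this sketch the same N_e(t) yields more samples per unit time after t = 20 only because the coefficient increases. A model that holds the coefficient fixed would attribute that change to N_e(t), which is the kind of misspecification the abstract describes.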


