A modelling framework for the prediction of the herd-level probability of infection from longitudinal data
The collective control programmes (CPs) that exist for many infectious diseases of farm animals rely on the application of diagnostic testing at regular time intervals for the identification of infected animals or herds. The diversity of these CPs complicates the trade of animals between regions or countries because the definition of freedom from infection differs from one CP to another. In this paper, we describe a statistical model for the prediction of herd-level probabilities of infection from longitudinal data collected as part of CPs against infectious diseases of cattle. The model was applied to data collected as part of a CP against bovine viral diarrhoea virus (BVDV) infection in Loire-Atlantique, France. The model represents infection as a latent herd status with monthly dynamics. This latent status determines test results through test sensitivity and test specificity. The probability of becoming status positive between consecutive months is modelled as a function of risk factors (when available) using logistic regression. Modelling is performed in a Bayesian framework, using either Stan or JAGS. Prior distributions need to be provided for the sensitivities and specificities of the different tests used, for the probability of remaining status positive between months, and for the probability of becoming status positive between months. When risk factors are available, prior distributions for the coefficients of the logistic regression replace the prior for the probability of becoming status positive. From these prior distributions and from the longitudinal data, the model returns, for every herd, a posterior distribution for the probability of being status positive in the current month. Data from the previous months are used for parameter estimation. The impact of using different prior distributions and model implementations on parameter estimation was evaluated.
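To make the structure of the model concrete, the following is a minimal Python sketch of the monthly latent-status recursion described above, written for a single herd and a single test. It is an illustration under simplifying assumptions, not the implementation used in the package: all parameters (`tau1`, the probability of becoming status positive; `tau2`, the probability of remaining status positive; `se` and `sp`, test sensitivity and specificity) are treated as fixed point values, whereas the model itself places prior distributions on them and estimates them in Stan or JAGS. The function and parameter names are hypothetical.

```python
import math


def inv_logit(x):
    """Inverse logit, mapping a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def prob_becoming_positive(intercept, coefs, risk_factors):
    """Logistic regression for the probability of becoming status positive.

    `coefs` and `risk_factors` are sequences of equal length (illustrative names).
    """
    linear = intercept + sum(b * x for b, x in zip(coefs, risk_factors))
    return inv_logit(linear)


def predict_next_month(p_prev, tau1, tau2):
    """Probability of being status positive next month, before any test result.

    A positive herd stays positive with probability tau2;
    a negative herd becomes positive with probability tau1.
    """
    return p_prev * tau2 + (1.0 - p_prev) * tau1


def update_with_test(p_pred, test_positive, se, sp):
    """Bayes update of the status probability given one test result.

    Assumes se, sp and p_pred lie strictly between 0 and 1,
    so that the denominator cannot be zero.
    """
    if test_positive:
        lik_pos, lik_neg = se, 1.0 - sp
    else:
        lik_pos, lik_neg = 1.0 - se, sp
    numerator = p_pred * lik_pos
    return numerator / (numerator + (1.0 - p_pred) * lik_neg)


def filter_herd(p0, monthly_results, tau1, tau2, se, sp):
    """Run the recursion over a sequence of months.

    `monthly_results` holds True, False, or None (no test that month).
    Returns the probability of being status positive in the last month.
    """
    p = p0
    for result in monthly_results:
        p = predict_next_month(p, tau1, tau2)
        if result is not None:
            p = update_with_test(p, result, se, sp)
    return p


if __name__ == "__main__":
    # Illustrative values only; they are not estimates from the BVDV data.
    history = [False, None, None, True, None, False]
    print(filter_herd(0.1, history, tau1=0.05, tau2=0.9, se=0.95, sp=0.98))
```

Months without a test simply propagate the status probability through the dynamics, which is how a recursion of this kind can accommodate irregular testing frequencies.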
The main advantage of this model is its ability to predict the probability of being status positive in a given month from inputs that can vary in the nature of the tests used, the frequency of testing and the availability of risk factors. The main challenge in applying the model to the BVDV CP data was in identifying prior distributions, especially for test characteristics, that corresponded to the latent status of interest, i.e. herds with at least one persistently infected (PI) animal. The model is available on GitHub as an R package (https://github.com/AurMad/STOCfree) and can be used to carry out output-based evaluation of disease CPs.