Genome-wide association reveals host-specific genomic traits in Escherichia coli [Preprint]
Escherichia coli is an opportunistic pathogen that can colonize or infect various host species. There is a significant gap in our understanding of the extent to which genetic lineages of E. coli are adapted or restricted to specific hosts, and the genomic determinants underlying such host specificity are unknown. By analyzing a randomly sampled collection of 1198 whole-genome sequenced E. coli isolates from four countries (Germany, UK, Spain, and Vietnam), obtained from five host species (human, pig, cattle, chicken, and wild boar) over 16 years, from both healthy and diseased hosts, we demonstrate that certain lineages of E. coli are frequently detected in specific hosts. Using genome-wide association studies, we report a novel nan gene cluster, designated nan-9, that is associated with the human host and putatively encodes acetylesterases and determinants of sialic acid (Sia) uptake and metabolism. In silico characterization predicts nan-9 to be involved in Sia metabolism. In vitro growth experiments with a representative Δnan E. coli mutant strain, using the sialic acids 5-N-acetyl neuraminic acid (Neu5Ac) and N-glycolyl neuraminic acid (Neu5Gc) as the sole carbon source, indicate impaired growth compared to the wild type. We also identified several further E. coli genes that are potentially associated with adaptation to human, cattle, and chicken hosts, but none for the pig host. Collectively, this study provides an extensive overview of genetic determinants that may mediate host specificity in E. coli. Our findings should inform risk analysis and epidemiological monitoring of (antimicrobial-resistant) E. coli.