A whole-genome-based gene-by-gene typing system for standardised high-resolution strain typing of Bacillus anthracis
Whole-genome sequencing (WGS) has been established for bacterial subtyping and is regularly used to study pathogen transmission, to investigate outbreaks, and to perform routine surveillance. Core genome multilocus sequence typing (cgMLST) is a bacterial subtyping method that uses WGS data to provide high-resolution strain characterisation. This study aimed to develop a novel cgMLST scheme for Bacillus anthracis, a notorious pathogen that causes anthrax in livestock and humans worldwide. The scheme comprises 3,803 genes that were conserved in 57 B. anthracis genomes spanning the whole phylogeny. The scheme was evaluated and applied to 584 genomes from 50 countries. On average, 99.5% of the cgMLST targets were detected. The cgMLST results confirmed the classical canonical single-nucleotide polymorphism (SNP) grouping of B. anthracis into major clades and subclades. Genetic distances calculated from cgMLST were comparable to distances from whole-genome-based SNP analysis, with similar phylogenetic topology and comparable discriminatory power. Additionally, applying the cgMLST scheme to anthrax outbreaks from Germany and Italy led to the definition of a cut-off threshold of five allele differences to trace epidemiologically linked strains for cluster typing and transmission analysis. Finally, the association of two clusters of B. anthracis with human cases of injectional anthrax in four European countries was confirmed using cgMLST. In summary, this study presents a novel cgMLST scheme that provides high-resolution strain genotyping for B. anthracis. This scheme can be used in parallel with SNP typing methods to facilitate rapid and harmonised interlaboratory comparisons, which are essential for global surveillance and outbreak analysis. The scheme is publicly available and can be applied by users with little bioinformatics knowledge.
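To illustrate how an allele-difference cut-off can be used for cluster typing, the following is a minimal sketch, not the pipeline used in this study. It assumes each isolate is represented by a list of allele numbers, one per cgMLST target, that undetected targets are marked with `None` and ignored in pairwise comparisons, and that isolates differing by at most five alleles are joined by single linkage. The function names, the missing-data handling, the single-linkage rule, and the toy ten-locus profiles are all illustrative assumptions.

```python
from itertools import combinations

MISSING = None  # assumed marker for a cgMLST target that was not detected


def allele_distance(a, b):
    """Count loci with different allele numbers, skipping loci missing in either profile."""
    return sum(
        1
        for x, y in zip(a, b)
        if x is not MISSING and y is not MISSING and x != y
    )


def cluster(profiles, threshold=5):
    """Single-linkage clustering: join isolates whose distance is <= threshold."""
    names = list(profiles)
    parent = {n: n for n in names}

    def find(n):
        # Follow parent links to the cluster root, compressing the path on the way.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for p, q in combinations(names, 2):
        if allele_distance(profiles[p], profiles[q]) <= threshold:
            parent[find(p)] = find(q)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())


# Toy profiles with 10 loci (a real scheme would have thousands of targets).
profiles = {
    "A": [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    "B": [1, 1, 1, 1, 1, 1, 1, 2, 2, MISSING],
    "C": [2, 2, 2, 2, 2, 2, 2, 2, 2, 2],
}

print(allele_distance(profiles["A"], profiles["B"]))  # 2
print(cluster(profiles))  # [['A', 'B'], ['C']]
```

In this toy example, isolates A and B differ at two loci and fall into one cluster, whereas C differs from both by more than five alleles and remains separate. Ignoring missing loci pairwise is only one of several possible conventions, and the choice affects the resulting distances.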