Pipelines
The AMR detection pipelines used in this study are described below:
GeneFinder. Public Health England (PHE), UK [24]. URL: https://github.com/phe-bioinformatics/gene_finder. Operator: PHE. Language: Python 2.7.5. Input format: FASTQ. Algorithm: mapping (Bowtie 2.1.0). Reference database: provides three in-house reference sets in FASTA format for E. coli, Salmonella and Campylobacter. Users can incorporate their own reference set. Detection: presence or absence of sequences and mutations. It also reports insertions, deletions, mixed positions and large indels. Similarity thresholds (between sample DNA and reference DNA) can be set individually for each gene. Quality metrics: coverage, similarity, depth and coverage distribution.
APHA SeqFinder/ABRicate. Animal and Plant Health Agency (APHA), UK [25]. URL: https://github.com/APHA-AMR-VIR/APHASeqFinder. Operator: APHA. Language: Python 3. Input format: FASTQ. Algorithm: mapping (SMALT 0.7.6). Reference database: provides three in-house reference sets in FASTA format for AMR genes, mutations and entero plasmids, virulence factors and heavy metal resistance. Detection: presence or absence of sequences and mutations. Quality metrics: coverage, similarity, depth and depth normalised by MLST genes. ABRicate [19] is used in conjunction with SeqFinder as an additional filter. URL: https://github.com/tseemann/abricate. Language: Perl. Input format: FASTA assembled contigs (SPAdes 3.13.1). Algorithm: BLAST 2.7.0 or higher. Reference database: the same references as used for APHA SeqFinder (ABRicate also provides additional databases: NCBI, CARD, ARG-ANNOT, ResFinder, MEGARES, EcOH, PlasmidFinder, Ecoli_VF and VFDB). Detection: presence or absence of genes. Quality metrics: coverage and similarity.
BLAST. Wageningen Bioveterinary Research (WBVR), The Netherlands. Pipeline not published at the time of this study. Operator: WBVR. Input format: FASTA assembled contigs. Algorithm: raw reads are error-corrected with Tadpole from the BBduk suite v38.71 and quality-trimmed to Q20 with BBduk. Genomes are assembled using SPAdes 3.13.1. Assemblies are compared to the reference database using BLAST (with filters: 98% sequence identity and 97% gene coverage). Reference database: ResFinder database. Detection: presence or absence of sequences and mutations. Quality metrics: sequence identity and gene coverage provided by BLAST.
ResFinder + PointFinder [26,27]. Technical University of Denmark. URL: https://bitbucket.org/genomicepidemiology/resfinder/src/master/. Operator: The Norwegian Veterinary Institute (NVI). Language: Python 3. Input format: FASTQ or FASTA assembled contigs. Algorithm: BLAST is used to analyse assemblies (FASTA files); the mapper KMA is used to analyse read data (FASTQ files). Reference database: ResFinder database and PointFinder database (they cover several bacteria). Detection: presence or absence of sequences and mutations. Quality metrics:
ARIBA [28]. Sanger Institute, UK. URL: https://github.com/sanger-pathogens/ariba. Operator: Universidad Complutense de Madrid (UCM). Language: Python 3. Input format: FASTQ. Algorithm: mapping (Bowtie 2.1.0). Reference database: does not provide its own reference database but has an integrated method to download and standardise one from different sources such as the Comprehensive Antibiotic Resistance Database (CARD), ResFinder, ARG-ANNOT, MEGARes, NCBI, PlasmidFinder, VFDB, SRST2 and VirulenceFinder. Users can incorporate their own reference set. Detection: presence or absence of sequences only (although an external reference database can be incorporated for detecting mutations, there is currently no integrated and standardised database of mutations responsible for AMR). It also reports gene fragmentations, interruptions and duplications. Quality metrics: gene coverage and sequence identity.
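The identity and coverage filters quoted above for the WBVR BLAST pipeline (98% sequence identity and 97% gene coverage) can be illustrated with a minimal sketch. This is not the WBVR pipeline, which was unpublished at the time of the study. The `Hit` record, its field names, the definition of coverage as aligned length over reference gene length, and the inclusive thresholds are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Hit:
    """Hypothetical summary of one BLAST alignment against a reference gene."""
    gene: str             # name of the reference AMR gene
    identity: float       # percent identity of the alignment
    aligned_length: int   # number of reference bases covered by the alignment
    gene_length: int      # full length of the reference gene


def gene_coverage(hit: Hit) -> float:
    """Percent of the reference gene covered by the alignment (assumed definition)."""
    return 100.0 * hit.aligned_length / hit.gene_length


def passes_filters(hit: Hit, min_identity: float = 98.0,
                   min_coverage: float = 97.0) -> bool:
    """Keep a hit only if it meets both thresholds (inclusive bounds are an assumption)."""
    return hit.identity >= min_identity and gene_coverage(hit) >= min_coverage


def detected_genes(hits: Iterable[Hit]) -> List[str]:
    """Return the sorted, de-duplicated names of genes with at least one passing hit."""
    return sorted({h.gene for h in hits if passes_filters(h)})


# Made-up example hits, not data from the study.
example_hits = [
    Hit("blaTEM-1", 99.5, 861, 861),   # passes both filters
    Hit("tet(A)", 97.0, 1200, 1200),   # fails the identity filter
    Hit("sul1", 100.0, 800, 840),      # fails the coverage filter (about 95.2%)
]
print(detected_genes(example_hits))
```

Applying both thresholds to each hit before collapsing to gene names gives a presence or absence call per gene, which matches the kind of output the pipelines report.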
The raw reads for isolates from this dataset are available in the NCBI nucleotide archive under project number PRJNA805266.
The antimicrobials
The evaluation was performed on the 14 antimicrobials used for AMR monitoring by the European Food Safety Authority (EFSA) [4/115]: Ampicillin, Azithromycin, Cefotaxime, Ceftazidime, Chloramphenicol, Ciprofloxacin, Colistin, Gentamicin, Meropenem, Nalidixic Acid, Sulfamethoxazole, Tetracycline, Tigecycline and Trimethoprim. The susceptibility of wild type E. coli to the panel of 14 antimicrobials was tested, and isolates were categorised as susceptible or resistant following the ECOFF values for the MIC described by EUCAST [23]. A bacterial isolate was defined as susceptible when it was inhibited at a concentration of a specific antimicrobial equal to or lower than the established cut-off value, and as resistant when it was not inhibited at a concentration of a specific antimicrobial higher than the established cut-off value.
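The categorisation rule above can be written as a short sketch: an isolate whose MIC is at or below the cut-off is called susceptible, and one whose MIC is above it is called resistant. The function name and the cut-off values in the example are hypothetical placeholders, not EUCAST ECOFF values, and returning `None` for a missing MIC is an assumption about how unavailable measurements might be handled.

```python
from typing import Optional


def classify_by_ecoff(mic: Optional[float], ecoff: float) -> Optional[str]:
    """Classify one isolate-antimicrobial pair from its MIC and the cut-off.

    Returns "susceptible" when MIC <= cut-off, "resistant" when MIC > cut-off,
    and None when no MIC value is available (assumed handling of missing data).
    """
    if mic is None:
        return None
    return "susceptible" if mic <= ecoff else "resistant"


# Placeholder cut-offs in mg/L for illustration only; real values must be
# taken from EUCAST.
illustrative_cutoffs = {"drug_A": 8.0, "drug_B": 0.25}

# Made-up MIC measurements for one isolate.
mics = {"drug_A": 16.0, "drug_B": 0.25}

calls = {drug: classify_by_ecoff(mics.get(drug), cutoff)
         for drug, cutoff in illustrative_cutoffs.items()}
print(calls)
```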
The isolates
A total of 438 E. coli isolates were provided by 9 European institutes: the French Agency for Food, Environmental and Occupational Health and Safety, Lyon, France (49 isolates); Universidad Complutense de Madrid, Spain (50 isolates); Institut Pasteur, France (52 isolates); the German Federal Institute for Risk Assessment, Germany (50 isolates); the Norwegian Veterinary Institute, Norway (50 isolates); Wageningen Bioveterinary Research, The Netherlands (50 isolates); the University of Surrey, United Kingdom (50 isolates); the Animal and Plant Health Agency, United Kingdom (37 isolates); and Public Health England, United Kingdom (51 isolates).
Data analysis
The institutes provided the WGS data and the MIC values established using the EFSA protocol. Table S1 summarises the number of isolates per institute, together with the number (and percentage) of isolates presenting resistance (res) and susceptibility (sen) to each of the 14 antimicrobials. For some institute-antimicrobial combinations, MIC values were not available for some or all of the isolates; these are marked with an asterisk. Most of these cases are human samples (PHE), for which some antimicrobials not affecting humans are not routinely screened (azithromycin, chloramphenicol, sulfamethoxazole, tetracycline and trimethoprim). Note that for some antibiotics, institutes were unable to provide isolates showing resistance.
Analysis of all 438 isolates was performed by five independent operators (one per pipeline). Result tables from the pipeline runs were sent to each of the nine donor institutes, which extracted the results for their own isolates for all five pipelines. The information was recorded on a standardised form composed of 14 sub-tables, one for each of the 14 antimicrobials. Each sub-table reported, for each isolate (rows), the gold standard classification (resistance or sensitivity using the ECOFF interpretation of the MIC values) and the AMR elements (genes or chromosomal mutations) detected.
For this study, values in the tables were interpreted as ones or zeroes depending on whether any AMR genotype was detected for each isolate-antimicrobial cell. These values were then compared to the gold standard (ECOFF interpretation of the MIC values) to calculate the sensitivity and specificity and their 95% confidence intervals (shown in Table 2).
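The binarisation and comparison described above can be sketched as follows. The text does not state which method was used for the 95% confidence intervals, so the Wilson score interval used here is an assumption, as are all function names and the example counts.

```python
import math
from typing import Iterable, List, Optional, Tuple


def binarise(detected_elements: List[str]) -> int:
    """1 if any AMR gene or mutation was reported for the cell, else 0."""
    return 1 if detected_elements else 0


def wilson_interval(successes: int, total: int,
                    z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a proportion (assumed CI method; z=1.96 for 95%)."""
    p = successes / total
    denom = 1.0 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half


def sensitivity_specificity(
    pairs: Iterable[Tuple[bool, int]]
) -> Tuple[Optional[float], Optional[float]]:
    """Compare genotype calls (0/1) with the gold standard (True = resistant).

    Returns (sensitivity, specificity); a value is None when it is undefined,
    for example when no resistant isolates are available for an antimicrobial.
    """
    tp = fn = tn = fp = 0
    for resistant, call in pairs:
        if resistant:
            tp += call
            fn += 1 - call
        else:
            fp += call
            tn += 1 - call
    sens = tp / (tp + fn) if (tp + fn) else None
    spec = tn / (tn + fp) if (tn + fp) else None
    return sens, spec


# Made-up example: 10 resistant isolates (8 with a genotype detected) and
# 20 susceptible isolates (2 with a genotype detected).
example = ([(True, 1)] * 8 + [(True, 0)] * 2 +
           [(False, 0)] * 18 + [(False, 1)] * 2)
sens, spec = sensitivity_specificity(example)
sens_ci = wilson_interval(8, 10)
print(sens, spec, sens_ci)
```

Returning `None` rather than zero for an undefined sensitivity keeps antimicrobials with no resistant isolates distinguishable from those where every resistant isolate was missed.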