SNP genotyping panel design for SARS-CoV-2 variants
The trimmed SARS-CoV-2 genome sequences were downloaded from the COVID-19 Genomics UK (COG-UK) consortium website (https://www.cogconsortium.uk/data/) and from NCBI SARS-CoV-2 Resources (https://www.ncbi.nlm.nih.gov/sars-cov-2/). Genotyping panel probes were designed to target the Spike (S) protein mutations characteristic of each SARS-CoV-2 variant of concern, as reported for the PANGO lineages (https://cov-lineages.org/index.html). For the B.1.1.7 variant we selected the D614G, N501Y, H69_V70del, Y145del, A570D, P681H, T716I, S982A, and D1118H mutations; for the B.1.351 variant we selected D614G, N501Y, E484K, D215G, and A701V; for the P.1 variant we selected D614G, N501Y, E484K, K417T, D138Y, and T1027I. Sequence-specific forward and reverse primers were designed with the Custom TaqMan Assay Design Tool (Thermo Fisher Scientific) to amplify the target sequence region. Each assay includes two TaqMan minor groove binder (MGB) probes with nonfluorescent quenchers (NFQ): a VIC dye-labeled probe to detect the reference sequence and a FAM dye-labeled probe to detect the mutated sequence in the Spike (S) gene.
Criteria for SARS-CoV-2 variant definition
SARS-CoV-2 variants were characterized in two steps. The presence of the D614G and N501Y mutations served as the first filter. In the second step, the variant was assigned according to the presence of H69_V70del, Y145del, A570D, P681H, T716I, S982A, and D1118H for B.1.1.7; D215G, E484K, and A701V for B.1.351; and K417T, E484K, D138Y, and T1027I for P.1.
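The two-step rule above can be sketched in a few lines of Python. This is only an illustration: the mutation sets are taken from the text, but the function name `classify_variant` and the requirement that every marker of a variant be detected are assumptions, since the text does not state how partial marker sets were handled.

```python
# Marker sets copied from the criteria in the text.
FIRST_FILTER = {"D614G", "N501Y"}
VARIANT_MARKERS = {
    "B.1.1.7": {"H69_V70del", "Y145del", "A570D", "P681H", "T716I", "S982A", "D1118H"},
    "B.1.351": {"D215G", "E484K", "A701V"},
    "P.1": {"K417T", "E484K", "D138Y", "T1027I"},
}


def classify_variant(detected):
    """Return the variant name for a set of detected mutations, or None.

    Assumption (not stated in the text): a variant is assigned only when
    all of its second-step markers are detected.
    """
    detected = set(detected)
    # Step 1: both first-filter mutations must be present.
    if not FIRST_FILTER <= detected:
        return None
    # Step 2: find the variants whose full marker set is present.
    matches = [name for name, markers in VARIANT_MARKERS.items()
               if markers <= detected]
    # Report a variant only when the assignment is unambiguous.
    return matches[0] if len(matches) == 1 else None
```

For example, a sample positive for D614G, N501Y, D215G, E484K, and A701V would be assigned to B.1.351 under these assumptions, while a sample carrying only D614G would not pass the first filter.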
Library Preparation and Sequencing
For each sample, the target amplification reaction was set up using 10 µL of cDNA, 4.5 µL of 5X Ion AmpliSeq™ HiFi Mix, and 3.5 µL of water; this mixture was split into two tubes, and 2 µL of 5X Ion AmpliSeq™ Primer Pool 1 or Pool 2 was added to the corresponding tube. Amplification was performed in a thermal cycler with the following program: 98°C for 2 min, followed by 16 cycles of 98°C for 15 s and 60°C for 4 min. The two reactions were then combined, and 2 µL of FuPa Reagent (Thermo Fisher Scientific) was added to partially digest the primers; the mixture was then incubated in a thermal cycler with the following program: 50°C for 10 min, 55°C for 10 min, and 60°C for 20 min. Next, 2 µL of diluted Ion Xpress™ Barcode Adapters, 4 µL of Switch Solution, and 2 µL of DNA Ligase were added to ligate the adapters to the amplified products, and the samples were incubated with the following program: 22°C for 30 min, 68°C for 5 min, and 72°C for 5 min. After ligation, each DNA library was purified with magnetic beads (Agencourt™ AMPure™ XP Reagent, Beckman Coulter) and then amplified with 50 µL of Platinum™ PCR SuperMix HiFi and 2 µL of Library Amplification Primer Mix under the following conditions: 98°C for 2 min, followed by 5 cycles of 98°C for 15 s and 64°C for 1 min. The amplified libraries were purified again with magnetic beads, and the final concentration of each barcoded cDNA library was determined with a Qubit 3.0 fluorometer (Thermo Fisher Scientific, Waltham, MA, USA) following the manufacturer's recommendations. Barcoded libraries were diluted to 33 pM, pooled in equal-volume aliquots, and loaded onto the Ion Chef™ Instrument (Thermo Fisher Scientific) for emulsion PCR, enrichment, and chip loading, followed by sequencing on the Ion S5 Prime System (Thermo Fisher Scientific). Between 20 and 32 samples were pooled and sequenced on an Ion 530 Chip (Thermo Fisher Scientific).
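The dilution of each library to 33 pM follows the usual C1·V1 = C2·V2 relation. The sketch below shows that arithmetic only; the function name, the example stock concentration, and the final volume are hypothetical, and it assumes the stock concentration has already been converted to pM (the Qubit reading is a mass concentration, and the conversion depends on the mean library fragment length, which is not given here).

```python
TARGET_PM = 33.0  # target library concentration stated in the text


def dilution_volumes(stock_pm, final_volume_ul, target_pm=TARGET_PM):
    """Return (library_ul, diluent_ul) needed to reach target_pm.

    Uses C1 * V1 = C2 * V2. The stock concentration must already be
    expressed in pM (an assumption of this sketch).
    """
    if stock_pm < target_pm:
        raise ValueError("stock is already below the target concentration")
    library_ul = target_pm * final_volume_ul / stock_pm
    return library_ul, final_volume_ul - library_ul


# Hypothetical example: a 330 pM stock diluted to 33 pM in 50 µL.
library_ul, diluent_ul = dilution_volumes(330.0, 50.0)
```

With these illustrative inputs the stock is diluted tenfold, that is, 5 µL of library in 45 µL of diluent.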
Torrent Suite™ software was used for base calling, read alignment, and variant calling. Reads were aligned to the Wuhan-Hu-1 NCBI Reference Genome in Torrent Suite v. 5.10.1. The following plugins were used: Coverage Analysis (v5.10.0.3), Variant Caller (v.5.10.1.19), and COVID19 AnnotateSnpEff (v.1.0.0), a plugin developed specifically for SARS-CoV-2 that predicts the effect of a base substitution. Integrative Genomics Viewer (IGV) v.2.8.0 was used to visualize the Torrent Variant Caller (TVC) BAM files and to check the consistency of nucleotide calls (Alessandrini et al., 2020).
SARS-CoV-2 variant and mutation characterization
Raw sequence reads were aligned to the complete genome of the SARS-CoV-2 Wuhan-Hu-1 isolate (GenBank accession number: NC_045512.2) and classified using the Pangolin COVID-19 Lineage Assigner tool v2.0.7 (github.com/cov-lineages/pangolin). Mutations in the external dataset were identified by uploading the FASTA sequences to the NextClade website (https://clades.nextstrain.org/).
External Validation Dataset
All SARS-CoV-2 genome sequences were retrieved from NCBI Virus (https://www.ncbi.nlm.nih.gov/labs/virus/vssi/#/). Sequences were downloaded if they were complete genomes submitted from Italy; 82 sequences met these criteria. The FASTA file was submitted to the Pangolin tool for variant assignment and to the NextClade website for mutation calling.
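Once lineages have been assigned, the distribution of lineages across the dataset can be summarized with a short script. The sketch below is illustrative only: it assumes a Pangolin-style CSV report with `taxon` and `lineage` columns, and the three rows in the example are hypothetical placeholders, not data from this study.

```python
import csv
import io
from collections import Counter


def count_lineages(handle):
    """Count sequences per lineage in a CSV report.

    Assumes a header row containing a 'lineage' column, as in a
    Pangolin-style lineage report; rows with an empty lineage are skipped.
    """
    reader = csv.DictReader(handle)
    return Counter(row["lineage"] for row in reader if row.get("lineage"))


# Hypothetical report content, for illustration only.
example = io.StringIO(
    "taxon,lineage\n"
    "seq1,B.1.1.7\n"
    "seq2,B.1.177\n"
    "seq3,B.1.1.7\n"
)
counts = count_lineages(example)
for lineage, n in counts.most_common():
    print(lineage, n)
```

In practice the in-memory example would be replaced by the report file opened with `open(path, newline="")`.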