Background: Nucleic acid amplification is the main method used to detect infections of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). However, the false-negative rate of nucleic acid tests cannot be ignored.
Methods: We characterized genomic variations in the target sequences of these tests, and the geographical distribution of those variations across countries, by analyzing whole-genome sequencing data of SARS-CoV-2 strains from the 2019 Novel Coronavirus Resource (2019nCoVR) database.
Results: Across the 21 primer pairs targeting the ORF1ab, S, E, and N regions, the primer and probe target sequences totaled 938 bp and contained 131 (13.97%) variant loci, found in 2415 (38.96%) isolates. Primer targets in the N region contained the most variations, distributed among the most isolates, and those in the E region contained the fewest. Single nucleotide polymorphisms were the most frequent type of variation, with C to T transitions detected at the most variant loci. G to A transitions and G to C transversions were the most common and had the highest isolate density. Variations at the three mutation sites N: 28881, N: 28882, and N: 28883 were the most commonly detected, occurring in 608 SARS-CoV-2 strains from 33 countries, especially the United Kingdom, Portugal, and Belgium.
Conclusions: Our work comprehensively analyzed genomic variations in the target sequences of nucleic acid amplification tests, offering evidence for optimizing primer and probe target sequence selection and thereby improving the performance of SARS-CoV-2 diagnostic tests.
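
Illustrative sketch: the tallies reported in the Results (variant loci as a share of target length, isolates carrying a variant, and transition versus transversion counts) could be computed roughly as below. This is a minimal sketch and not the authors' pipeline. The function names, the `(isolate_id, position, ref, alt)` record layout, and the toy coordinates are all hypothetical, and the target regions are assumed to be non-overlapping, 1-based, inclusive intervals.

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref, alt):
    """Label a single-nucleotide change as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    bases = PURINES | PYRIMIDINES
    if ref == alt or ref not in bases or alt not in bases:
        raise ValueError(f"not a single-nucleotide substitution: {ref}>{alt}")
    # Transition: purine<->purine or pyrimidine<->pyrimidine.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def summarize_variants(variants, target_regions):
    """Tally variants that fall inside primer/probe target regions.

    variants: iterable of (isolate_id, position, ref, alt) tuples (hypothetical layout).
    target_regions: list of (start, end) intervals, assumed non-overlapping.
    """
    target_length = sum(end - start + 1 for start, end in target_regions)
    loci, isolates, kinds = set(), set(), Counter()
    for isolate_id, pos, ref, alt in variants:
        if not any(start <= pos <= end for start, end in target_regions):
            continue  # variant lies outside every target sequence
        loci.add(pos)
        isolates.add(isolate_id)
        kinds[f"{ref}>{alt} {classify_substitution(ref, alt)}"] += 1
    return {
        "target_length": target_length,
        "variant_loci": len(loci),
        "variant_loci_pct": round(100 * len(loci) / target_length, 2),
        "isolates_with_variants": len(isolates),
        "substitutions": dict(kinds),
    }


# Toy data only: made-up coordinates and isolates, not values from the study.
toy_regions = [(100, 119), (200, 219)]
toy_variants = [
    ("iso1", 105, "C", "T"),
    ("iso2", 105, "C", "T"),
    ("iso2", 210, "G", "A"),
    ("iso3", 500, "G", "C"),  # outside the target regions, so ignored
]
summary = summarize_variants(toy_variants, toy_regions)
```

The same arithmetic gives the reported share of variant loci: `round(100 * 131 / 938, 2)` evaluates to `13.97`.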