A total of 47 SARS-CoV-2 positive samples with real-time PCR Ct values ranging from 14 to 30 were sequenced with three different protocols for SARS-CoV-2 sequencing, involving enrichment by bait hybridization or tiled amplicons. A schematic overview including library preparation, target enrichment, and sequencing is shown in Fig. 1. All 47 samples were run with Twist Bioscience bait hybridization enrichment, henceforth referred to as Twist, and ARTIC V3 tiled amplicon enrichment, henceforth referred to as ARTIC. For Midnight tiled amplicon enrichment, henceforth referred to as Midnight, only 24 of the original 47 samples were run because the remaining RNA samples had been sent to external laboratories for national surveillance analyses or had been discarded. Library construction for Twist took an estimated 24–38 h, depending on the selected hybridization time (2–16 h). Library construction took approximately 9 h for ARTIC and approximately 7 h for Midnight. Midnight had the shortest sequencing time, with runs set to 12 h on the ONT GridION. Sequencing time on the Illumina MiSeq was 21 h for Twist and 39 h for ARTIC. Data output was adjustable and depended on the chosen sequencing platform and kit.
Comparison of mean depth and uniformity
Sequencing metrics across methods are shown in Supplementary Table 1. Twist samples had the highest total reads, with a mean of 6,630,000 reads and median depth of 1,313x (range: 4–60,104x). ARTIC samples had a mean of 745,000 total reads and a median depth of 2,724x (range: 381–4,611x). Midnight samples had a mean of 160,000 total reads and median depth of 930x (range: 12–3,322x). The mean sequencing coverage depth of the SARS-CoV-2 genome, downsampled to 1000x coverage for each method with raw medians, is shown in Fig. 2. Evenness was measured with the fold-80 base penalty, a uniformity metric whose ideal value is 1. Using Twist, positions 11288–11296 and 21766–21770 had the lowest read depth coverage (23–45x and 42–287x, respectively). Twist had a fold-80 base penalty median of 1.47 (range: 1.22–2.4). ARTIC resulted in relatively low coverage in the open reading frame encoding the spike protein at positions 22339–22523 (coverage < 23x); this might have been due to primer specificity and uneven amplification across amplicons. ARTIC had a fold-80 base penalty median of 1.81 (range: 1.49–4.66). Peaks with high coverage were due to overlapping primer regions or improved amplicon genome recovery, as seen around 17 kbp with ARTIC. Midnight exhibited relatively even coverage across all 29 amplicons, with no obvious dropouts, and had a fold-80 base penalty median of 2.08 (range: 1.6–14.1).
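To make the uniformity metric concrete, the following is a minimal Python sketch of one common way to compute a fold-80 base penalty: the mean depth divided by the depth at the 20th percentile, so that a perfectly even profile gives 1. The function name `fold80_base_penalty`, the nearest-rank percentile rule, and the toy depth values are illustrative assumptions, not the pipeline or data used in this study.

```python
from statistics import mean


def fold80_base_penalty(depths):
    """Mean depth divided by the 20th-percentile depth (assumed definition).

    A value of 1 means every position is covered equally; larger values
    mean more over-sequencing is needed to lift 80% of positions to the mean.
    """
    if not depths:
        raise ValueError("depths must not be empty")
    ordered = sorted(depths)
    # Nearest-rank 20th percentile: ceil(n / 5) using integer arithmetic.
    rank = max(1, (len(ordered) + 4) // 5)
    p20 = ordered[rank - 1]
    if p20 == 0:
        return float("inf")  # more than 20% of positions have no coverage
    return mean(ordered) / p20


# Hypothetical per-position depths, for illustration only.
even_profile = [100] * 10
uneven_profile = [10, 10] + [100] * 8

print(fold80_base_penalty(even_profile))    # perfectly even coverage
print(fold80_base_penalty(uneven_profile))  # two low-coverage positions
```

Under this definition, a few low-coverage positions (such as an amplicon dropout) raise the penalty sharply, which is why the upper end of the range is more sensitive to dropouts than the median.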
Comparison of viral load and genome coverage
The relations between reads mapped to SARS-CoV-2 and Ct values are shown in Fig. 3 and Supplementary Fig. 1. The first 10 samples in batches A–B of Twist were combined regardless of Ct values (Supplementary Fig. 1), leading to samples with high viral load (Ct < 18) having an excessively high number of total reads (> 4.5 M). The rest of the Twist samples were sequenced together with samples of similar Ct range, which gave slightly less variation in the number of total reads and an even fraction distribution with a tendency toward a decreased fraction of SARS-CoV-2 at higher Ct values. In addition, 2/10 samples in batches A–B were unquantifiable, compared with 33/37 for batches C–G. No batch effect was seen in the ARTIC or Midnight samples (Supplementary Fig. 1), where the majority had a fraction > 0.75. The 24 Midnight samples were either from the same extraction used for the other methods (original extraction), from a dilution of the original extraction, or from a new extraction, as indicated in Supplementary Fig. 1.
Intra-individual performance
Sequencing results revealed differences in the fraction of SARS-CoV-2 specific reads. Twist had a median of 30% (range: ≤0.001–99.7%), ARTIC had a median of 99.6% (range: 54.6–99.9%), and Midnight had a median of 94% (range: 7.8–96.8%); these differences were related to viral load (Fig. 3). The fractions of SARS-CoV-2 reads found with different methods within the same patient are shown in Fig. 3. ARTIC had a high fraction of SARS-CoV-2 mapped reads. Twist fractions dropped from 0.99 to 0.001, with the lowest fractions seen for the same samples as with ARTIC and Midnight, which generally had high Ct values.
Host content
Human background and SARS-CoV-2 capture enrichment efficiency in relation to viral load are shown in Fig. 4. Twist showed the highest host content, with a mean fraction of 0.46 of raw reads mapped to the human host genome reference (hg19). A small tendency toward increasing human and discarded unmapped reads was seen in samples with Ct values ≥ 27. Overall, both amplicon-based methods showed a high SARS-CoV-2 fraction and low host content, with a few exceptions (Fig. 4). ARTIC showed low host content, with a mean fraction of 0.024 human reads. Midnight had a mean fraction of 0.087 human reads.
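As an illustration of how such fractions can be derived, the sketch below splits the raw reads of one sample into SARS-CoV-2, human, and unmapped fractions. It assumes that every raw read falls into exactly one of these three bins; the function name `read_fractions` and the read counts are hypothetical and are not taken from the study data.

```python
def read_fractions(viral_reads, human_reads, unmapped_reads):
    """Return the share of raw reads in each bin (assumes exclusive bins)."""
    total = viral_reads + human_reads + unmapped_reads
    if total == 0:
        raise ValueError("sample has no reads")
    return {
        "sars_cov_2": viral_reads / total,
        "human": human_reads / total,
        "unmapped": unmapped_reads / total,
    }


# Hypothetical counts for a single sample, for illustration only.
fractions = read_fractions(viral_reads=300_000,
                           human_reads=460_000,
                           unmapped_reads=240_000)
for label, value in fractions.items():
    print(f"{label}: {value:.3f}")
```

Because the three fractions sum to 1, a rise in host or unmapped reads at high Ct values necessarily lowers the SARS-CoV-2 fraction for the same sequencing effort.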
Comparison of complete genome consensus sequences
Pangolin and Nextclade assessment and quality control metrics were generated from consensus sequences of SARS-CoV-2 (Table 1). All Pangolin and Nextclade designations were in concordance across methods when reads were sufficient for variant classification. The majority of genetic variants that differed from the Wuhan reference were located in the ORF1ab gene and in the S gene, and consisted of substitutions, a few deletions, and one frameshift. For the Twist protocol, 42/47 samples were classified using Pangolin and 46/47 samples were classified using Nextclade. Unclassified samples all had Ct ≥ 25. For successful variant determination from the Twist protocol, 1.5 M SARS-CoV-2 specific reads were needed. The five unclassifiable samples had < 0.65 M SARS-CoV-2 specific reads, and quality control status was bad or mediocre (indicated by “qc.overall” in Table 1). Pangolin and Nextclade successfully classified all 47 ARTIC samples. Patient ID 17, with low viral load (Ct value 28), could be assigned a Pangolin lineage using the ARTIC protocol but not with Twist or Midnight. Using the Midnight protocol, Pangolin was able to classify 18/24 samples and Nextclade classified 21/24 samples.
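The classification counts can be re-derived from Table 1 with a short tally. The Python sketch below uses the Midnight Pango column for the 24 samples that were run, transcribed in table order; it treats "None" as an unclassified sample and "n/a" as a sample that was not run. The function name `classification_rate` is an illustrative choice, not part of the Pangolin or Nextclade tooling.

```python
def classification_rate(calls):
    """Return (classified, run), skipping 'n/a' and counting 'None' as failed."""
    run = [c for c in calls if c != "n/a"]
    classified = [c for c in run if c != "None"]
    return len(classified), len(run)


# Midnight Pango calls transcribed from Table 1 (samples not run are omitted).
midnight_pango = [
    "None", "B.1.1.7", "B.1", "None", "B.1.177.82", "B.1",
    "None", "B.1", "None", "B.1", "B.1.36.1", "B.1.1.7",
    "B.1", "B.1.44", "B.1.36.1", "B.1.1.7", "B.1.1.7", "B.1.258",
    "B.1.177.82", "B.1.1.7", "B.1.1.7", "None", "None", "B.1.160",
]

classified, run = classification_rate(midnight_pango)
print(f"Midnight Pangolin: {classified}/{run} classified")
```

The same tally applied to the other Pango and Clade columns gives the remaining counts reported in the text.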
Table 1
Clade and lineage assigned to SARS-CoV-2 variants. Patient ID, Ct value, Pangolin lineage (Pango), Nextstrain clade (Clade) and quality metrics for Twist bait hybridization enrichment and ARTIC V3 and Midnight tiled amplicon enrichment.
| Ct | ID | Twist (Illumina) Pango | Twist (Illumina) Clade | Twist (Illumina) qc.overall | Twist (Illumina) qc.missing | ARTIC (Illumina) Pango | ARTIC (Illumina) Clade | ARTIC (Illumina) qc.overall | ARTIC (Illumina) qc.missing | Midnight (ONT) Ct | Midnight (ONT) Pango | Midnight (ONT) Clade | Midnight (ONT) qc.overall | Midnight (ONT) qc.missing |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 14 | 22 | B.1.177 | 20E (EU1) | good | good | B.1.177 | 20E (EU1) | good | good | 14 | None | None | None | None |
| 15 | 23 | B.1.351 | 20H (Beta, V2) | good | good | B.1.351 | 20H (Beta, V2) | mediocre | good | n/a | n/a | n/a | n/a | n/a |
| 15 | 24 | B.1.1.7 | 20I (Alpha, V1) | good | good | B.1.1.7 | 20I (Alpha, V1) | good | good | 25b | B.1.1.7 | 20I (Alpha, V1) | good | good |
| 15 | 6 | B.1 | 20A | good | good | B.1 | 20A | good | good | 15a | B.1 | 20A | good | good |
| 16 | 25 | B.1.1.7 | 20I (Alpha, V1) | good | good | B.1.1.7 | 20I (Alpha, V1) | good | good | 27b | None | 20I (Alpha, V1) | bad | bad |
| 17 | 19 | B.1.177.82 | 20E (EU1) | good | good | B.1.177.82 | 20E (EU1) | good | good | 17 | B.1.177.82 | 20E (EU1) | good | good |
| 17 | 26 | B.1.36.1 | 20A | good | good | B.1.36.1 | 20A | good | good | n/a | n/a | n/a | n/a | n/a |
| 17 | 7 | B.1 | 20A | good | good | B.1 | 20A | good | good | 17a | B.1 | 20A | good | good |
| 18 | 27 | B.1.36.1 | 20A | good | good | B.1.36.1 | 20A | good | good | n/a | n/a | n/a | n/a | n/a |
| 18 | 28 | B.1.1.7 | 20I (Alpha, V1) | good | good | B.1.1.7 | 20I (Alpha, V1) | good | good | 18 | None | 20I (Alpha, V1) | bad | bad |
| 18 | 1 | B.1 | 20C | good | good | B.1 | 20C | good | good | 18a | B.1 | 20C | good | good |
| 20 | 30 | B.1.177.46 | 20E (EU1) | good | good | B.1.177.46 | 20E (EU1) | good | good | n/a | n/a | n/a | n/a | n/a |
| 20 | 31 | B.1.36.1 | 20A | good | good | B.1.36.1 | 20A | good | good | n/a | n/a | n/a | n/a | n/a |
| 21 | 9 | B.3 | 19A | good | good | B.3 | 19A | good | good | 21a | None | None | None | None |
| 21 | 8 | B.1 | 20A | good | good | B.1 | 20A | good | good | 21a | B.1 | 20A | good | good |
| 21 | 32 | B.1.1.7 | 20I (Alpha, V1) | good | good | B.1.1.7 | 20I (Alpha, V1) | good | good | n/a | n/a | n/a | n/a | n/a |
| 21 | 33 | B.1.36.1 | 20A | good | good | B.1.36.1 | 20A | good | good | 25b | B.1.36.1 | 20A | mediocre | good |
| 22 | 34 | B.1.1.7 | 20I (Alpha, V1) | good | good | B.1.1.7 | 20I (Alpha, V1) | good | good | 25b | B.1.1.7 | 20I (Alpha, V1) | good | good |
| 22 | 35 | B.1.1.7 | 20I (Alpha, V1) | good | good | B.1.1.7 | 20I (Alpha, V1) | mediocre | good | n/a | n/a | n/a | n/a | n/a |
| 22 | 5 | B.1 | 20A | good | good | B.1 | 20A | good | good | 22a | B.1 | 20A | bad | bad |
| 23 | 36 | B.1.351 | 20H (Beta, V2) | mediocre | good | B.1.351 | 20H (Beta, V2) | mediocre | good | n/a | n/a | n/a | n/a | n/a |
| 24 | 2 | B.1.44 | 20C | good | good | B.1.44 | 20C | good | good | 24a | B.1.44 | 20C | good | mediocre |
| 24 | 15 | B.1.36.1 | 20A | good | good | B.1.36.1 | 20A | good | good | 24 | B.1.36.1 | 20A | mediocre | good |
| 24 | 37 | B.1.1.7 | 20I (Alpha, V1) | mediocre | good | B.1.1.7 | 20I (Alpha, V1) | good | good | 24 | B.1.1.7 | 20I (Alpha, V1) | good | good |
| 24 | 38 | B.1.1.7 | 20I (Alpha, V1) | bad | good | B.1.1.7 | 20I (Alpha, V1) | good | good | n/a | n/a | n/a | n/a | n/a |
| 24 | 39 | B.1.351 | 20H (Beta, V2) | mediocre | good | B.1.351 | 20H (Beta, V2) | mediocre | good | n/a | n/a | n/a | n/a | n/a |
| 24 | 45 | B.1.1.7 | 20I (Alpha, V1) | good | good | B.1.1.7 | 20I (Alpha, V1) | good | good | 25b | B.1.1.7 | 20I (Alpha, V1) | bad | bad |
| 25 | 18 | B.1.258 | 20A | bad | mediocre | B.1.258 | 20A | good | good | 25 | B.1.258 | 20A | mediocre | mediocre |
| 25 | 40 | None | 20A | bad | bad | B.1.36.1 | 20A | good | good | n/a | n/a | n/a | n/a | n/a |
| 25 | 41 | B.1.1.7 | 20I (Alpha, V1) | mediocre | good | B.1.1.7 | 20I (Alpha, V1) | good | good | n/a | n/a | n/a | n/a | n/a |
| 25 | 42 | B.1.177.82 | 20E (EU1) | good | good | B.1.177.82 | 20E (EU1) | good | good | 25b | B.1.177.82 | 20E (EU1) | good | good |
| 25 | 43 | B.1.1.7 | 20I (Alpha, V1) | good | good | B.1.1.7 | 20I (Alpha, V1) | good | good | 26b | B.1.1.7 | 20I (Alpha, V1) | good | good |
| 25 | 44 | B.1.1.7 | 20I (Alpha, V1) | good | good | B.1.1.7 | 20I (Alpha, V1) | mediocre | good | 26b | B.1.1.7 | 20I (Alpha, V1) | good | good |
| 25 | 4 | B.1 | 20C | good | good | B.1 | 20C | good | good | n/a | n/a | n/a | n/a | n/a |
| 26 | 10 | B.1 | 20C | good | good | B.1 | 20C | good | good | n/a | n/a | n/a | n/a | n/a |
| 27 | 46 | B.1.351 | 20H (Beta, V2) | mediocre | good | B.1.351 | 20H (Beta, V2) | mediocre | good | n/a | n/a | n/a | n/a | n/a |
| 27 | 47 | B.1.1.7 | 20I (Alpha, V1) | good | good | B.1.1.7 | 20I (Alpha, V1) | good | good | n/a | n/a | n/a | n/a | n/a |
| 27 | 48 | None | None | None | None | B.1.1.7 | 20I (Alpha, V1) | good | good | n/a | n/a | n/a | n/a | n/a |
| 27 | 49 | B.1.177.46 | 20E (EU1) | mediocre | mediocre | B.1.177.46 | 20E (EU1) | good | good | n/a | n/a | n/a | n/a | n/a |
| 27 | 3 | B.1 | 20C | good | good | B.1 | 20C | good | good | n/a | n/a | n/a | n/a | n/a |
| 28 | 16 | B.1.177.82 | 20E (EU1) | good | good | B.1.177.82 | 20E (EU1) | good | good | 28 | None | 20E (EU1) | bad | bad |
| 28 | 17 | None | 20E (EU1) | bad | bad | B.1.177.86 | 20E (EU1) | good | good | 28 | None | None | None | None |
| 28 | 50 | B.1.1.7 | 20I (Alpha, V1) | good | good | B.1.1.7 | 20I (Alpha, V1) | good | good | n/a | n/a | n/a | n/a | n/a |
| 28 | 51 | None | 20I (Alpha, V1) | bad | bad | B.1.1.7 | 20I (Alpha, V1) | good | good | n/a | n/a | n/a | n/a | n/a |
| 28 | 52 | B.1.36.1 | 20A | good | good | B.1.36.1 | 20A | good | good | n/a | n/a | n/a | n/a | n/a |
| 28 | 53 | None | 20B | bad | bad | B.1.1.141 | 20B | good | good | n/a | n/a | n/a | n/a | n/a |
| 29 | 14 | B.1.160 | 20A | good | good | B.1.160 | 20A | good | good | 29 | B.1.160 | 20A | mediocre | mediocre |

n/a, not applicable
a, diluted
b, Ct value after new extraction
The phylogeny, Nextclade, and Pangolin lineages for the consensus sequences are shown in Fig. 5. Patient IDs are in chronological order, starting with sample 1 collected on 1 July 2020 and ending with sample 53 collected on 1 April 2021. The dominance of Pango B.1 in the early pandemic (8 patients) corresponds to the Northern Italian outbreak early in 2020, which was dominant in Sweden at the beginning of the pandemic. The other Pango lineages (Fig. 5) were B.1.1.7 (UK lineage/Alpha; 14 patients), B.1.36.1 (European/Denmark lineage; 6 patients), B.1.351 (South Africa lineage/Beta; 4 patients) and B.1.177.82 (Scandinavian lineage; 3 patients), representing the dominant lineages in Sweden during the period of collection.
While Twist generated the complete genome, ARTIC and Midnight lacked coverage at the start, end, and mid sections of the SARS-CoV-2 Wuhan reference genome. Twist had the most complete coverage in the first ~60 bp. Midnight and ARTIC read coverage started at position 55 in the Wuhan reference and ended at position 29,786 of the alignment with Midnight and position 29,854 of the alignment with ARTIC, due to the primer design (Supplementary Fig. 1). Lack of coverage in the mid sections of the SARS-CoV-2 genome, mainly with the Midnight protocol, is indicated by an entry of “bad” in the “qc.missing” column of Table 1.