In the present study we demonstrate that by adapting the RT and PCR cycling conditions we can increase the sensitivity of recovering IAV whole genome sequences by approximately 1000-fold compared to the original multisegment RT-PCR (mRT-PCR) protocol for IAV. Furthermore, we show that with the inclusion of barcoded primers we can multiplex at least eight clinical samples of avian, swine, or human origin per sequencing library preparation without a significant loss in sensitivity. Collectively, this results in an enhanced workflow for high-throughput whole genome surveillance of influenza A virus in various host species.
Here, we demonstrate that dual-barcoding is possible on the ONT sequencing platform with up to eight samples per native ONT sequencing barcode library. Based on the number of barcodes available in the ONT PCR barcoding expansion kit (EXP-PBC096), this can, in principle, be expanded to 96 samples per native ONT sequencing barcode library. However, in our experimental setting we did not quantify individual barcoded samples, but only the pool of samples, to ensure that each native ONT sequencing barcode library was made with 200 fmol. Therefore, including more barcodes could dilute out samples with a low amplicon concentration. To overcome this, one needs to quantify individual barcoded samples, normalize them during pooling, and empirically determine the maximum number of individual barcoded samples that can be pooled per native ONT sequencing barcode library. This would likely further reduce the number of native ONT sequencing barcode libraries needed without compromising the sensitivity of full genome recovery, but it would be more laborious than the “blinded” pooling approach described in the current study. Alternatively, the dilution of low-concentration samples can be addressed by sequencing at a greater depth. Finally, although the dual-barcoding approach has thus far only been evaluated for IAV WGS, it is conceivable to adapt this methodology to other multisegment and multiplex approaches for genomic surveillance of seasonal human influenza A and B viruses 23,25.
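The normalization step described above can be illustrated with a minimal sketch. It is not part of the protocol used in this study. It assumes that each barcoded sample has been quantified individually (ng/µL), that a mean amplicon length per sample is available, and that double-stranded DNA has an approximate mass of 660 g/mol per base pair. The names `Sample`, `fmol_per_ul`, and `equimolar_volumes` are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass

# Assumption: approximate average molar mass of one dsDNA base pair (g/mol).
AVG_BP_MASS = 660.0


@dataclass
class Sample:
    name: str
    conc_ng_per_ul: float  # measured amplicon concentration
    mean_len_bp: float     # assumed mean amplicon length of the sample


def fmol_per_ul(sample: Sample) -> float:
    """Convert a mass concentration (ng/uL) into a molar one (fmol/uL)."""
    return sample.conc_ng_per_ul * 1e6 / (AVG_BP_MASS * sample.mean_len_bp)


def equimolar_volumes(samples: list[Sample], total_fmol: float = 200.0) -> dict[str, float]:
    """Return the volume (uL) of each sample so that all samples contribute
    the same molar amount and the pool sums to total_fmol."""
    if not samples:
        raise ValueError("at least one sample is required")
    target = total_fmol / len(samples)  # fmol contributed by each sample
    volumes = {}
    for s in samples:
        molarity = fmol_per_ul(s)
        if molarity <= 0:
            raise ValueError(f"sample {s.name} has no measurable amplicon")
        volumes[s.name] = target / molarity
    return volumes


if __name__ == "__main__":
    pool = [Sample("A", 66.0, 1000.0), Sample("B", 33.0, 1000.0)]
    for name, vol in equimolar_volumes(pool).items():
        print(f"{name}: {vol:.2f} uL")
```

In this sketch a sample with half the concentration contributes twice the volume, so a weak sample is not diluted out by stronger ones. In practice, the required volume for a very dilute sample may exceed what is available, which is one reason the maximum number of samples per pool would still need to be determined empirically.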
The multisegment RT-PCR is an essential tool for whole genome sequencing of influenza A viruses (IAVs). We show that our optimized protocol can be used to sequence IAV-positive samples of avian, swine, and human origin up to a Cp-value of approximately 34. However, we observed that, regardless of the barcoding strategy, the samples of avian origin had a reduced genomic recovery compared to the samples of swine or human origin, even though the latter had lower viral loads than the avian IAV samples (Fig. 2C). This discrepancy suggests that sample origin can significantly influence detection sensitivity. However, we cannot exclude that this was influenced by the fact that the avian and swine samples, in contrast to the human samples, underwent several freeze/thaw cycles prior to the experiments, which can negatively affect RNA quality and integrity. Similarly, the avian samples were extracted manually with a column-based kit, whereas the human and swine samples were extracted using an automated magnetic bead-based method. This difference may have contributed to the lower sequencing sensitivity of the avian samples, which would be in line with a previous report 31. Moreover, given that influenza A viruses cause acute infections, the time of sampling in relation to the time of infection is important, as the viral load in clinical samples decreases towards the end of the illness 32,33. Therefore, it would be interesting to evaluate our protocol in a more controlled study to determine whether full genome recovery rates are influenced by sample processing as well as by the time of sampling.
Because the exact role of defective viral genomes (DVGs) in vivo and in vitro during virus infection remains largely elusive 27, several short-read-based bioinformatic pipelines have been established to analyse this phenomenon 29,30. Thus, while the detection of DVGs for IAV is not novel per se, we demonstrate that, in addition to recovering whole genome information, our WGS approach can also be used to detect IAV DVGs in clinical specimens without any potential fragment partitioning during sequence library preparation 30. However, because DVGs of PB1, PB2, and PA are on average 400–500 nt in size, and HA, NA, NP, M, and NS DVGs are on average around 400 nt, it is possible that the size-selection step of our protocol prior to sequencing depletes some of the DVG amplicons shorter than 500 bp. This likely explains why most DVGs we detected originated from the largest genomic segments of IAV (PB1, PB2, and PA) (Figure S2). Therefore, it remains to be determined whether our approach can reliably detect DVGs of the small genomic segments of IAV, as well as how long-read NGS methodologies compare to short-read NGS approaches.
In conclusion, the optimized and scalable whole genome sequencing workflow for influenza A virus, combined with the availability of a portable third-generation sequencing platform, enables genomic sequence surveillance at the human-animal interface. This significantly enhances sensitivity and throughput, facilitating the early detection and monitoring of IAV evolution and zoonotic spillovers in clinical samples from diverse origins.