Robust and rapid pipeline for the development of mRNA vaccine for the fast-emerging SARS-CoV-2 variants

doi:10.21203/rs.3.rs-2074769/v1

This work is licensed under a CC BY 4.0 License

Version 1

Recently, genomics has gained a lot of attention in the field of bio-therapeutics, vaccines, and gene therapy. We have developed a robust pipeline for establishing the mRNA vaccine portfolio to address the current pandemic and endemic of COVID-19, infectious diseases, and cancer. We have designed and assembled approximately 4000 base pair DNA encoding the spike protein of the SARS-CoV-2 Omicron variant (B.1.1.529) using overlapping oligo-based PCR assembly. Further, we cloned this fragment into a self-amplifying mRNA platform and prepared the messenger RNA (mRNA). The integrity and sequence were confirmed through multiple orthogonal techniques such as capillary electrophoresis, Sanger DNA sequencing, and Next-generation sequencing for mRNA. The mRNA was transfected into HEK 293T cells and a significant expression of the spike protein was monitored by FACS. Herein, we propose a robust and rapid pipeline for the development of mRNA for vaccines starting from sequence analysis to identifying the lead candidate within 4–5 weeks.

The introduction of recombinant DNA technology ushered in a new era of applied science, and to date numerous biotechnology-derived products have proven their merit in saving and improving the quality of human life¹. The phenotype of any living organism is guided by its genotype, which is built from nucleic acid blocks. Advances in molecular biology have allowed us to alter the genetic make-up of an organism or to introduce heterologous DNA into it². Designing, synthesizing, and expressing the DNA encoding the gene/protein of interest is the key to success. Today, many companies design and synthesize DNA to the user’s specifications, but most of them are located in western countries. Although these companies offer speed, accuracy, and a validated supply chain management system, during a pandemic or endemic situation their timelines are stretched, as are those of the industries they serve, by the unavailability of raw material, additional workload, and depleted manpower.

SARS-CoV-2 spread in late 2019, causing a state of emergency around the globe, and given its infectivity and dissemination rate, the entire world soon ended up in lockdown. The World Health Organization monitored the severity and casualties of the disease in real time, and the spread of SARS-CoV-2 was soon declared a pandemic. In addition to combating the rapid transmission of the virus, the community also faced the problem of its rapid mutation rate, with increased severity and infectivity. The rapidly mutating genome sequence of the virus underscored the urgent need for therapeutics or vaccines. The scientific community worked tirelessly to develop therapeutics and vaccines to combat COVID-19 as early as possible. Almost every one of these studies depended on modified, codon-optimized, synthetically produced DNA encoding the gene of interest.

We have worked towards developing an mRNA vaccine against COVID-19 since the onset of the disease. Our mRNA portfolio has three verticals: (a) mRNA vaccines against a specific indication; (b) neutralization assays using pseudo-/surrogate-typed virus, which are used to assess the neutralizing antibodies in the sera of vaccinated animals/humans; and (c) recombinant proteins, which are used to develop various immunological assays that quantify the functional immune response in the sera isolated from vaccinated animals/humans. The starting material for all three verticals is the codon-optimized DNA molecule encoding the desired gene/protein, cloned into the respective plasmid.

Several oligonucleotide-based approaches for the synthesis and assembly of DNA sequences have been disclosed previously, each with advantages and disadvantages, but the hurdle that always stood in the way was the size of the gene³. Taking all these factors into account, we describe a simple, easy-to-perform, rapid, and cost-effective technique for synthesizing a gene fragment that uses PCR to assemble strategically designed oligonucleotides. The method relies on complementary base pairing of long oligonucleotides followed by their assembly by PCR. The oligonucleotides are designed in sizes from 60–120 bases, with 20–30 bases of overlap between neighbours, and these overlapping oligonucleotides serve as both primer and template during the PCR. For the assembly of long oligonucleotides by PCR, several factors need to be optimized, including the design of the oligonucleotides, their concentration, their annealing temperature, and the number of PCR cycles; all these parameters were assessed and optimized in the present study.

In this study, we present a robust and rapid pipeline for the development of mRNA for vaccines (Fig. 1). The pipeline starts with the extraction and analysis of the metadata of SARS-CoV-2 infected patients from the GISAID database (Global Initiative on Sharing All Influenza Data) to design the variant-specific antigenic gene encoding the spike protein of the virus⁴. We show the successful assembly of the approximately 4000 base pair gene encoding the spike protein of the omicron variant (B.1.1.529) of SARS-CoV-2, confirmed by Sanger DNA sequencing. We then cloned the gene by three-fragment ligation into a plasmid DNA that can be used for in-vitro transcription to produce mRNA. The mRNA integrity and sequence were confirmed by capillary electrophoresis and next-generation sequencing on an Illumina platform. The mRNA encoding the spike variant of SARS-CoV-2 was transfected into HEK 293T cells and was shown by FACS to produce protein. The developed pipeline offers an alternative route for evaluating the merits of the mRNA vaccine portfolio over traditional vaccines within a shorter timeline of 4–5 weeks.

Database source and designing

The complete global metadata of 109 SARS-CoV-2 infected patients from September to December 2021 was downloaded from the GISAID database. During this period, WHO designated B.1.1.529 of SARS-CoV-2 a Variant of Concern (VOC). From the whole-genome data of the virus, only the mutations occurring in the spike region were extracted, using an in-house developed mathematical algorithm (unpublished data). The sequences were then analyzed for mutations in the spike-encoding region, and all mutations occurring with a frequency of > 50% were marked in the original spike sequence of the virus (IVDC-HB-01/2019). The consensus sequence representing all the key point mutations, additions, and deletions of amino acids was reverse translated using preferred codons harmonized to human codon usage. The designed sequence has been submitted to the DDBJ database under accession number LC731729. During codon design, the ApaI, SacII, and NotI restriction sites were kept out of the coding sequence (non-cutters), because cloning into the self-amplifying mRNA platform used the ApaI and SacII sites and the NotI site was used to linearize the plasmid before in-vitro transcription.
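The frequency filter described above can be illustrated with a minimal sketch. The in-house algorithm is unpublished, so this is not the authors' method; it only shows one way to mark substitutions and deletions that occur in more than 50% of spike sequences that have already been aligned to the reference. The function names, the toy sequences, and the use of `-` for a deleted residue are assumptions made for illustration, and insertions are not handled.

```python
from collections import Counter


def frequent_mutations(reference, samples, threshold=0.5):
    """Return {1-based position: residue} for changes seen in more than
    `threshold` of the samples. All samples must already be aligned to the
    reference (same length); '-' marks a deleted residue."""
    if not samples:
        raise ValueError("at least one sample is required")
    if any(len(s) != len(reference) for s in samples):
        raise ValueError("samples must be aligned to the reference length")
    marked = {}
    for i, ref_residue in enumerate(reference):
        counts = Counter(s[i] for s in samples)
        residue, n = counts.most_common(1)[0]
        # Strictly greater than the threshold, mirroring "> 50%" in the text.
        if residue != ref_residue and n / len(samples) > threshold:
            marked[i + 1] = residue
    return marked


def apply_mutations(reference, marked):
    """Write the marked changes into the reference and drop deleted residues."""
    seq = list(reference)
    for pos, residue in marked.items():
        seq[pos - 1] = residue
    return "".join(seq).replace("-", "")


if __name__ == "__main__":
    # Toy data only; these are not real GISAID sequences.
    ref = "MFVFLVLL"
    aligned = ["MFVFLVLL", "MFVALVLL", "MFVALVLL", "MFVAL-LL"]
    marked = frequent_mutations(ref, aligned)
    print(marked, apply_mutations(ref, marked))
```

A change present in exactly half of the samples is not marked, because the comparison is strict.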

Antigen design

For the assembly of the entire coding frame of the spike protein, 158 overlapping oligonucleotides (Supplementary Table 1) were designed, ranging from 60–120 bases in length with approximately 20–30 bases of overlap between each forward and reverse primer. Where repeat sequences were found, they were placed in the middle or towards the 3’ end of the oligonucleotide, and the overall GC content of each oligonucleotide was kept between 45–60%. The design also took melting temperature and specificity into consideration to ensure uniform hybridization at the correct target location³. All the oligonucleotides were synthesized by Sigma Aldrich at a 1 µM scale with PAGE purification. The oligonucleotides were reconstituted in 1x TE buffer (Invitrogen, Cat. T11493) to a final concentration of 100 µM and stored at -20°C until use.
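The tiling idea behind the oligonucleotide design can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' design tool: it uses a fixed oligonucleotide length and overlap (the source gives ranges of 60–120 and 20–30 bases), alternates top and bottom strand, and reports GC content so that oligonucleotides outside the 45–60% window can be spotted. Melting temperature, repeat placement, and specificity are not modelled, and all names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def gc_percent(seq):
    """GC content of a sequence as a percentage."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def design_oligos(template, oligo_len=80, overlap=25):
    """Tile `template` with windows of `oligo_len` bases that share `overlap`
    bases with the next window. Even-numbered windows are kept as top-strand
    oligos; odd-numbered windows are reverse-complemented. The last window
    may be shorter than `oligo_len`."""
    if not template:
        raise ValueError("template must not be empty")
    if not 0 < overlap < oligo_len:
        raise ValueError("overlap must be positive and shorter than the oligo")
    template = template.upper()
    step = oligo_len - overlap
    oligos = []
    start = 0
    while True:
        window = template[start:start + oligo_len]
        forward = len(oligos) % 2 == 0
        oligos.append({
            "start": start,
            "strand": "+" if forward else "-",
            "window": window,  # top-strand region covered by this oligo
            "sequence": window if forward else reverse_complement(window),
            "gc": gc_percent(window),
        })
        if start + oligo_len >= len(template):
            break
        start += step
    return oligos


def gc_out_of_range(oligos, low=45.0, high=60.0):
    """Oligos whose GC content falls outside the stated 45-60% window."""
    return [o for o in oligos if not low <= o["gc"] <= high]
```

In practice the window boundaries would be shifted individually to satisfy the GC and melting-temperature constraints; the fixed step here only makes the overlap geometry explicit.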

Gene assembly

The detailed strategy for gene assembly and cloning is shown in Fig. 2. The long oligonucleotides were assembled by PCR using Advantage® 2 DNA polymerase mix (Takara, Cat. 639206), an optimized blend that gives high fidelity and greater yield of amplified fragments of varying sizes, especially long templates up to 18 kilo-base pairs. The PCR parameters were optimized to obtain sub-fragments of the correct size, ranging from 400–500 base pairs, with minimal PCR amplification cycles to ensure the correctness of the coding sequences⁵. These 400–500 base pair sub-fragments were amplified so that each overlaps with the next sub-fragment. For the assembly of the coding frame from ApaI – spike – SacII of approximately 4000 base pairs, smaller fragments were marked out based on the unique restriction sites that were incorporated during the gene design. As shown in Fig. 2, fragment A runs from the ApaI to the SgrAI restriction site, fragment B from SgrAI to BbvCI, fragment C from BbvCI to SnaBI, and fragment D from SnaBI to SacII. Fragments A to D ranged from 800–1400 base pairs in size; they were further split into sub-fragments to keep the number of amplification cycles, and the number of oligonucleotides required for assembly in a single reaction, to a minimum. Fragment A was subdivided into A1 and A2; fragment B into B1 and B2; fragment C into C1, C2, C3, and C4; and fragment D into D1 and D2. The boundaries of the fragments and sub-fragments were set so that each amplified product has a 25–30 base pair overlap with the next-in-line fragment or sub-fragment. The internal restriction sites were chosen to provide a backup strategy in case of any difficulty in assembling the full-length gene of approximately 4000 base pairs. It is important to remember that these sequences belong to a viral genome that does not have a uniform distribution of nucleotides and may therefore not assemble properly.

First, we amplified sub-fragments A1, A2, B1, B2, C1, C2, C3, C4, D1, and D2. Next, A1 and A2 were joined using their end oligonucleotides; B1 and B2, C1 and C2, C3 and C4, and D1 and D2 were joined in the same way. In a later reaction, the C1-C2 and C3-C4 products were joined to each other using their end oligonucleotides before the final assembly of all the fragments. The final PCR assembly resulted in two fragments, ApaI – AB – BbvCI and BbvCI – CD – SacII.

Cloning

PCR-based assembly is strongly affected by multiple parameters such as the choice of DNA polymerase enzyme, oligonucleotide size, oligonucleotide concentration, annealing temperature, and the total number of PCR cycles required for amplification⁵. Here we optimized the oligonucleotide concentration and annealing temperature on fragment A to ensure the correctness of the synthesized frame, and then applied the finalized conditions to all the other fragments and sub-fragments. For the synthesis of sub-fragments A1 and A2, the PCR contained 200 µM of each dNTP (Roche, Cat. 11969064001) and 50x Advantage® 2 DNA polymerase mix (Takara, Cat. 639206) in 10x Advantage® 2 buffer (Takara, Cat. 639206). Three different concentration combinations of inner oligonucleotides (20 nM, 50 nM, and 200 nM) and outer oligonucleotides (100 nM, 100 nM, and 400 nM) were tested at different annealing temperatures. PCR was set up in a Bio-Rad Tetrad thermal cycler (398BR81) with the following cycle conditions: initial denaturation at 95ºC for 4 minutes; 16 cycles of denaturation at 95ºC for 45 seconds, annealing at 59ºC, 62ºC, 64ºC, 66ºC, 68ºC, 70ºC, or 72ºC for 1 minute, and extension at 68ºC for 1 minute; and a final extension at 68ºC for 7 minutes 30 seconds. The optimized conditions from this section were later used for the amplification of all the sub-fragments of B, C, and D.
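The oligonucleotide concentrations tested here follow from the usual dilution relation C1 x V1 = C2 x V2. The sketch below is a hedged worked example, not a protocol from the source: the 50 µL reaction volume and the 10 µM working stock are assumptions (the source states only the 100 µM storage stock and the final nM concentrations), and it treats each stated concentration as applying per oligonucleotide.

```python
def stock_volume_ul(stock_nM, final_nM, reaction_ul):
    """Volume of stock (in µL) needed so that the reaction contains
    `final_nM` of an oligo, from C1 * V1 = C2 * V2."""
    if stock_nM <= 0 or reaction_ul <= 0:
        raise ValueError("stock concentration and volume must be positive")
    if final_nM > stock_nM:
        raise ValueError("final concentration cannot exceed the stock")
    return final_nM * reaction_ul / stock_nM


# Inner/outer concentration combinations tested in the text (nM).
COMBINATIONS_NM = [(20, 100), (50, 100), (200, 400)]


def pipetting_table(working_stock_nM=10_000, reaction_ul=50.0):
    """Per-oligo volumes for each tested combination.
    Both default values are illustrative assumptions."""
    return [
        {
            "inner_nM": inner,
            "outer_nM": outer,
            "inner_ul": stock_volume_ul(working_stock_nM, inner, reaction_ul),
            "outer_ul": stock_volume_ul(working_stock_nM, outer, reaction_ul),
        }
        for inner, outer in COMBINATIONS_NM
    ]
```

Under these assumptions, pipetting 200 nM directly from the 100 µM storage stock would require only 0.1 µL per oligonucleotide, which is why an intermediate working dilution is assumed in the sketch.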

Further, for the synthesis of fragment A, 10 µl of the A1 and A2 sub-fragment PCR products was used as template, and amplification was performed using a single forward and a single reverse oligonucleotide at 200 nM concentration, as shown in Fig. 2. The PCR cycle conditions were: initial denaturation at 95ºC for 4 minutes; 28 cycles of denaturation at 95ºC for 45 seconds, annealing at 59ºC, 62ºC, 64ºC, 66ºC, 68ºC, 70ºC, or 72ºC for 1 minute, and extension at 68ºC for 1 minute; and a final extension at 68ºC for 7 minutes 30 seconds. The optimized conditions from this section were later used for the final amplification of all the other fragments. In all these reactions, the PCR products of fragments A, B, C, and D were purified using the QIAquick PCR Purification Kit (Qiagen, Cat. 28104), and their concentration was determined on a NanoDrop™ One/OneC Microvolume UV-Vis Spectrophotometer (ThermoFisher Scientific, Cat. ND-ONE-W).

For the final assembly, the purified PCR fragment AB was digested with ApaI (NEB, Cat. R0114S) and BbvCI (NEB, Cat. R0601S), and the CD fragment was digested with BbvCI (NEB, Cat. R0601S) and SacII (NEB, Cat. R0157S). pVEE 104a.1 was used as the vector DNA for cloning; it is 11720 base pairs in length, encodes one of the malarial antigenic proteins, and carries the 4 non-structural proteins (nsp1-4) of the Venezuelan Encephalitis alpha-virus^{6 7 8 9}. The construct also contains UTRs (untranslated regions) that support efficient translation, the nptII gene encoding the kanamycin selection marker, and the pMB1 plasmid origin. The 3 digested fragments AB, CD, and pVEE 104a.1 were ligated using T4 DNA ligase (Roche, Cat. 10481220001). The ligated DNA product was transformed into electro-competent E. coli DH5α cells (ElectroMAX; ThermoFisher Scientific, Cat. 11319019) as per the manufacturer’s protocol, and recombinants were selected on LB agar plates containing 50 µg/mL kanamycin. pUC19 plasmid was used as the positive control, and the digested vector, the AB fragment, and the CD fragment alone were used as negative controls.

Recombinant clone screening was performed by colony PCR using plasmid backbone-specific oligonucleotides in the presence of AmpliTaq Gold DNA polymerase (Applied Biosystems™, Cat. 4311806). Colony PCR-positive clones were inoculated into LB medium containing 50 µg/mL kanamycin. Later, plasmid DNA was isolated by the Miniprep method (PureYield™, Promega miniprep system, Cat. A1223). The plasmid DNA was sequenced on the Sanger automated sequencing platform (Applied Biosystems 3500xL) using the BigDye™ Terminator v3.1 Cycle Sequencing Kit (ThermoFisher Scientific, Cat. 4337455). The sequencing data analysis was performed in MacVector software with assembler [18.2.5(43)]. The sequencing primers are listed in Supplementary Table 2.

Plasmid preparation

The circular plasmid DNA was linearized using the NotI restriction enzyme (NEB, Cat. R0189S) and purified with the QIAquick PCR Purification Kit (Qiagen, Cat. 28104). The purified plasmid DNA was quantified using the Quant-iT™ PicoGreen™ dsDNA Assay Kits and dsDNA Reagents (ThermoFisher Scientific, Cat. P7589)¹⁰. Further, real-time quantitative PCR (RT-PCR) was employed to detect any residual E. coli genomic DNA in the plasmid preparation¹¹. Briefly, we designed forward (5’-AAGCTGCCTGCACTAATGTTCC-3’) and reverse (5’-TCGCGTACCGTCTTCATGG-3’) primers to amplify an amplicon of the E. coli genome. RT-PCR was performed using SsoFast™ EvaGreen® Supermix (Biorad, Cat. 1725201). Genomic DNA of E. coli DH5α, isolated using GenElute™ Bacterial Genomic DNA Kits (Sigma-Aldrich, Cat. NA 2110), was used as the reference standard at amounts from 50.0 nanograms down to 5.0 femtograms. The PCR was performed on the CFX96 Touch Real-Time PCR system at 98ºC for 2 minutes, followed by 39 cycles of 98ºC for 5 seconds and 61.5ºC for 5 seconds, one cycle at 95ºC for 10 seconds, and a melting curve from 65ºC to 95ºC.
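Quantifying residual genomic DNA against a reference dilution series is usually done with a standard curve, in which the quantification cycle (Cq) is regressed on the log10 of the input amount and the unknown is read off the fitted line. The sketch below illustrates that calculation only; it assumes a log-linear relationship and ordinary least squares, the source does not describe the authors' analysis software or settings, and all names and example values are hypothetical.

```python
import math


def fit_standard_curve(quantities, cq_values):
    """Least-squares fit of Cq = slope * log10(quantity) + intercept."""
    if len(quantities) != len(cq_values) or len(quantities) < 2:
        raise ValueError("need at least two matched standards")
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cq_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cq_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_cq(cq, slope, intercept):
    """Interpolate the input amount of an unknown from its Cq."""
    return 10 ** ((cq - intercept) / slope)


def amplification_efficiency(slope):
    """Efficiency implied by the slope; 1.0 means perfect doubling per cycle."""
    return 10 ** (-1.0 / slope) - 1.0


if __name__ == "__main__":
    # Synthetic standards in pg with made-up Cq values, for illustration only.
    standards_pg = [1.0, 10.0, 100.0, 1000.0]
    cqs = [25.0, 21.68, 18.36, 15.04]
    slope, intercept = fit_standard_curve(standards_pg, cqs)
    print(slope, intercept, quantity_from_cq(20.0, slope, intercept))
```

A measured amount would then be compared with the release specification, expressed in the same mass units as the standards.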

mRNA preparation

For mRNA preparation, in-vitro transcription was carried out using the HiScribe™ T7 High Yield RNA Synthesis Kit (NEB, Cat. E2040S). The in-vitro transcribed mRNA was treated with DNase I enzyme (ThermoFisher Scientific, Cat. EN0525) to degrade the plasmid DNA and was then protected by 5’ capping using the Vaccinia Capping System (NEB, Cat. M2080S)¹². The capped mRNA was purified by LiCl precipitation¹³. Purified mRNA was quantified with the Quant-it™ RiboGreen RNA Assay Kit and RiboGreen RNA Reagent (ThermoFisher Scientific, Cat. R11490)^{14 15}. mRNA integrity was confirmed by denaturing agarose gel electrophoresis and by capillary electrophoresis on the Agilent 5200 Fragment Analyzer with a compatible reagent kit (Agilent, Cat. DNF-471 RNA kit_15nt). The purified mRNA was analyzed for the presence of plasmid DNA by RT-PCR. For this, we designed forward (5’-AACGGCTCGTAACATAGG-3’) and reverse (5’-TGGTCGAGCCAACAGAG-3’) oligonucleotides from the nsp 1 region of the pVEE101c.1 plasmid. The RT-PCR reaction was performed using SsoFast™ EvaGreen® Supermix (Biorad, Cat. 1725201) on the CFX96 Touch Real-Time PCR system at 98ºC for 2 minutes, followed by 39 cycles of 98ºC for 5 seconds and 53.5ºC for 5 seconds, one cycle at 95ºC for 10 seconds, and a melting curve from 65ºC to 95ºC.

The nucleotide sequence and poly-A tail length of the mRNA were later confirmed by sequencing-by-synthesis chemistry on the Illumina MiniSeq platform. Briefly, mRNA was quantified using Qubit™ RNA High Sensitivity (HS), Broad Range (BR), and Extended Range (XR) Assay Kits (ThermoFisher Scientific, Cat. Q10210) on a Qubit 4 Fluorometer (ThermoFisher Scientific). TruSeq® Stranded mRNA Library Prep (48 Samples) (Illumina, Cat. 20020594) was used for the mRNA library preparation, along with the TruSeq RNA CD Index Plate (96 Indexes, 96 Samples) (Illumina, Cat. 20019792). The library was quantitated using Qubit™ 1X dsDNA High Sensitivity (HS) and Broad Range (BR) Assay Kits (ThermoFisher Scientific, Cat. Q33230), and library sizing was checked on the Agilent 5200 Fragment Analyzer system with its corresponding DNA kit (Agilent, Cat. DNF-935 reagent kit). The library was run using the MiniSeq High Output Reagent Kit (300-cycles) (Illumina, Cat. FC-420-1003) to obtain paired-end reads on the Illumina MiniSeq system, which can generate read lengths up to 2 × 150 bp and a data output of 6.6–7.5 Gb.

Bioinformatics analysis of NGS data

The raw sequencing reads were assessed with quality checks such as per-base sequence quality, per-sequence quality scores, GC content, and duplication levels, using the FastQC toolkit (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/). Based on the FastQC report, the reads were trimmed with the command-line utility Trimmomatic-0.39 to remove low-quality reads¹⁶. The parameters selected for trimming were LEADING:5, TRAILING:5, SLIDINGWINDOW:4:15, and MINLEN:36¹⁷.
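The effect of the four trimming parameters can be made concrete with a small sketch that operates on a list of per-base Phred scores. This is a simplified approximation of the behaviour those settings describe, written for illustration; it is not Trimmomatic's implementation, and the function name and return convention are assumptions.

```python
def trim_read(quals, leading=5, trailing=5, window=4, min_avg=15, minlen=36):
    """Return the (start, end) slice of bases to keep, or None if the read
    is dropped. Steps are applied in the order LEADING, TRAILING,
    SLIDINGWINDOW, MINLEN."""
    start, end = 0, len(quals)
    # LEADING: remove bases below the threshold from the 5' end.
    while start < end and quals[start] < leading:
        start += 1
    # TRAILING: remove bases below the threshold from the 3' end.
    while end > start and quals[end - 1] < trailing:
        end -= 1
    # SLIDINGWINDOW: scan 5' to 3' and cut at the first window whose
    # average quality falls below min_avg.
    for i in range(start, end - window + 1):
        if sum(quals[i:i + window]) / window < min_avg:
            end = i
            break
    # MINLEN: discard reads that are too short after trimming.
    if end - start < minlen:
        return None
    return start, end
```

For example, a read whose quality collapses over four consecutive bases is cut at the start of that window, and the remaining prefix is kept only if it is at least 36 bases long.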

The trimmed reads were mapped to the omicron variant sequence using the Bowtie2 aligner (v.2.4.5)¹⁸. Bowtie2 is based on an implementation of the Burrows-Wheeler transform and makes use of the Full-text Minute-size index (FM-Index) for rapid processing of a large number of sequencing reads¹⁹. The mapping produced a SAM file containing the reads and their genomic locations. The SAM file was converted into its compressed form, a BAM file, and the reads in the BAM file were sorted by aligned position in the genome using Samtools²⁰. To find aligned reads efficiently by genomic position, an index of the sorted BAM file was created as a BAI file²¹. PCR duplicates present in the reads were removed with the Picard tool (http://broadinstitute.github.io/picard/). The Integrative Genomics Viewer (IGV) software (v.2.8.10), a robust, user-friendly interactive tool for visualizing genomics data, was then used to visualize the mapped reads^{22 23}. For this, the omicron variant sequence and the sorted BAM file of the sample were imported into IGV. Finally, pair-wise alignment of the omicron variant sequence and the aligned consensus sequence of the sample obtained from IGV was performed using the Nucleotide Basic Local Alignment Search Tool (BLASTn) (http://www.ncbi.nlm.nih.gov/blast), and the identity level was checked.
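The mapping steps above correspond to a short chain of command-line calls. The sketch below only assembles those commands as argument lists so the order of operations is explicit; the file names, the `prefix` convention, and the location of `picard.jar` are hypothetical, and the source does not report the exact options the authors used. Running it requires Bowtie2, Samtools, Java, and Picard to be installed.

```python
import subprocess


def mapping_commands(reference_fasta, reads_1, reads_2, prefix,
                     picard_jar="picard.jar"):
    """Build the commands for index -> align -> BAM -> sort -> index -> dedup.
    All output file names are derived from `prefix` (an assumption)."""
    index = f"{prefix}_index"
    sam = f"{prefix}.sam"
    bam = f"{prefix}.bam"
    sorted_bam = f"{prefix}.sorted.bam"
    dedup_bam = f"{prefix}.dedup.bam"
    return [
        # Build the FM-index of the reference sequence.
        ["bowtie2-build", reference_fasta, index],
        # Align the paired-end reads and write a SAM file.
        ["bowtie2", "-x", index, "-1", reads_1, "-2", reads_2, "-S", sam],
        # Convert SAM to BAM, sort by coordinate, and index (creates .bai).
        ["samtools", "view", "-b", "-o", bam, sam],
        ["samtools", "sort", "-o", sorted_bam, bam],
        ["samtools", "index", sorted_bam],
        # Remove PCR duplicates with Picard MarkDuplicates.
        ["java", "-jar", picard_jar, "MarkDuplicates",
         f"I={sorted_bam}", f"O={dedup_bam}",
         f"M={prefix}.dup_metrics.txt", "REMOVE_DUPLICATES=true"],
    ]


def run_pipeline(commands):
    """Execute the commands in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

Keeping the commands as data makes it easy to log or review them before anything is executed.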

Fluorescence-activated cell sorting

HEK 293T cells (ATCC-CRL-3216) at the density of 2.5 x 10⁵ cells /mL were seeded in a 6-well plate (Costar®, Cat. 3516) and were allowed to attach to the surface overnight. All the cells were cultured in a humidified incubator (ThermoFisher Scientific, Cat. 40878550, Herd Cell) set at 37°C with 5% CO₂. The next day, mRNA (10 µg) was diluted in Opti-MEM and lipofectamine 3000 in a ratio of (1:5) and incubated for 15 min. Post-incubation, the complex mixture was added to the cells. The media was changed after 24 h and cells were processed for FACS post 48 h of transfection²⁴.

HEK 293T cells were detached from the surface of the 6-well plate using cell dissociation buffer (Gibco™, Cat. 13151-014), and both the cells and media were collected by centrifugation at 6000 rpm at 4°C for 10 min. The cells were washed thrice with FACS buffer (ice-chilled PBS with 5% MACS BSA; Miltenyi Biotec, Cat. 130-091-376) and blocked with Fc blocker (4.5 µl per sample) (BD Pharmingen, Cat. 564219) for 30 min on ice. Fluorophore-conjugated anti-spike IgG (1:100) (A700) (R&D Systems, Cat. FAB105403N) was added to the cells and incubated for 45–60 min. Post-incubation, the cells were washed three times with FACS buffer for 5 minutes and finally analyzed by FACS (BD FACSLyric™). The assay was performed twice to confirm the data, and the analysis was done using FlowJo (v10) single-cell analysis software.

Database source and antigen designing

As shown in Fig. 1, the development of the mRNA started with the analysis of the metadata of SARS-CoV-2 from infected humans. The key mutations identified in the spike region of the Omicron variant (B.1.1.529) of SARS-CoV-2 using an in-house developed mathematical model are shown in Fig. 3.

For the gene assembly, the oligonucleotide concentrations and annealing temperature for the synthesis of sub-fragments and fragments were optimized. We identified the best-suited oligonucleotide concentration and the optimum annealing temperature based on the amplification yields for the A1 and A2 sub-fragments (Fig. 4). We also identified the annealing temperature for the final assembly of fragment A (Fig. 5a). The optimized conditions for A1 and A2 were later used for the amplification of all the sub-fragments of B, C, and D (Figs. 5b, 5c, 5d, and 5e). Similarly, the optimized conditions for fragment A were applied to the final assembly of fragments B, C, D, AB, and CD (Figs. 5f and 5g). Of the three concentration combinations of inner and outer oligonucleotides tested, the one that worked best was 200 nM (inner) and 400 nM (outer). The best annealing temperature was 70ºC, at which we synthesized the A1 and A2 sub-fragments, of length 390 base pairs and 610 base pairs respectively, with minimal non-specific banding; non-specific bands were observed at all the other tested temperatures. Besides the synthesis of sub-fragments of approx. 500–600 base pairs, the selected annealing temperature (70ºC) also worked for Splice by Overlap Extension (SOE)-PCR, which was used to assemble larger fragments of approximately 1000 base pairs, as observed for the assembly of A1 and A2 into fragment A (852 base pairs). The same oligonucleotide concentration and annealing temperature were then used to make sub-fragments B1 (380 bp) and B2 (379 bp); C1 (416 bp), C2 (388 bp), C3 (416 bp), and C4 (444 bp); and D1 (314 bp) and D2 (314 bp), and thereafter to assemble these sub-fragments by one-step SOE-PCR into fragments B (759 bp) and D (628 bp) (Fig. 5). Fragment C (1664 bp) was assembled by 2-step SOE-PCR, first making the C1 + C2 fragment (804 bp) and the C3 + C4 fragment (860 bp) and then joining them. According to the strategy employed, we further assembled the A and B fragments into AB (1611 bp) and the C and D fragments into CD (2292 bp).
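The reported fragment lengths can be cross-checked with simple addition. The sketch below is bookkeeping only, not part of the authors' method: it takes fragment A at its reported 852 bp, sums the reported sub-fragment lengths exactly as given in the text, and does not model the overlaps between neighbours. The dictionary and function names are hypothetical.

```python
# Lengths in base pairs as reported in the text.
REPORTED_BP = {
    "A": 852,
    "B1": 380, "B2": 379,
    "C1": 416, "C2": 388, "C3": 416, "C4": 444,
    "D1": 314, "D2": 314,
}


def assembled_lengths(bp=REPORTED_BP):
    """Sum the reported lengths along the hierarchical assembly order."""
    out = {"A": bp["A"]}
    out["B"] = bp["B1"] + bp["B2"]
    out["C1+C2"] = bp["C1"] + bp["C2"]
    out["C3+C4"] = bp["C3"] + bp["C4"]
    out["C"] = out["C1+C2"] + out["C3+C4"]
    out["D"] = bp["D1"] + bp["D2"]
    out["AB"] = out["A"] + out["B"]
    out["CD"] = out["C"] + out["D"]
    out["AD"] = out["AB"] + out["CD"]
    return out


if __name__ == "__main__":
    for name, length in assembled_lengths().items():
        print(f"{name}: {length} bp")
```

The sums reproduce the stated values for B, C, D, AB, and CD, and give 3903 bp for the full ApaI to SacII insert, in line with the "approximately 4000 base pairs" quoted throughout.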

Cloning and Validation of the recombinant construct

Three-fragment ligation was carried out to assemble fragment AB digested with ApaI and BbvCI, fragment CD digested with BbvCI and SacII, and the pVEE 104a.1 plasmid digested with ApaI and SacII (Figs. 5h and 5i). These enzyme sites were chosen to generate cohesive ends for directional cloning. The ligated products were electroporated into E. coli DH5α cells, which gave 2x10⁴ transformants per µg of ligated DNA when all three fragments were ligated together and no transformants for the plasmid vector alone, AB - CD alone, AB – plasmid DNA alone, or CD – plasmid DNA alone. Recombinant clones containing the AB – CD fragment cloned into the ApaI and SacII sites of the plasmid DNA were named pVEE 101a.1. These recombinant clones were checked by colony PCR, using plasmid backbone-specific primers, for the presence of the AD fragment. We tested 12 recombinant clones (C1-C12), and all of them showed bands of the expected size of approximately 4000 base pairs on agarose gel electrophoresis (Fig. 6a). These positive clones were further checked for the release of the AD fragment of approximately 4000 base pairs by restriction digestion with ApaI and SacII. All 12 clones (C1-C12) released a band of the desired size on the agarose gel (Fig. 6b). The DNA sequence of all the screening-positive clones from ApaI – SacII was checked by Sanger DNA sequencing using gene-specific sequencing primers. The sequencing data of each clone were assembled and checked for sequence identity with the plasmid DNA sequence using MacVector software with assembler [18.2.5(43)]. Clone C6 showed complete identity with the reference sequence and was selected for future work.

Plasmid and mRNA preparation

For mRNA generation by in vitro transcription (IVT), we prepared large-scale plasmid DNA of clone C6 of pVEE 101a.1, which was later linearized with the NotI enzyme (Fig. 7a). This linearization was essential for the IVT process, in which T7 RNA polymerase transcribes the DNA into mRNA. Because the presence of genomic DNA may affect the mRNA yield of the in-vitro transcription process, RT-PCR was performed to detect genomic DNA contamination in the pVEE 101c.1 plasmid DNA preparation. The results met our specification of less than 50 pg of genomic DNA in 1 mg of pVEE 101c.1 plasmid DNA (Figs. 7b and 7c). In-vitro transcription followed by capping and purification generated 0.9 mg/mL of mRNA. When loaded on the agarose gel, the mRNA sample showed a clear band at the desired position of approximately 11700 bases (Fig. 7d). Because the presence of plasmid DNA in the mRNA may affect downstream applications, we quantitated the plasmid DNA contamination in the mRNA by RT-PCR using plasmid backbone-specific oligonucleotides. Our results met the specification of less than 10 pg of plasmid DNA in 1 mg of mRNA (Figs. 7e and 7f). We also checked the mRNA profile by capillary electrophoresis on the Agilent 5200 Fragment Analyzer. These data showed that the mRNA was intact, with a peak at the anticipated size (Fig. 7g). The mRNA sequence of the entire cassette, including the poly-A tail, was checked by next-generation sequencing on the Illumina MiniSeq. The fragment library for the NGS run was analyzed by capillary electrophoresis on the Agilent 5200 Fragment Analyzer. The data showed a fragment distribution from approximately 200–400 base pairs, which is compatible with Illumina paired-end sequencing (Fig. 8a). The NGS data retrieved from the MiniSeq system showed a yield of 10.5 Gb with an optimum cluster density of 282K/mm² and a percentage of clusters passing filter of 79.7%, which matches the specifications of Illumina (Fig. 8b). The Q30 passed data was 88.75%, which indicates a good-quality, reliable dataset for further analysis. The cluster density and data output were slightly higher than the specifications in the manufacturer protocol, which suggest a cluster density of 160-220K/mm² and 7.5 Gb of output data.

mRNA sequence analysis by NGS

According to the FastQC report, low-quality reads and PCR duplicates were present in the sample; these were processed with Trimmomatic and Picard, respectively. After the processed reads were mapped to the omicron variant sequence using Bowtie2, the mapped reads were visualized with IGV. Pair-wise alignment of the consensus sequence obtained from IGV against the omicron variant sequence using BLASTn gave a query coverage of 99%.

In-vitro expression and quantitation by FACS

The mRNA encoding the spike protein was transfected into HEK 293T cells and the expression was monitored by FACS assay (Fig. 9A). Analysis of the data showed that 60% of total HEK 293T cells were positive for anti-spike IgG staining as compared to the non-transfected cells (Fig. 9B). Hence, the prepared mRNA successfully encoded the spike protein, and the mRNA construct prepared through gene assembly was functional.

The work proposed here, the development of an mRNA vaccine from oligonucleotide-based gene assembly by PCR, has proven its strength and efficacy through successful cloning and mRNA synthesis followed by in-vitro expression studies, as confirmed and validated by various analytical techniques. This work allows us to compare and contrast the advantages and disadvantages of this gene synthesis and cloning method with the traditional ones²⁵ and to select the one best suited for the desired purpose. We commenced mRNA vaccine development for COVID-19 in late 2019 because of the severe infectivity of the virus, which led to a worldwide pandemic and a global lockdown to curtail the spread of SARS-CoV-2, the causative agent of the disease²⁶. Because the starting material for the development of a protein or nucleic acid subunit vaccine is the DNA encoding the antigenic protein sequence, designing and cloning the correct gene cassette along with its optimal regulatory elements is essential for generating efficacious vaccines²⁷. This demand is met today by many CROs, which offer gene synthesis and cloning services as per the user requirement for many downstream applications. This being their primary responsibility, they have gained immense expertise in developing high-throughput pipelines for gene design and synthesis to support the global requirement, using algorithms to generate the most translatable codons for the host²⁸. These methods of molecular cloning are undoubtedly very robust, with well-assured results, but they have certain disadvantages, such as a long timeline of 4–6 weeks for gene synthesis and delivery, followed by the rest of the work conducted by the industry, which adds another 4–6 weeks until the final confirmation of the recombinant construct. Apart from this, the sharing of confidential sequences that form a part of IP, biosafety concerns, high cost, and other regulatory requirements together sometimes obstruct the development process in the industry, especially during a pandemic or endemic occurrence that disturbs the balance between demand and supply.

During the SARS-CoV-2 pandemic, it was a challenge for us to target the rapidly emerging variants of this virus, with their constantly rising infectivity and death rate. As pioneers of mRNA-based vaccine development in India, and knowing how easily an mRNA vaccine candidate can be modified, we developed a need-based, rapid, and robust pipeline that starts from downloading the genome sequences of virus from patients infected with the SARS-CoV-2 Omicron variant and proceeds through data analysis, design of the gene encoding the antigenic protein, cloning, cell banking, plasmid and mRNA preparation, and finally functional expression of the antigenic protein (Fig. 1). Owing to the short timeline of this process, we were able to develop a SARS-CoV-2 Omicron variant-specific mRNA vaccine candidate within 3–4 weeks of disease onset. The pipeline has been validated by generating the spike surface protein encoded by a long gene cassette of approximately 4000 base pairs using a PCR-based assembly method. In the PCR method described in our study, the gene cassette was divided into four major fragments, each fragment was assembled sequentially from oligonucleotides, and the full-length gene encoding the spike protein of the SARS-CoV-2 Omicron variant (B.1.1.529) was finally assembled and cloned by three-fragment ligation into the desired vector DNA. We currently operate on a self-amplifying mRNA platform, in which the plasmid is approximately 14000 base pairs in size, including the 4000 base pair spike gene and 7000 base pairs encoding the non-structural protein elements for replication, and the mRNA is approximately 11700 nucleotides after transcription; the large size of both the DNA and the RNA makes the work challenging. 
We used the vaccinia capping enzyme to add Cap-0 at the 5’ end of the in-vitro transcribed mRNA. The mRNA also carries a poly-A sequence at the 3’ end that is encoded by the plasmid DNA itself, eliminating the need for a separate poly-A addition step with poly-A polymerase. To limit the failure of any process or in-process product, we built critical quality checks into the pipeline after each major step to assess the results before final validation. Next-generation sequencing of the mRNA cassette, Sanger DNA sequencing, and RT-PCR were the primary techniques used to confirm the correctness of the sequence from the 5’ UTR to the poly-A tail, and RT-PCR was also used to confirm the absence of plasmid DNA.
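The oligonucleotide-based assembly described above rests on tiling a target sequence with overlapping oligos on alternating strands. The sketch below illustrates that general idea only; it is not the design tool or the parameters used in this study. The function names, the 60-base oligo length, and the 20-base overlap are assumptions, and real designs also balance melting temperatures and avoid secondary structure.

```python
def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq))

def design_oligos(seq, oligo_len=60, overlap=20):
    """Tile seq with oligos that share at least `overlap` bases with neighbours.

    Returns (start, strand, oligo) tuples. Even-numbered oligos are sense
    ("+"), odd-numbered ones antisense ("-"). The final oligo is anchored to
    the 3' end, so it may overlap its neighbour by more than `overlap`.
    """
    if not 0 < overlap < oligo_len:
        raise ValueError("overlap must be positive and shorter than oligo_len")
    step = oligo_len - overlap
    last_start = max(0, len(seq) - oligo_len)
    starts = list(range(0, last_start, step)) + [last_start]
    oligos = []
    for index, start in enumerate(starts):
        piece = seq[start:start + oligo_len]
        strand = "+" if index % 2 == 0 else "-"
        oligo = piece if strand == "+" else reverse_complement(piece)
        oligos.append((start, strand, oligo))
    return oligos

def reassemble(oligos):
    """Rebuild the sense strand from the tiles to check full coverage."""
    total = max(start + len(oligo) for start, _, oligo in oligos)
    bases = [None] * total
    for start, strand, oligo in oligos:
        sense = oligo if strand == "+" else reverse_complement(oligo)
        for offset, base in enumerate(sense):
            bases[start + offset] = base
    return "".join(bases)

# Made-up 206-base sequence, for illustration only.
target = "ATGGCTAGCTTAGGCATCGA" * 10 + "TTGACC"
tiles = design_oligos(target)
print(len(tiles), reassemble(tiles) == target)
```

Checking that the tiles rebuild the target in silico is a cheap safeguard before ordering oligos, in the same spirit as the in-process quality checks described above.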

The efficient expression of this large spike protein, as detected by FACS, indicates that the spike protein was displayed on the surface of the cells. Injection of this mRNA construct into a mouse model might induce an immunogenic response, and hence the prepared construct has the potential to be a successful vaccine candidate.

To conclude, in this study we propose a robust, amenable, and cost-effective pipeline for the rapid development of mRNA as a vaccine candidate, spanning gene design, cloning, and establishing the potency of the expressed protein in an in-vitro system. The seamless and cost-effective approach of cloning a construct into the mRNA platform and assessing its immunogenic response can act as a platform for gene design and vaccine development for any disease.

ACKNOWLEDGEMENT

We want to thank the Department of Biotechnology of India (BT/COVID0041/01/20) for partial financial support.

AUTHOR CONTRIBUTIONS

GS, RK, SS, SP, and AS designed the experiments. GS, RK, SS, and SP, performed the experiments. GS, RK, SS, SP, RW, AS, and SSingh analyzed the data and all the authors participated in manuscript preparation.

COMPETING INTERESTS

The authors declare no competing interests.

DATA AVAILABILITY

The datasets generated and/or analyzed during the current study are available in the DNA Data Bank of Japan (DDBJ) repository under accession number LC731729. The original versions of Figures 5b, 5c, 5d, 5e, and 5f, representing the ethidium bromide-stained agarose gels, are shown in Supplementary Figures 1–5. All data presented in the study are included in the manuscript as figures, tables, and supplementary material. The unpublished data of the study are available on request from the corresponding author, Ajay Singh, Ph.D. ([email protected] and [email protected]).

Evens, R. & Kaitin, K. The evolution of biotechnology and its impact on health care. Health Aff. 34, (2015).
Xiong, A.-S. et al. A simple, rapid, high-fidelity and cost-effective PCR-based two-step DNA synthesis method for long gene sequences. Nucleic Acids Res. 32, e98 (2004).
Rouillard, J.-M. et al. Gene2Oligo: oligonucleotide design for in vitro gene synthesis. Nucleic Acids Res. 32, W176-80 (2004).
Shu, Y. & McCauley, J. GISAID: Global initiative on sharing all influenza data - from vision to reality. Euro Surveill. 22, (2017).
Sequeira, A. F., Brás, J. L. A., Guerreiro, C. I. P. D., Vincentelli, R. & Fontes, C. M. G. A. Development of a gene synthesis platform for the efficient large scale production of small genes encoding animal toxins. BMC Biotechnol. 16, 86 (2016).
Lundberg, L., Carey, B. & Kehn-Hall, K. Venezuelan Equine Encephalitis Virus Capsid-The Clever Caper. Viruses 9, (2017).
Li, C. et al. mRNA Capping by Venezuelan Equine Encephalitis Virus nsP1: Functional Characterization and Implications for Antiviral Research. J. Virol. 89, 8292–8303 (2015).
Kim, D. Y., Atasheva, S., Frolova, E. I. & Frolov, I. Venezuelan equine encephalitis virus nsP2 protein regulates packaging of the viral genome into infectious virions. J. Virol. 87, 4202–4213 (2013).
Amaya, M. et al. Venezuelan equine encephalitis virus non-structural protein 3 (nsP3) interacts with RNA helicases DDX1 and DDX3 in infected cells. Antiviral Res. 131, 49–60 (2016).
Singer, V. L., Jones, L. J., Yue, S. T. & Haugland, R. P. Characterization of PicoGreen reagent and development of a fluorescence-based solution assay for double-stranded DNA quantitation. Anal. Biochem. 249, 228–238 (1997).
Vilalta, A., Whitlow, V. & Martin, T. Real-time PCR determination of Escherichia coli genomic DNA contamination in plasmid preparations. Anal. Biochem. 301, 151–153 (2002).
Shuman, S. Catalytic activity of vaccinia mRNA capping enzyme subunits coexpressed in Escherichia coli. J. Biol. Chem. 265, 11960–11966 (1990).
Sambrook, J. & Russell, D. W. Molecular Cloning: A Laboratory Manual. Vol. 1 (Cold Spring Harbor Laboratory Press, New York, 2001).
Sambrook, J. & Russell, D. W. Molecular Cloning: A Laboratory Manual.
Jones, L. J., Yue, S. T., Cheung, C. Y. & Singer, V. L. RNA quantitation by fluorescence-based solution assay: RiboGreen reagent characterization. Anal. Biochem. 265, 368–374 (1998).
Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120 (2014).
Neuparth, T. et al. Transcriptomic data on the transgenerational exposure of the keystone amphipod Gammarus locusta to simvastatin. Data Br. 32, 106248 (2020).
Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).
Hanussek, M., Bartusch, F. & Krüger, J. Performance and scaling behavior of bioinformatic applications in virtualization environments to create awareness for the efficient use of compute resources. PLoS Comput. Biol. 17, e1009244 (2021).
Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
Veneman, W. J. et al. Analysis of RNAseq datasets from a comparative infectious disease zebrafish model using GeneTiles bioinformatics. Immunogenetics 67, 135–147 (2015).
Thorvaldsdóttir, H., Robinson, J. T. & Mesirov, J. P. Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration. Brief. Bioinform. 14, 178–192 (2013).
Wu, Z.-W. et al. Global 3’-untranslated region landscape mediated by alternative polyadenylation during meiotic maturation of pig oocytes. Reprod. Domest. Anim. 57, 33–44 (2022).
Avci-Adali, M. et al. Optimized conditions for successful transfection of human endothelial cells with in vitro synthesized and modified mRNA for induction of protein expression. J. Biol. Eng. 8, 8 (2014).
Hughes, R., Miklos, A. & Ellington, A. Gene Synthesis. (2022).
Andrews, M. et al. First confirmed case of COVID-19 infection in India: A case report. (2022).
Bramwell, V. & Perrie, Y. The rational design of vaccines. (2022).
Quax, T., Claassens, N., Söll, D. & van der Oost, J. Codon Bias as a Means to Fine-Tune Gene Expression. (2022).
