Database source and designing
The complete global metadata of 109 SARS-CoV-2 infected patients from September to December 2021 was downloaded from the database, GISAID. The aforesaid duration positions B.1.1.529 of SARS-CoV-2 a Variant of Concern (VOC) by WHO. Among the whole genomic data of the virus, the mutations occurring only in the spike region were extracted by using an in-house developed mathematical algorithm (unpublished data). Further, the sequences were analyzed for their mutations in the spike encoding region of the virus, and all the mutations occurring with a frequency of > 50% were marked in the original spike sequence of the virus (IVDC-HB-01/2019). The consensus sequence representing all the key point mutations, additions, and deletions of amino acids was reverse translated to incorporate all the preferred codons and their harmonization as per the human’s codon usage. The designed sequence has been submitted to DDBJ database, accession numbers - LC731729. During the designing of the codons, the ApaI, SacII, and NotI restriction sites were ensured to be non-cutters as the cloning was done using ApaI and SacII restriction sites into the self-amplifying mRNA platform, and the NotI site was used for the plasmid linearization before proceeding for the in-vitro transcription.
Antigen design
For the assembly of the entire coding frame of the spike protein, 158 overlapping oligonucleotides (Supplementary Table 1) were designed ranging from 60–120 bases in length with approximately 20–30 bases overlapping between each forward and reverse primer. Further, if repeat sequences were found, they were adjusted in the middle or towards the 3’ end of the oligonucleotides with overall GC content for each oligonucleotide between 45–60%. The design also took into consideration melting temperature and specificity to ensure uniform hybridization at the correct target location3. All these oligonucleotides were synthesized from Sigma Aldrich at a 1 µM scale with PAGE purification. The oligonucleotides were reconstituted in 1x TE buffer (Invitrogen, Cat. T11493) to obtain a final concentration of 100 µM and stored at -20°C until use.
Gene assembly
The detailed strategy for gene assembly and cloning is shown in Fig. 2. The long oligonucleotides were assembled by PCR using Advantage® 2 DNA polymerase mix (Takara, Cat. 639206) which is an optimized blend to obtain high fidelity and greater yield of amplified fragments of varying sizes, especially long templates up to 18 kilo-base pairs. The PCR parameters were optimized for obtaining the correct size of the sub-fragments ranging from 400–500 base pairs with minimal PCR amplification cycles to ensure the correctness of the coding sequences5. These 400–500 base pair sub-fragments were amplified in a way that they will be having overlap with their next sub-fragment. For the assembly of the coding frame from ApaI – spike – SacII of approximately 4000 base pairs, small fragments were marked based on the availability of unique restriction sites that were incorporated during the gene design. As shown in Fig. 2, we marked fragment A from ApaI to SgrAI restriction sites, fragment B from SgrAI to BbvCI restriction sites, fragment C from BbvCI to SnaBI restriction sites, and finally fragment D from SnaBI to SacII restriction sites. The size of each fragment from A to D was ranging from 800–1400 base pairs, however, these fragments were further split into sub-fragments to ensure the minimal amplification cycles and lesser number of oligonucleotides required for their assembly in a single reaction. Fragment A was subdivided into A1 and A2; fragment B was subdivided into B1 and B2; fragment C was subdivided into C1, C2, C3, and C4; and fragment D was subdivided into D1 and D2. The boundaries of each of these fragments and sub-fragments were labeled in a way that the amplified fragments will be having 25–30 base pair overlap with their next-in-line designed fragment and sub-fragments. Internal restriction sites were designed and chosen to ensure a backup strategy in case of any difficulty in assembling the full-length gene of approximately 4000 base pairs. It is important to remember that these sequences belong to a viral genome that is not having a uniform distribution of the nucleotides and may not be able to assemble properly. First, we amplified fragments A1, A2, B1, B2, C1, C2, C3, C4, D1, and D2; later A1-A2 were joined using the end oligonucleotides; similarly, B1-B2 were joined using their end oligonucleotides; C1-C2 and C3-C4 were joined using their end oligonucleotides, and D1-D2 were also joined using their end oligonucleotides. In a later reaction, C1-C2 and C3-C4 were also joined using their end oligonucleotides before the final assembly of all these fragments. Final PCR assembly resulted in two fragments ApaI – AB – BbvCI and BbvCI – CD – SacII.
Cloning
PCR-based assembly is highly impacted by multiple parameters such as choice of DNA polymerase enzyme, oligonucleotide size, oligonucleotide concentration, annealing temperature, and the total number of PCR cycles required for amplification5. Here we optimized the oligonucleotide concentration and annealing temperatures to ensure the correctness of the synthesized frame for fragment A. The finalized conditions from fragment A were then reciprocated for all the other fragments and sub-fragments. For the synthesis of the sub-fragment A1 and A2, the PCR was set up, containing 200µM of each dNTPs (Roche, Cat. 11969064001), 50x Advantage® 2 DNA polymerase mix (Takara, Cat. 639206) in 10x Advantage® 2 buffer (Takara, Cat. 639206). Three different concentration combinations of inner oligonucleotides (20 nM, 50 nM, and 200 nM) and outer oligonucleotides (100 nM, 100 nM, and 400 nM) were tested at different annealing conditions in the PCR cycle. PCR was set up in Bio-Rad Tetrad thermal cycler (398BR81) with cycle conditions as initial denaturation at 95ºC for 4 minutes; followed by 16 cycles at 95ºC for 45 seconds; annealing at 59ºC, 62ºC, 64ºC, 66ºC, 68ºC, 70ºC and 72ºC for 1 minute; extension at 68ºC for 1 minute, and final extension at 68ºC for 7 minutes 30 seconds. The optimized conditions from this section were later used for the amplification of all the sub-fragments of B, C, and D.
Further for the synthesis of fragment A, 10 µl PCR volume of A1 and A2 sub-fragments were used as a template and the amplification was performed using single forward and reverse oligonucleotide at 200nM concentration as shown in Fig. 2. The PCR cycle conditions were, initial denaturation at 95ºC for 4 minutes; followed by 28 cycles at 95ºC for 45 seconds; annealing at 59ºC, 62ºC, 64ºC, 66ºC, 68ºC, 70ºC, and 72ºC for 1 minute; extension at 68ºC for 1 minute; and final extension at 68ºC for 7 minutes 30 seconds. The optimized conditions from this section were later used for the final amplification of all the other fragments. During all these reactions, PCR products of fragments A, B, C, and D were purified using QIAquick PCR Purification Kit (Qiagen, Cat. 28104.), and concentration was determined on NanoDrop™ One/OneC Microvolume UV-Vis Spectrophotometer (ThermoFisher Scientific, Cat. ND-ONE-W).
For the final assembly, purified PCR DNA fragment AB was subjected to restriction digestion using ApaI enzyme (NEB, Cat. R0114S) and BbvCI enzyme (NEB, Cat. R0601S) and the CD fragment was subjected to restriction digestion using, BbvCI enzyme (NEB, Cat. R0601S) and SacII enzyme (NEB, Cat. R01578). pVEE 104a.1 used for cloning as a vector DNA, it is 11720 base pairs in length encoding one of the malarial antigenic proteins and also having 4 “Non-structural proteins” (nsp1-4) from Venezuelan Encephalitis alpha-virus6 7 8 9. The construct also has the following components that include; UTRs which denote the untranslated regions helping in efficient translation function, nptII gene encoding the kanamycin selection marker, and pMB1 plasmid origin. Ligation of 3 digested fragments AB, CD, and pVEE 104a.1 was performed using the T4 DNA ligase enzyme (Roche, Cat. 10481220001). Transformation of the ligated DNA product was done in electro-competent E. coli DH5α cells as per the manufacturer protocol; Electromax (ThermoFisher Scientific, Cat. 11319019) and recombinants were selected on the LB agar plate containing 50 µg/mL kanamycin selection antibiotic. For positive control, pUC19 plasmid and negative control digested vector, AB, and CD fragment alone were used.
Recombinant clone screening was performed by colony PCR using plasmid backbone-specific oligonucleotides in the presence of AmpliTaq Gold DNA polymerase (Applied Biosystem™, Cat. 4311806). Colony PCR-positive clones were inoculated into the LB medium containing 50 µg/mL kanamycin selection antibiotic. Later, plasmid DNA was isolated by the Miniprep method (PureYield™, Promega miniprep system, Cat. A1223). The plasmid DNA was sequenced on the Sanger Automated sequencing platform (Applied Bio-systems 3500xL) using BigDye™ Terminator v3.1 Cycle Sequencing Kit (ThermoFisher Scientific, Cat no. 4337455). The sequencing data analysis was performed on the MacVector software with an assembler [18.2.5(43)]. The sequencing primers are mentioned in Supplementary Table 2.
Plasmid preparation
The circular plasmid DNA was linearized using the NotI restriction enzyme (NEB, Cat. R0189S) and purified by QIAquick PCR Purification Kit (Qiagen, Cat. 28104.). The purified plasmid DNA was quantified using the Quant-iT™ PicoGreen™ dsDNA Assay Kits and dsDNA Reagents (ThermoFisher Scientific, Cat. P7589) 10. Further, real-time quantitative PCR (RT-PCR) was employed to detect the presence of any residual E. coli genomic DNA in the plasmid preparation 11. Briefly, we designed forward (5’-AAGCTGCCTGCACTAATGTTCC-3’) and reverse (5’-TCGCGTACCGTCTTCATGG-3’) primers for the amplification an amplicon of the E. coli genome. RT-PCR was performed using SsoFast™ EvaGreen® Supermix (Biorad, Cat. 1725201). Genomic DNA of E. coli DH5α isolated using GenElute™ Bacterial Genomic DNA Kits (Sigma-Aldrich: Cat no. NA 2110) was used as a reference in the concentration of 50.0 nanograms to 5.0 femtograms. The PCR was performed on the CFX96 Touch Real-Time-PCR at 98ºC for 2 minutes followed by 39 cycles at 98ºC for 5 seconds, 61.5ºC for 5 seconds, one cycle at 95ºC for 10 seconds, and a melting curve at 65ºC to 95ºC.
mRNA preparation
For mRNA preparation, in-vitro transcription was carried out using HiScribe™ T7 High Yield RNA Synthesis Kit (NEB, Cat. E2040S). The in-vitro transcribed mRNA was treated with DNase I enzyme (ThermoFisher Scientific, cat. no. EN0525) for degrading the plasmid DNA and later protected by 5’ capping using the Vaccinia Capping System (NEB, Cat. M2080S)12. The capped mRNA was purified using LiCl precipitation13. Purified mRNA was quantified by Quant-it™ RiboGreen RNA Assay Kit and RiboGreen RNA Reagent (ThermoFisher Scientific, Cat. R11490)14 15.mRNA integrity was confirmed by denaturing agarose gel electrophoresis and capillary electrophoresis on Agilent 5200 Fragment analyzer with a compatible reagent kit (Agilent, Cat. DNF-471 RNA kit_15nt). The purified mRNA was analyzed for the presence of plasmid DNA by RT-PCR. For this, we designed forward (5’-AACGGCTCGTAACATAGG-3’) and reverse (5’-TGGTCGAGCCAACAGAG-3’) oligonucleotide from the nsp 1 region of the pVEE101c.1 plasmid. RT-PCR reaction was performed using SsoFast™ EvaGreen® Supermix (Biorad, Cat. 1725201) on the CFX96 Touch Real-Time-PCR at 98ºC for 2 minutes followed by 39 cycles at 98ºC for 5 seconds, and 53.5ºC for 5 seconds, one cycle at 95ºC for 10 seconds and melting curve at 65ºC to 95ºC. Nucleotide sequence and Poly-A tail length of mRNA were later confirmed through sequencing by synthesis chemistry on the Illumia Miniseq platform. Briefly, mRNA was quantified using Qubit™ RNA High Sensitivity (HS), Broad Range (BR), and Extended Range (XR) Assay Kits (ThermoFisher Scientific, Cat. Q10210) on Qubit 4 Fluorometer (ThermoFisher Scientific). TruSeq® Stranded mRNA Library Prep (48 Samples) was used for the mRNA library preparation (Illumina, Cat. 20020594) along with the TruSeq RNA CD Index Plate (96 Indexes, 96 Samples) (Illumina, Cat. 20019792). The library was quantitated using Qubit™ 1X dsDNA High Sensitivity (HS) and Broad Range (BR) Assay Kits (ThermoFisher Scientific, Cat. Q33230) and library sizing was checked on Agilent 5200 Fragment analyzer system with its corresponding DNA kit (Agilent, Cat. DNF-935 reagent kit). This library was run using MiniSeq High Output Reagent Kit (300-cycles) (Illumina, Cat. FC-420-1003) to ensure the paired-end reads on the Illumina Miniseq system, which can generate the read length up to 2× 150 bp and data output of 6.6–7.5 Gb.
Bioinformatics analysis of NGS data
The raw sequencing reads were assessed for performing quality checks such as per base sequence quality, per sequence quality scores, GC content, duplication levels, etc. using the FASTQc toolkit (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/). Based on the FASTQc report, the reads were trimmed using a command line utility Trimmomatic-0.39 to remove the low-quality reads16. The parameters selected for trimming were LEADING:5, TRAILING:5, SLIDINGWINDOW:4:15, and MINLEN:3617.
The trimmed reads were mapped with omicron variant sequence using Bowtie2 aligner (v.2.4.5)18. Bowtie2 is based on the implementation of the Burrows-Wheeler-transform and makes use of the Full-text Minute-size index (FM-Index) for rapid processing of a large number of sequencing reads19. As a result of the mapping, a SAM file was generated by Bowtie2 that contains reads and their genomic location. The SAM file was converted into the compressed form i.e. BAM file and the reads in the BAM file were sorted based on their aligned read position in the genome using Samtools20. To find the aligned reads efficiently based on a position in the genome, the index of the sorted BAM file was created as a BAI file21. The PCR duplicates present in the reads were removed by employing the Picard tool (http://broadinstitute.github.io/picard/). After that, an Integrated Genomics Viewer (IGV) software (v.2.8.10), a robust, user-friendly interactive tool for visualizing genomics data was used to visualize the mapped reads22 23. For that, the omicron variant sequence and sorted BAM file of the sample were imported into the IGV. Finally, the pair-wise alignment of the omicron variant sequence and aligned consensus sequence of the sample obtained from IGV was performed using the Nucleotide Basic Local Alignment Search Tool (BLASTn) (http://www.ncbi.nlm.nih.gov/blast) and the identity level was checked.
Fluorescence-activated cell sorting
HEK 293T cells (ATCC-CRL-3216) at the density of 2.5 x 105 cells /mL were seeded in a 6-well plate (Costar®, Cat. 3516) and were allowed to attach to the surface overnight. All the cells were cultured in a humidified incubator (ThermoFisher Scientific, Cat. 40878550, Herd Cell) set at 37°C with 5% CO2. The next day, mRNA (10 µg) was diluted in Opti-MEM and lipofectamine 3000 in a ratio of (1:5) and incubated for 15 min. Post-incubation, the complex mixture was added to the cells. The media was changed after 24 h and cells were processed for FACS post 48 h of transfection24.
HEK 293T cells were detached from the surface of a 6-well plate using cell dissociation buffer (Gibco™, Cat. 13151-014), and both the cells and media were collected by centrifugation at 6000 rpm at 4°C for 10 min. The cells were washed thrice with ice-chilled PBS with 5% MACS BSA (Miltenyl Biotech, Cat. 130-091-376, FACS buffer) and blocked with Fc- blocker (4.5 µl, each sample) (BD, Cat. 564219, Pharmingen) for 30 min on ice. Fluorophore–conjugated anti-spike IgG (1:100) (A700) (R&D Biosystem, Cat. FAB105403N) was added to the cells and incubated for 45–60 min. Post-incubation, washing was done three times with FACS-buffer for 5 minutes and finally, the cells were analyzed using FACS (BD FACS Lyric™). The assay was performed twice for assurance of the data and the analysis was done using the FLOWJO (v10) single cell analysis software.