Study participants
Patients were recruited from the John Radcliffe Hospital in Oxford, UK, between March 2020 and April 2021 by identification of patients hospitalised during the SARS-COV-2 pandemic. Patients were recruited into the Sepsis Immunomics study and had samples collected during their convalescence as well as during acute disease. Patients were sampled at least 28 days from the start of their symptoms. Written informed consent was obtained from all patients. Ethical approval was given by the South Central-Oxford C Research Ethics Committee in England (Ref 13/SC/0149).
Clinical definitions
All patients were confirmed to have a positive test for SARS-CoV-2 using reverse transcriptase polymerase chain reaction (RT-PCR) from an upper respiratory tract (nose/throat) swab tested in accredited laboratories. The degree of severity was identified as mild, severe or critical infection according to recommendations from the World Health Organisation. Patients were classified as ‘mild’ if they did not require oxygen (that is, their oxygen saturations were greater than 93% on ambient air) or if their symptoms were managed at home. A large proportion of our mild cases were admitted to hospital for public health reasons during the early phase of the pandemic even though they had no medical reason to be admitted to hospital. Severe infection was defined as COVID-19 confirmed patients with one of the following conditions: respiratory distress with RR>30/min; blood oxygen saturation<93%; arterial oxygen partial pressure (PaO2) / fraction of inspired O2 (FiO2) <300mmHg; and critical infection was defined as respiratory failure requiring mechanical ventilation or shock; or other organ failures requiring admission to ICU. Since the Severe classification could potentially include individuals spanning a wide spectrum of disease severity ranging from patients receiving oxygen through a nasal cannula through to non-invasive ventilation, we also calculated the SaO2/FiO2 ratio at the height of patient illness as a quantitative marker of lung damage. This was calculated by dividing the oxygen saturation (as determined using a bedside pulse oximeter) by the fraction of inspired oxygen (21% for ambient air, 24% for nasal cannula, 28% for simple face masks and 28, 35, 40 or 60% for Venturi face masks or precise measurements for non-invasive or invasive ventilation settings). Patients not requiring oxygen with oxygen saturations (if measured) greater than 93% on ambient air, or managed at home were classified as mild disease. Viral swab Ct values were not available for all patients. In addition, we have standardised all of our analyses to the days since symptom onset.
Generating ACE2 transduced Epstein-Barr virus (EBV)-transformed B cell lines.
Epstein-Barr virus (EBV)-transformed B cell lines (BCLs) were generated as described previously 26. The cDNA for the human Angiotensin Converting Enzyme 2 (ACE2) gene (ENSG00000130234) was cloned into a lentiviral vector that allows co-expression of eGFP and a Puromycin resistance marker (Addgene, Plasmid 17488). The plasmids were co-transfected with the packaging plasmids pMD2.G and psPAX2 into HEK293-TLA using PEIpro (Polyplus). Lentiviral supernatant was collected 48h and 72h post transfection and concentrated by ultracentrifugation. EBV-transformed BCLs were infected by ACE2-lentivirus at MOI 0.1 with the addition of 8µg/ml polybrene (Sigma-Aldrich, Cat. # TR-1003-G) overnight, then washed, and further cultured for 3 - 5 days. ACE2-expressing B cells were stained using a primary goat anti-human ACE2 antibody (R&D, AF933) and a donkey anti-goat AF647 secondary antibody (Abcam, Cat. # ab150135) followed by cell sorting by flow cytometry. B cells with stable expression of ACE2 were maintained with 0.5µg/ml puromycin (ThermoFisher, Cat. # A1113803). Mycoplasma test was caried out every four weeks with all the cell lines using Lonza MycoAlert detection kit (Lonza,
Generating T cell lines and clones
Short-term SARS-CoV-2-specific T cell lines were established as previously described 17. Briefly, 3 × 106 to 5 × 106 PBMCs were pulsed as a pellet for 1 h at 37°C with 10μM of peptides containing T cell epitope regions and cultured in R10 (RPMI 1640 medium with 10% foetal calf serum (FCS), 2mM glutamine and 100mg/ml pen/strep) at 2 × 106 cells per well in a 24-well Costar plate. IL-2 was added to a final concentration of 100U/mL on day 3 and cultured for a further 10 - 14 days. T cell clones were generated by sorting HLA-B*07:02 NP105-113 Pentamer+ CD8+ T cells at a single cell level from thawed PBMCs or short-term cell lines. T cell clones were then expanded and maintained as described previously 27.
Generating Vaccinia virus expressing SARS-CoV-2 NP
SARS-CoV-2 nucleocapsid (NP) expression vectors (gifts from Dr. Peihui Wang, Shandong University, Shandong, China 28) were first digested with KpnI and SacII. The resulting fragment was then cloned into VACV expression vector pSC11, which was inserted with a DNA segment encoding KpnI and SacII digestion sites (GGTACCGCGGCCGCCCGCGG). The SARS-CoV-2 NP-expressing recombinant Vaccinia virus (rVACV) was produced as described previously. 29, 30, 31. In brief, HEK293T cells (ATCC, CRL-11268) were transfected with 3μg of pSC11 containing NP in the presence of polyethylenimine. At 24h post transfection, cells were infected with the Lister strain of VACV at MOI 1 for 48h. The infected cells were collected for recombinant virus purification using TK143B cells (ATCC, CRL-8303) in the presence of 25μg/ml bromodeoxyuridine. The NP-expressing rVACV was selected through β-galactosidase staining by supplementing 25μg/ml X-gal to an agarose overlay. Master stocks of rVACV were prepared by infection on rabbit RK13 (ATCC, CCL37) and titrated on African green monkey BS-C-1 (ATCC, CCL26) cells.
SARS-CoV-2 live virus propagation and titration.
SARS-CoV-2 Victoria 01/20 strain (BVIC01), and variants of concerns: Alpha (Lineage B1.1.7, 20I/501Y.V1.HMPP1) and Beta (Lineage B1.351, 201/501.V2.HV001) were originally from Public Health England and provided by Prof. Jane McKeating 32. SARS-CoV-2 Gamma (Lineage P.1) was provided by Prof. Gavin Screaton 33. Virus was propagated and titrated as previously described. 32, 33. In brief, Victoria 01/20, Alpha and Beta were propagated with Vero E6 cells, whereas SARS-CoV-2 Gamma was propagated with Vero E6/TMPRSS2 (kindly provided by Alain Townsend). Naïve Vero E6 or Vero E6/TMPRSS2 cells were plated one night before and infected with SARS-CoV-2 at MOI of 0.003. The cultures were harvested when visible cytopathic effect was observed 48 - 72h after infection and residual cell debris was removed by centrifugation. The virus-containing supernatant was aliquoted and stored at −80°C. Viral titre was determined by plaque assay with Vero E6 or Vero E6/TMPRSS2 as previously described, and plaque-forming units (PFU) per mL was used to calculate MOI.
IFN-γ ELISpot assay
Ex vivo IFN-γ ELISpot assays were performed using either freshly isolated, cryopreserved PBMCs or antigen-specific T cell clones as described previously 3. For ex vivo ELISpots, peptides were added to 200,000 PBMCs per test at the final concentration of 2μg/mL for 16–18 h. When using T cell clones, autologous EBV-transformed B cell lines were first loaded with peptides at 3-fold titrated concentrations and were subsequently co-cultured with T cells at an effector: target (E:T) ratio of 1:50 for at least 6h. To quantify antigen-specific responses, mean spots of the control wells were subtracted from the positive wells (PHA stimulation), and the results expressed as spot forming units (SFU)/106 PBMCs. Responses were considered positive if results were at least three times the mean of the negative control wells and >25SFU/106 PBMCs. If negative control wells had >30SFU/106 PBMCs or positive control wells were negative, the results were excluded from further analysis.
Flow cytometric sorting of NP105-113-B*07:02 -specific CD8+ T cells
NP105-113-B*07:02 specific CD8+ T cells were stained with PE-conjugated HLA-B7 NP105-113 Pentamer (ProImmune, Oxford, UK). Live/Dead fixable Aqua dye (Invitrogen) was used to exclude non-viable cells from the analysis. Subsequently, cells were washed and stained with the following surface antibodies: CD3-FITC (BD Biosciences), CD8-PercP-Cy5.5, CD14-BV510, CD19-BV510 and CD16-BV510 (Biolegend). After the final wash, cells were resuspended in 500μl of PBS, 2 mM EDTA and 0.5% BSA (Sigma-Aldrich) solution and kept in dark at 4°C until flow cytometric acquisition. After exclusion of non-viable/CD19+/CD14+/CD16+ cells, CD3+ CD8+ Pentamer+ cells were sorted directly into 96-well PCR plates (Thermo Fisher, UK) using a BD Fusion sorter or BD FACS Aria III (BD Biosciences) and stored at − 80 °C for subsequent analysis.
Single cell RNA Sequencing (RNA-Seq)
Single cell RNA-Seq with ex vivo sorted CD8+Pentamer+ T cells was performed using SmartSeq2 34. with following modifications. Reverse-transcription and PCR amplification were performed as described 34 with the exception of using ISPCR primer with biotin-tagged at 5’ and increasing the number of cycles to 25. Sequencing libraries were prepared using the Nextera XT Library Preparation Kit (Illumina) and sequencing was performed on Illumina NextSeq sequencing platform.
Deep sequencing of T cell receptor (TCR) repertoire of T cell clones
100,000 cells from each T cell clone were harvested and washed three times with PBS. Total RNA was extracted using RNeasy Plus Micro Kit (Qiagen, Cat. # 74034). 100ng of total RNA from each of T cell clone was used to generate full length of TCR repertoire libraries for Illumina Sequencing using SMARTer Human TCR a/b Profiling Kit (Takara, Cat. # 635016) following supplier’s instructions. In brief, first-strand cDNA was synthesised by reverse transcription using TRBC reverse primers and further extended with a template-switched SMART-Seq®v4 Oligonucleotide. Following reverse transcription, two rounds of PCR were performed in succession to amplify cDNA sequences corresponding to variable regions of TCR-α and/or TCR-β transcripts with primers including Illumina indexes allowing for sample barcoding. PCR products were then purified using AMPure beads (Beckman Coulter). The quantity and quality of cDNA libraries were checked on Agilent 2100 Bioanalyzer system. Sequencing was performed with MiSeq reagent Kit v3 (600 cycles) on MiSeq (Illumina).
Intracellular cytokine staining (ICS)
Intracellular cytokine staining was performed as described previously 3. Briefly, T cells were co-cultured with peptide-loaded or virus-infected BCLs at an appropriate E:T ratio for a 6h incubation with GolgiPlug and GolgiStop, and surface stained with PE-anti-CD107a. Dead cells were labelled using LIVE/DEAD™ Fixable Aqua dye (Invitrogen); after staining with BV421-anti-CD8, cells were then washed, fixed with Cytofix/CytopermTM and stained with AF488-anti-IFNγ, APC-anti-TNFα (eBioscience) and APC-H7-anti-MIP1ß. Negative controls without peptide-stimulation or virus infection were run for each sample. All reagents were from BD Bioscience unless otherwise stated. All samples were acquired on Attune™ NxT Flow Cytometer and analyzed using FlowJoTM v.10 software (FlowJo LLC).
Evaluation of T cell response to Vaccinia virus infection
EBV transformed B cell lines were infected with Lister strain Vaccinia virus at a MOI 3 for 90-120 mins at 37°C. Cells were then washed three times to remove virus and incubated overnight in R10 at 37°C. The next day, cells were counted and co-cultured with T cells at an E:T ratio of 1:1. Degranulation (CD107a expression) and cytokine production of T cells were then evaluated by ICS as described above.
Evaluation of T cell response to live virus infection
EBV transformed BCLs expressing ACE2 were infected SARS-CoV-2 viruses at a MOI 3 for 120 mins at 37°C. Cells were then washed three times and incubated in R10 at 37°C. After 48h, cells were then counted and co-cultured with T cells at an E:T ratio of 1:1. Degranulation (CD107a expression) and cytokine production of T cells were then evaluated by ICS as described above.
Live Virus Suppression Assay (LVSA)
EBV transformed BCLs expressing ACE2 were infected with SARS-CoV-2 viruses at MOI 0.1 for 120 mins at 37°C. Cells were then washed three times and co-cultured with T cells at an E:T ratio of 4:1. Control wells containing virus-infected targets without T cells were also included. After 48hrs incubation, the cells were washed with PBS and lysed with Buffer RLT (Qiagen). RNA was extracted using RNeasy 96 kit (Qiagen, Cat. # 74181). Virus copies were then quantified with TakyonTM Dry one-step RT-qPCR (Eurogentec, Cat. # UFD-NPRT-C0101) using SARS-CoV-2 (2019-nCoV) CDC qPCR Probe Assay (IDT, ISO 13485:2016) and human B2M (Beta-2-Microglobulin) as an endogenous control (Applied Biosystems™). The suppression rate was calculated by the percentage of reduction of virus replication by T cells.
SmartSeq2 single cell data processing
BCL files were converted to FASTQ format using bcl2fastq version 2.20.0.422 (Illumina). FASTQ files were aligned to human genome hg19 using STAR version 2.6.1d 35. Reads were counted using featureCounts (part of subread version 2.0.0, 36). The resulting counts matrix was analysed in R version 4.0.1 using Seurat version 3.9.9.9010 37.
SmartSeq2 single cell RNA sequencing analysis
After creating a Seurat object from the counts matrix, cells were filtered using the following criteria: minimum number of cells expressing specific gene = 3, minimum number of genes expressed by cell = 200 and maximum number of genes expressed by cell = 4000. Cells were also excluded if they expressed more than 5% mitochondrial genes. Patient-specific cells were integrated using Harmony version 1.0 to remove batch effects (differences between patients) but to still retain meaningful biological variation. The AddModuleScore function from the Seurat package was used to look at expression of specific gene sets (Supplementary Table). The average expression of a gene set was calculated, and the average expression levels of control gene sets were subtracted to generate a score for each cell relating to that particular gene set. Higher scores indicate that that specific signature is expressed more highly in a particular cell compared to the rest of the population. Module scores were plotted using ggplot2 version 3.3.2 38.
SmartSeq2 TCR repertoire analysis
TCR sequences were reconstructed from RNA sequencing FASTQ files using MiXCR version 3.0.13 39, 40 and the command mixcr analyze shotgun to produce separate TRA and TRB output files for analysis. The output text files were parsed into R using tcR version 2.3.2. For paired αβ TCRs, cells were filtered to retain only 1α1β or 2α1β cells. Circos plots showing paired αβ TCRs were created using circlize version 0.4.12 41. Separately, lists were generated for all 1β cells (regardless of number of α) to use for downstream analysis.
Clustering
The input data for clustering was all 1β from SmartSeq2 single cells and 1β from bulk sequencing T cell clones. Single cells and clones were grouped by Vβ usage first; TCRs from either single cells or clones which had unique Vβ gene usage were excluded. Each Vβ group was broken down into subgroups based on CDR3β sequence; any TCRs from either single cells or clones that contained unique CDR3β sequences were excluded. Only TRBV27, TRBV28, TRBV5-1 showed multiple CDR3β sequences with the same gene usage. After plotting the EC50 values of the T cell clones, groups were classified as low or high functional avidity based on a manually defined cut-off (EC50 0.11). This led to a list of 18 groups with unique Vβ gene usage and CDR3β sequences shared among the TCRs from single cell sequencing and bulk T cell clone sequencing.
In order to group as many of the SmartSeq2 single cells into one of these 18 groups, the stringsim function was used from the stringdist package version 0.9.6 42 to compare the similarity between all SmartSeq2 CDR3β sequences and each of the 18 CDR3β from the single cell/clone grouping. A minimum similarity score of 0.7 was used to decide if a TCR from a SmartSeq2 single cell should belong to one of the 18 groups. Once allocated, the single cell was annotated as being high or low functional avidity based on its group number.
TCR sequencing from T cell clones (bulk sequencing)
BCL files converted to FASTQ files as before. TCRs reconstructed using MiXCR command mixcr analyze amplicon, and the resulting output files (TRA and TRB) were parsed into R using tcR as before. TCRs were filtered to retain only 1α1β for each clone. TCR clonotypes (defined as Vβ gene usage and CDR3β sequence) were compared between single TCR and bulk TCR sequencing using using ggalluvial version 0.12.2 43. The predicted functional avidity annotation was overlayed onto the plots using the stringsim function as previously described to classify TCRs into high or low functional avidity groups (minimum score 0.5).
10X VDJ sequencing
Raw files were processed using 10x Genomics Cellranger version 5.0.0 44. BCL files were converted to FASTQ format using cellranger mkfastq and TCRs were reconstructed using cellranger vdj. To carry out donor deconvolution from multiplexed single cell data, cellSNP version 0.3.2 45 was first used to generate a list of SNPs from cellranger output (BAM file). Subsequently Vireo version 0.5.6 46 was used to demultiplex the sequencing data into individual patients from the pooled sequenced libraries, based on the previously generated SNPs list. The filtered_contig_annotations.csv file was annotated with patient information and used for subsequent downstream analysis. TCRs from 10X sequencing represent six months convalescence and were compared to one month convalescence TCRs (SmartSeq2) from the same patient using ggalluvial. The predicted functional avidity annotation was overlayed onto the plots using the stringsim function as previously described to classify TCRs into high or low functional avidity groups (minimum score 0.5).
Gene expression analysis and cell subtyping from acute COVID-19 dataset
Normalised single cell gene expression data for T cells from the COMBAT dataset (level 2 subsets a and b) 14 was annotated with specific T cell subtypes according to COMBAT multimodal analysis (/CBD-10X-00010/multimodal.annotation.release.1.0.tsv); COMBAT TCR chain information (CBD-10X-00008/tcr_chain_information.tsv.gz); and patient metadata (/CBD-CLINNORM-00006/COMBAT_clinical_basic_data_freeze_160221.txt). Any cells without both a CD8+ multimodal major cell type classification and TCR chain information were excluded from further analysis. A simplified severity grouping based on the WHO ordinal scale which ranges from 0 to 8 (https://www.who.int/blueprint/priority-diseases/key-action/COVID-19_Treatment_Trial_Design_Master_Protocol_synopsis_Final_18022020.pdf) was used to classify participants into either Uninfected (0), Mild (1-4), Severe (5-7) or Death (8).
GLIPLH2 analysis
A GLIPH2 CD8+ TCR input file was created from the following datasets: COMBAT 10x paired chain single cell and bulk TCR from all available participants 14; pentamer sorted NP105-113-B*07:02 specific TCR sequences and clonally expanded cells used to test functional avidity processed using MiXCR (as described previously); and NP105-113-B*07:02 specific TCR sequences from Lineburg and Nguyen datasets 7, 10. Clonotypes were defined as having a unique combination of CDR3β amino acid sequence, TRBV gene, TRBJ gene and CDR3α amino acid sequence. Where no or multiple CDR3α sequences were available for a cell, an NA value was used for the CDR3α field in accordance with GLIPH2 input guidelines. For each clonotype, additional information indicating dataset origin was appended as part of the "condition" field. For the 10x COMBAT dataset, CD8+ clonotypes were distinguished from CD4+ clonotypes based on the multimodal classification of cells within each clone.
A matching GLIPH2 participant HLA input file was created using COMBAT formal HLA typing data and where no formal typing was available from imputed HLA typing 3, 14, in addition to published HLA data relating to the Lineburg and Nguyen datasets 7, 10.
The GLIPH2 irtools.centos version 0.01 15 was run on a CentOS Linux platform (release 8.3.2011) using the CD8+ TCR and HLA input files above, together with CD8+ specific V-gene usage, CDR3 length and TCR reference files from the GLIPH2 repository [http://50.255.35.37:8080/] and using the following parameters: local_min_pvalue=0.001; p_depth = 1000; global_convergence_cutoff = 1; simulation_depth=1000; kmer_min_depth=3; local_min_OVE=10; algorithm=GLIPH2; all_aa_interchangeable=1; number_of_hla_field=1; hla_association_cutoff=0.050000. A GLIPH score summary file was then programmatically curated, identifying convergence groups containing TCRs known to be NP105-113-B*07:02 specific as described previously, with associated GLIPH2 scoring and HLA prediction.
Convergence groups from this file were further categorized as being associated with or lacking association with HLA-B*07:02 based on having a GLIPH2 HLA score of <0.05 or >=0.05 respectively. Only clonotypes belonging to a HLA-B*07 associated convergence group which were from participants known to have a HLA-B*07:02 allele were deemed HLA-B*07:02 positive TCRs. Any clonotypes from convergence groups lacking HLA-B*07:02 association but belonging to patients having a HLA-B*07:02 allele were deemed ambiguous and excluded from the HLA-B*07:02 negative clonotype set.
Similarity between pre-pandemic and convalescent COVID-19 TCRs
NP105-113 specific TCRs from pre-pandemic individuals (predicted from COMBAT dataset or experimentally defined from Lineburg et al. and Nguyen et al.) were compiled to form a single list of sequences (237 TCRs). The function stringsim was used to generate similarity scores from pairwise comparisons between each CDR3β sequence from the pre-pandemic/healthy list and each CDR3β sequence from 85 unique clonotypes from four convalescent COVID-19 patients (clonotype defined per patient, TRBV gene usage and CDR3β sequence). A score of 1 indicates total similarity while a score of 0 is total dissimilarity. Each score was plotted on a box plot using ggplot2 version 3.3.2.
Pseudobulk and differential gene expression
RNA counts from SmartSeq2 single cells were aggregated into groups based on patient origin and high/low functional avidity, and converted to a Single Cell Experiment object version 1.10.1 47. Differential gene expression was conducted using DESeq2 version 1.28.1 on aggregated (pseudobulk) counts. Significant genes were visualised on a heatmap using pheatmap version 1.0.12.
Statistics
Mann Whitney non-parametric test to compare two groups (R Studio); other statistical tests carried out using GraphPad Prism. Nonlinear regression with variable slope (four parameters) in dose-response-stimulation model was used for calculating the EC50 of T cell clones. ns not significant, * P < 0.05, ** P < 0.01, *** P < 0.001, **** P < 0.0001.