ImmuneRACE experimental cohort and study approval
The ImmuneRACE study is a prospective, single group, multi-cohort, exploratory study of unselected eligible participants exposed to, infected with, or recovering from COVID-19 (NCT04494893). Participants, aged 18 to 89 years and residing in 24 different geographic areas across the United States, were consented and enrolled via a virtual study design. Cohorting was based on participant-reported clinical history following the completion of both a screening survey and study questionnaire.
Cohort 1 included participants exposed within 2 weeks of study entry to someone with a confirmed COVID-19 diagnosis, based on either positive PCR testing or clinician diagnosis. Cohort 2 included participants clinically diagnosed by a physician or with positive laboratory confirmation of active SARS-CoV-2 infection via PCR testing. Cohort 3 included participants previously diagnosed with COVID-19 disease who had been deemed recovered based on two consecutive negative nasopharyngeal or oropharyngeal (NP/OP) PCR tests, clearance by a healthcare professional, or the resolution of symptoms related to their initial COVID-19 diagnosis. The ImmuneRACE study was approved by Western Institutional Review Board (WIRB reference number 1-1281891-1, Protocol ADAP-006). All participants were consented for sample collection and metadata use via electronic informed consent processes.
Whole blood, serum, and a nasopharyngeal or oropharyngeal swab were collected from participants by trained mobile phlebotomists. Blood samples were shipped frozen or at room temperature to Adaptive Biotechnologies for processing, including, but not limited to, DNA extraction and TCRβ analysis of the extracted DNA via the immunoSEQ Assay (Adaptive Biotechnologies, Seattle, WA) (Table 1). NP/OP swabs and serum were sent to Covance/Labcorp for further testing. An electronic questionnaire was administered to collect information pertaining to the participant’s medical history, symptoms, and diagnostic tests performed for COVID-19 disease. Participants have the option to undergo additional blood draws and questionnaires over 2 months.
Global data collaborations
Whole blood samples were collected in K2EDTA tubes based on each institution’s protocol and supervised by their respective Institutional Review Board. Samples were stored at the institution and sent to Adaptive as frozen whole blood, isolated PBMC, or DNA extracted from either sample type for TCRβ analysis via the immunoSEQ Assay (see Table 1). Samples provided by the NIAID were collected under approval by Comitato Etico Provinciale (protocol NP-4000), by Comitato Etico, Ospedale San Gerardo Monza (protocol COVID-STORM) and by Comitato Etico Pavia Fondazione IRCCS Policlinico San Matteo, Pavia (protocol 20200037677). Whole blood samples from DLS (Discovery Life Sciences, Huntsville, AL) were collected under Protocol DLS13 for collection of remnant clinical samples. From Bloodworks Northwest (Seattle, WA), volunteer donors recovered from COVID-19 were consented and collected under the Bloodworks Research Donor Collection Protocol BT001. Samples were processed for PBMC and donor data reported by the Biological Products division of Bloodworks NW under standard operating procedures.
Sample analysis
A subset of the samples was processed by both T-cell receptor variable beta chain sequencing and MIRA, and another subset was processed by only one of these approaches. For each subject included in the dataset, SubjectID can be used to determine which assay or assays the samples were processed with.
T-cell receptor variable beta chain sequencing
Immunosequencing of the CDR3 regions of human TCRβ chains was performed using the immunoSEQ Assay as previously described6,7,8. In brief, extracted genomic DNA was amplified in a bias-controlled multiplex PCR, followed by high-throughput sequencing. Sequences were collapsed and filtered in order to identify and quantitate the absolute abundance of each unique TCRβ CDR3 region for further analysis.
Multiplexed Identification of TCR Antigen Specificity (MIRA)
To identify antigen-specific TCRs, T cells derived post-expansion from either of the above input cell types were used for the MIRA tool. Antigen-specific TCRs were identified as previously described9,10. Briefly, T cells were incubated overnight with MIRA peptide pools, and the antigen-specific subset was identified by CD137 upregulation. Following addition of peptides, cells were incubated at 37°C for ~18 hours. At the end of the incubation, replicate wells of cells were harvested from the culture, pooled, and stained with antibodies for analysis and sorting by flow cytometry. Cells were then washed and suspended in PBS containing 2% FBS, 1 mM EDTA, and 4′,6-diamidino-2-phenylindole (DAPI) for exclusion of non-viable cells. Cells were acquired and sorted using a FACS Aria (BD Biosciences) instrument. Sorted antigen-specific (CD3+CD8+CD137+) T cells were pelleted and lysed in RLT Plus buffer for nucleic acid isolation. Analysis of flow cytometry data files was performed using FlowJo (Ashland, OR).
RNA was isolated using AllPrep DNA/RNA mini and/or micro kits, according to the manufacturer’s instructions (Qiagen). RNA was reverse transcribed to cDNA using Vilo kits (Life Technologies). TCRβ amplification, sequencing, and clonotype determination were performed as described in the ‘T-cell receptor variable beta chain sequencing’ section above.
MIRA tool design
T-cell populations were exposed to pooled peptides or transgenes in a combinatoric format, similar to the approach described in reference 10. According to the MIRA panel design, each antigen is strategically placed in a subset of K unique pools while being omitted from the remaining pools (total pools = N). This design assigns each antigen to a unique combination of pools, one of the N choose K possible occupancies (also referred to as “addresses”), and allows for increased economies of scale as the number of replicate pools (N) increases. To estimate an empirical false discovery rate and gauge assay quality, we purposefully left > 40% of the unique occupancies empty to assess the rate at which clones are spuriously sorted and detected in K pools with no query antigen present (hereinafter referred to as invalid TCR associations).
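The combinatoric layout can be illustrated with a minimal sketch. It assumes N = 11 pools (the number used in the matching step) and a hypothetical K = 5, which is not stated here; the function name, the seeded random choice of addresses, and the antigen labels are likewise illustrative assumptions, not the actual MIRA panel design procedure.

```python
from itertools import combinations
import random


def assign_antigens(antigens, n_pools, k, min_empty_fraction=0.4, seed=0):
    """Give each antigen a unique K-of-N pool address, leaving some addresses empty.

    Hypothetical helper: addresses are drawn at random with a fixed seed,
    whereas a real panel design may place antigens deliberately.
    """
    # Every possible occupancy ("address") is a sorted tuple of K pool indices.
    addresses = list(combinations(range(n_pools), k))
    empty_fraction = (len(addresses) - len(antigens)) / len(addresses)
    if empty_fraction <= min_empty_fraction:
        raise ValueError("too many antigens: not enough addresses would stay empty")

    chosen = random.Random(seed).sample(addresses, len(antigens))
    antigen_to_address = dict(zip(antigens, chosen))

    # Pool contents follow directly from the addresses.
    pools = {p: [] for p in range(n_pools)}
    for antigen, address in antigen_to_address.items():
        for p in address:
            pools[p].append(antigen)

    return {
        "addresses": addresses,
        "antigen_to_address": antigen_to_address,
        "pools": pools,
        "empty_fraction": empty_fraction,
    }


layout = assign_antigens([f"antigen_{i}" for i in range(200)], n_pools=11, k=5)
print(len(layout["addresses"]), round(layout["empty_fraction"], 3))
```

Under these assumptions there are 462 possible addresses, so 200 antigens leave more than half of them unoccupied. Any clonotype later assigned to one of the unoccupied addresses would count as an invalid TCR association.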
Matching clonotypes to antigens
T cells were aliquoted into 11 pools, and activated T cells were sorted using T-cell markers after overnight stimulation, as described previously10. These putative antigen-responding cells were set aside to characterize the T-cell clonotypes present in each sorted pool using the immunoSEQ Assay as described above. After immunosequencing, we examined the behavior of T-cell clonotypes by tracking the read counts of each unique TCRβ sequence across each sorted pool. True antigen-specific clones should be specifically enriched in a unique occupancy pattern that corresponds to the presence of one of the query antigens in K pools. We have reported on methods to assign antigen specificity to TCR clonotypes previously12; in addition, we developed a non-parametric Bayesian model to compute the posterior probability that a given clonotype is antigen specific. This model uses the available read counts of TCRs to estimate a mean-variance relationship within a given experiment, as well as the probability that a clone will have zero read counts due to incomplete sampling of low-frequency clones. The model takes the observed read counts of a clonotype across all N pools and estimates the posterior probability of the clone responding to each of the possible N choose K addresses, plus an additional hypothesis that the clone is activated in all pools (truly activated, but not specific to any of our query antigens). To define antigen-specific clones, we identified TCR clonotypes assigned to a query antigen by this model with a posterior probability >= 0.9.
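The logic of scoring a clonotype against every address can be sketched with a deliberately simplified stand-in. The example below is not the non-parametric Bayesian model described above: it assumes fixed Poisson read-count rates for pools where the clone is activated (`mu_on`) and not activated (`mu_off`), a uniform prior over hypotheses, and a hypothetical K = 5; all function names, rates, and labels are illustrative assumptions.

```python
from itertools import combinations
from math import exp, log


def address_posteriors(counts, k, mu_on=50.0, mu_off=0.5):
    """Posterior over all K-of-N addresses plus an 'activated in all pools' hypothesis.

    Toy model: Poisson counts with assumed fixed rates and a uniform prior.
    """
    n = len(counts)
    hypotheses = list(combinations(range(n), k)) + [tuple(range(n))]
    log_like = []
    for on_pools in hypotheses:
        on = set(on_pools)
        # Poisson log-likelihood up to a term that is constant across hypotheses.
        ll = sum(
            c * log(mu_on if pool in on else mu_off) - (mu_on if pool in on else mu_off)
            for pool, c in enumerate(counts)
        )
        log_like.append(ll)
    # Normalize in a numerically stable way.
    peak = max(log_like)
    weights = [exp(ll - peak) for ll in log_like]
    total = sum(weights)
    return {h: w / total for h, w in zip(hypotheses, weights)}


def call_specificity(counts, k, address_to_antigen, threshold=0.9):
    """Return (label, posterior) for the best hypothesis; label is None below threshold."""
    posteriors = address_posteriors(counts, k)
    best, p = max(posteriors.items(), key=lambda item: item[1])
    if p < threshold:
        return None, p
    if len(best) == len(counts):
        return "all_pools", p  # activated, but not specific to a query antigen
    # An address with no query antigen is an invalid TCR association.
    return address_to_antigen.get(best, "invalid_association"), p


panel = {(0, 2, 5, 7, 10): "peptide_A"}  # hypothetical address-to-antigen map
print(call_specificity([60, 0, 45, 0, 0, 52, 0, 38, 0, 0, 71], 5, panel))
```

In this toy setting, a clonotype with reads concentrated in exactly the five pools of an occupied address is assigned to that antigen, the same pattern at an unoccupied address is flagged as an invalid association, and a clonotype with no reads in any pool falls below the 0.9 threshold.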