Cohort recruitment, biological sample collection, and initial phenotyping
Protection of human subjects was in accordance with research protocols approved by the Duke University Institutional Review Board, consistent with the Declaration of Helsinki. Written informed consent was obtained from all research subjects or their legally authorized representatives. Subjects with confirmed or suspected SARS-CoV-2 infection or their close contacts were identified in the outpatient setting and enrolled into the Molecular and Epidemiological Study of Suspected Infection protocol (MESSI, IRB Pro00100241). All close contacts and subjects with mild or moderate COVID-19 were longitudinally sampled from enrollment to convalescent phase. Biological samples and demographic information were collected prospectively at first visit (Day 0) and at weekly intervals on Day 7 and Day 14. At each visit, infection with SARS-CoV-2 was assessed using quantitative polymerase chain reaction (qPCR) of nasopharyngeal (NP) swab samples, and serology testing was performed for IgG against the SARS-CoV-2 spike domain. All subjects with mild or moderate COVID-19 progressed from seronegative (IgG-) to seropositive (IgG+). Close contacts were qPCR negative and IgG- at all time points; healthy controls were enrolled pre-pandemic and were not tested for SARS-CoV-2 or spike protein IgG. Self-reported symptom surveys were performed at each visit. To quantify symptom severity, the sum of 38 defined symptoms, each scored 0-4 (0, none; 1, mild; 2, moderate; 3, severe; 4, very severe), was determined from symptom onset through each longitudinal sample collection. SARS-CoV-2 qPCR tests used viral RNA extracted from NP samples in 140 µL of viral transport medium (VTM) according to the manufacturer’s instructions (QIAamp Viral RNA Mini Kit). SARS-CoV-2 nucleocapsid (N1) and human RNase P (RPP30) RNA copies were determined using 5 µL of isolated RNA in the CDC-designed kit (CDC-006-00019, Revision: 03, Integrated DNA Technologies 2019-nCoV kit).
SARS-CoV-2 IgG ELISA tests for antibody response were performed using the anti-SARS-CoV-2 spike S1 domain IgG ELISA assay (EUROIMMUN Medizinische Labordiagnostika AG, Lübeck, Germany).
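The symptom severity score described above is a simple sum, and a minimal Python sketch may make the scale concrete. It scores a single survey only. How surveys are combined across the period from symptom onset to each sample collection is not specified here, so that step is left out. The symptom names (`symptom_01` and so on) and the function name are hypothetical placeholders, not part of the study instrument.

```python
from typing import Mapping

N_SYMPTOMS = 38                # number of defined symptoms, as stated in the text
MIN_SCORE, MAX_SCORE = 0, 4    # 0 = none ... 4 = very severe


def symptom_severity(scores: Mapping[str, int]) -> int:
    """Return the summed severity for one survey (hypothetical helper).

    `scores` maps each symptom name to its 0-4 rating. The survey must
    contain exactly 38 symptoms, each rated on the 0-4 scale.
    """
    if len(scores) != N_SYMPTOMS:
        raise ValueError(f"expected {N_SYMPTOMS} symptoms, got {len(scores)}")
    for name, value in scores.items():
        if not isinstance(value, int) or not MIN_SCORE <= value <= MAX_SCORE:
            raise ValueError(f"{name}: score {value!r} is outside 0-4")
    return sum(scores.values())


if __name__ == "__main__":
    # Placeholder survey: every symptom absent except two.
    survey = {f"symptom_{i:02d}": 0 for i in range(1, N_SYMPTOMS + 1)}
    survey["symptom_01"] = 3   # severe
    survey["symptom_02"] = 4   # very severe
    print(symptom_severity(survey))
```

Under this scheme a single survey ranges from 0 (no symptoms) to 152 (all 38 symptoms rated very severe).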
Purification of PBMCs
PBMCs were purified using the Ficoll-Hypaque density gradient method according to the manufacturer’s instructions. Briefly, whole blood was collected in ACD Vacutainer tubes (BD) and processed within 8 hours: it was diluted 1:2 in PBS, layered onto Ficoll-Hypaque (Sigma Aldrich) in 50 ml conical tubes, and centrifuged at 420 x g for 25 minutes. The buffy coat was collected and washed twice in D-PBS by centrifugation at 400 x g for 10 minutes to isolate peripheral blood mononuclear cells (PBMCs), which were assessed for viability and cell count using a Vi-Cell automated cell counter (Beckman-Coulter). PBMCs were adjusted to 10 x 10^6 cells/ml in cryopreservation media (90% FBS, 10% DMSO), frozen at -80°C using a CoolCell LX (BioCision) for 12-24 hours, and stored in liquid nitrogen vapor phase.
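The concentration adjustment before freezing reduces to one division. The sketch below shows the arithmetic, assuming the viable cell total comes from the automated counter. The function name and the example cell count are illustrative and not taken from the study.

```python
def cryo_media_volume_ml(total_viable_cells: float,
                         target_cells_per_ml: float = 10e6) -> float:
    """Volume of cryopreservation media (ml) needed to resuspend a pellet.

    The default target of 10 x 10^6 cells/ml is the concentration stated
    in the text; the helper itself is a hypothetical illustration.
    """
    if total_viable_cells < 0:
        raise ValueError("cell count cannot be negative")
    if target_cells_per_ml <= 0:
        raise ValueError("target concentration must be positive")
    return total_viable_cells / target_cells_per_ml


if __name__ == "__main__":
    # Illustrative yield: 45 million viable PBMCs from one draw.
    print(cryo_media_volume_ml(45e6))
```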
RNA extraction, total RNA-seq, and analysis
RNA was extracted from 300K cells using the Zymo Direct-zol miniprep kit (Cat# R2051), and RNA quality was assessed using the Agilent DNA tape screen assay. The RNA Integrity Number (RIN) scores for all samples were > 7.0. Total RNA libraries were generated using the NuGEN Ovation® SoLo RNA-Seq Library Preparation Kit (Cat# 0500-96). Libraries were sequenced on an Illumina NovaSeq 6000 instrument with an S4 flow cell and 150 base pair paired-end reads. FASTQ files were generated from the NovaSeq BCL outputs, and quality was assessed with FastQC [29]. Differentially expressed genes were identified between subjects with different disease severity using the limma package, with voom to model variance [30].
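limma and voom are R packages, so the analysis itself was not run in Python. As a rough illustration of the first step voom performs before fitting its mean-variance trend, the sketch below computes log2 counts per million with the offsets voom uses (0.5 added to each count, 1 added to each library size). It assumes unnormalized library sizes, that is, plain column sums. The function name and the toy counts are hypothetical, and precision weights and the linear model fit are not shown.

```python
import math
from typing import List


def voom_log_cpm(counts: List[List[int]]) -> List[List[float]]:
    """Return log2-CPM values for a genes-by-samples count matrix.

    counts[g][s] is the read count for gene g in sample s. Library size
    is taken as the column sum (an assumption; normalization factors
    would normally rescale it).
    """
    if not counts or not counts[0]:
        raise ValueError("count matrix must be non-empty")
    n_samples = len(counts[0])
    if any(len(row) != n_samples for row in counts):
        raise ValueError("all rows must have the same number of samples")
    lib_sizes = [sum(row[s] for row in counts) for s in range(n_samples)]
    return [
        [math.log2((row[s] + 0.5) / (lib_sizes[s] + 1.0) * 1e6)
         for s in range(n_samples)]
        for row in counts
    ]


if __name__ == "__main__":
    toy = [[10, 0],      # gene A
           [989, 999]]   # gene B
    for gene_row in voom_log_cpm(toy):
        print([round(v, 3) for v in gene_row])
```

The 0.5 offset keeps zero counts finite on the log scale, which is why a gene with no reads in a sample still receives a value.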
Nuclei purification, ATAC-seq and analysis
Nuclei were extracted from frozen PBMCs. Briefly, 100K cells were spun down at 300 x g for 5 minutes at 4°C. The supernatant was removed, and cells were mixed with 100 µL of lysis buffer (10 mM NaCl, 3 mM MgCl2, 10 mM Tris-HCl pH 7.4, 0.1% Tween-20, 0.1% Nonidet™ P40) and lysed on ice for 4 minutes. Wash buffer (1 mL; 10 mM NaCl, 3 mM MgCl2, 10 mM Tris-HCl pH 7.4, 0.1% Tween-20) was added before centrifuging at 500 x g for 5 minutes at 4°C. ATAC-seq libraries were generated as described [14]. Briefly, transposition mix (25 μL 2× TD buffer, 2.5 μL transposase (Tn5, 100 nM final), 22.5 μL water) (Illumina Cat# 20031198) was added to the nuclear pellets and incubated at 37°C for 30 minutes, and samples were purified using the Qiagen MinElute PCR Purification Kit (Qiagen Cat# 28004). DNA fragments were PCR amplified for a total of 10-11 cycles, and the resulting libraries were purified using the Qiagen MinElute PCR Purification Kit. The libraries were sequenced with an Illumina NovaSeq 6000 S4 flow cell using 100 bp paired-end reads. FASTQ files were generated from the NovaSeq BCL outputs and used as input to the ENCODE ATAC-seq pipeline (https://github.com/ENCODE-DCC/atac-seq-pipeline) using the MACS2 peak-caller with all default parameters. Differential accessibility was calculated between groups of subjects with different disease severity using the csaw package [31].
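The per-reaction transposition mix above totals 50 μL (25 + 2.5 + 22.5). When several samples are processed together, the same recipe can be scaled into a master mix. The sketch below shows that arithmetic only. The overage fraction is a hypothetical convenience parameter, not something the protocol states.

```python
from typing import Dict

# Per-reaction volumes in microliters, as listed in the text.
PER_REACTION_UL: Dict[str, float] = {
    "2x TD buffer": 25.0,
    "Tn5 transposase": 2.5,
    "water": 22.5,
}


def transposition_master_mix(n_reactions: int,
                             overage: float = 0.0) -> Dict[str, float]:
    """Scale the per-reaction recipe to `n_reactions` samples.

    `overage` is an assumed extra fraction (e.g. 0.1 for 10%) to cover
    pipetting loss. Returns component volumes in microliters plus a total.
    """
    if n_reactions < 1:
        raise ValueError("need at least one reaction")
    if overage < 0:
        raise ValueError("overage cannot be negative")
    scale = n_reactions * (1.0 + overage)
    mix = {name: vol * scale for name, vol in PER_REACTION_UL.items()}
    mix["total"] = sum(mix.values())
    return mix


if __name__ == "__main__":
    for component, volume in transposition_master_mix(4).items():
        print(f"{component}: {volume:.1f} uL")
```

Because the 2× TD buffer makes up half of the final volume, it ends up at 1× in the reaction regardless of how many samples are pooled.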
Single-cell (sc)RNA-seq and analysis
Frozen PBMCs were thawed, and cell count and viability were measured using a Countess II. Cell viability exceeded 80% for all samples except PBMC samples from CC subjects, which had viability between 70% and 80%. For single-cell (sc) RNA-seq, 200K cells were aliquoted, spun down, resuspended in 30 µL PBS + 0.04% BSA + 0.2 U/µL RNase inhibitor, and counted using the Countess II. GEM generation, post GEM-RT cleanup, cDNA amplification, and library construction were performed following the 10x Genomics Single Cell 5’ v1 chemistry, and quality was assessed using the Agilent DNA tape screen assay. Libraries were then pooled and sequenced on the Illumina NovaSeq 6000 platform with the goal of reaching saturation or 20,000 unique reads per cell on average. Sequencing data were used as input to the 10x Genomics Cell Ranger pipeline to demultiplex BCL files, generate FASTQs, and generate feature counts for each library. For dimensionality reduction and cell type annotation, gene-barcode matrices generated using Cell Ranger count were analyzed using Seurat 3 with default parameters unless otherwise specified [32]. For regulatory network inference, the scRNA-seq Seurat object was converted into a SingleCellExperiment object, which was used as input to the SCENIC package [33].
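The sequencing goal above combines two criteria: saturation or an average of 20,000 unique reads per cell. A small sketch of how such a check could be expressed follows. It assumes the common definition of sequencing saturation as one minus the ratio of unique (deduplicated) reads to total reads. The saturation cutoff of 0.9 is a hypothetical placeholder because no numeric cutoff is given, and the function names and read counts are illustrative.

```python
def sequencing_saturation(total_reads: int, unique_reads: int) -> float:
    """Fraction of reads that duplicate an already observed molecule.

    Assumed definition: 1 - unique_reads / total_reads.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= unique_reads <= total_reads:
        raise ValueError("unique_reads must lie between 0 and total_reads")
    return 1.0 - unique_reads / total_reads


def depth_goal_met(total_reads: int, unique_reads: int, n_cells: int,
                   min_unique_per_cell: float = 20_000,
                   saturation_cutoff: float = 0.9) -> bool:
    """True if either depth criterion is satisfied.

    20,000 unique reads per cell is the stated target for scRNA-seq;
    `saturation_cutoff` is an assumed value for illustration only.
    """
    if n_cells <= 0:
        raise ValueError("n_cells must be positive")
    mean_unique = unique_reads / n_cells
    saturated = sequencing_saturation(total_reads, unique_reads) >= saturation_cutoff
    return saturated or mean_unique >= min_unique_per_cell


if __name__ == "__main__":
    # Illustrative library: 100M reads, 50M unique, 2,000 cells.
    print(sequencing_saturation(100_000_000, 50_000_000))
    print(depth_goal_met(100_000_000, 50_000_000, n_cells=2_000))
```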
scATAC-seq and analysis
PBMCs were thawed and nuclei were extracted as for ATAC-seq. The single-cell suspensions of scATAC-seq samples were converted to barcoded scATAC-seq libraries using the Chromium Single Cell 5′ Library, Gel Bead and Multiplex Kit, and Chip Kit (10x Genomics). The Chromium Single Cell 5′ v2 Reagent (10x Genomics, 120237) kit was used to prepare single-cell ATAC libraries according to the manufacturer’s instructions. Quality was assessed using the Agilent DNA tape screen assay. Libraries were then pooled and sequenced on the Illumina NovaSeq platform with the goal of reaching saturation or 25,000 unique reads per nucleus on average. Sequencing data were used as input to the 10x Genomics Cell Ranger ATAC pipeline to demultiplex BCL files, generate FASTQs, and generate feature counts for each library. To integrate scRNA-seq and scATAC-seq data, fragment files generated using Cell Ranger ATAC count were analyzed using ArchR, following the standard workflow with default parameters unless otherwise specified [34]. Feature and motif enrichment analysis (peak calling) was performed using MACS2 via the addReproduciblePeakSet() method in ArchR, which uses pseudo-bulk replicates of cells grouped on a specific design variable. The correlations between chromVAR transcription factor deviation scores and scRNA-seq derived gene expression data were calculated using the ArchR method correlateMatrices() to identify activators and repressors. Topic-based clustering was performed for the CD14+ monocytes from the mild or moderate subject cohorts using the R package cisTopic [35].
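The activator and repressor calls above rest on correlating each transcription factor's motif deviation score with its own expression. ArchR's correlateMatrices() is an R function and is not reproduced here. The Python sketch below only illustrates the underlying idea with a plain Pearson correlation across matched cell groups. The 0.5 threshold, the function names, and the toy vectors are all assumptions for illustration and are not values reported by the study.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return sxy / math.sqrt(sxx * syy)


def classify_tf(deviation: Sequence[float], expression: Sequence[float],
                threshold: float = 0.5) -> str:
    """Label a TF by the sign and strength of the correlation.

    `deviation` and `expression` hold one value per matched cell group.
    `threshold` is a hypothetical cutoff chosen for this example.
    """
    r = pearson(deviation, expression)
    if r >= threshold:
        return "putative activator"
    if r <= -threshold:
        return "putative repressor"
    return "unclassified"


if __name__ == "__main__":
    motif_deviation = [1.0, 2.0, 3.0, 4.0]    # toy values
    tf_expression = [2.0, 4.0, 6.0, 8.0]      # toy values
    print(classify_tf(motif_deviation, tf_expression))
```

The reasoning is that a factor whose expression rises together with accessibility at its motif sites behaves like an activator, while one whose expression rises as those sites close behaves like a repressor.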
Detailed methods for qPCR and antibody tests, and epigenetic and genomic profiling, including RNA extraction, sequencing, differential expression analysis and differential chromatin accessibility analysis for both bulk and single cell analytic approaches can be found in Supplemental Methods.