Sample collection
Thymuses were obtained from neonates and infants undergoing cardiac surgery at NewYork-Presbyterian Hospital. Cord blood samples were obtained from the Carolinas Cord Blood Bank (Duke University). Buffy coats from healthy adults were obtained from the New York Blood Center. This study was approved by the Columbia University institutional review board.
Sample processing
Thymic tissue was collected in cold phosphate-buffered saline (PBS) and washed extensively to remove blood. A piece of tissue was kept in 10% paraformaldehyde solution for 24 h, then placed in 70% ethanol and finally paraffin-embedded for histological analysis. The remaining tissue was homogenized using a gentleMACS tissue dissociator (Miltenyi Biotec) and filtered through a 40-µm cell strainer (BD Biosciences). Peripheral blood mononuclear cells (PBMCs) were isolated from cord blood and adult blood by density gradient centrifugation using Ficoll-Paque PLUS (GE HealthCare). Thymocyte and PBMC suspensions were frozen in heat-inactivated fetal bovine serum (FBS) containing 10% dimethyl sulfoxide (DMSO, Fisher BioReagents) and kept in liquid nitrogen until use.
Flow cytometry and cell sorting
Cryopreserved cells were thawed into warm RPMI media containing 10% FCS and washed in PBS. B cells were isolated by magnetic cell sorting with the EasySep Human B Cell Enrichment Kit (Stem Cell Technologies) following the manufacturer’s instructions and stained in PBS with 2% FCS for 45 min with the following fluorochrome-conjugated antibodies: anti-CD3 BV786 (clone SK7, BD Biosciences), anti-CD3 BV570 (clone UCHT1, Biolegend), anti-CD45 Qdot800 (clone HI30, Thermo Fisher Scientific), anti-CD19 PECy7 (clone HIB19, Tonbo Biosciences), anti-CD21 BV711 (clone B-ly4, BD Biosciences), anti-CD21 PECy5 (clone B-ly4, BD Biosciences), anti-CD21 V450 (clone B-ly4, BD Biosciences), anti-CD35 PE (clone E11, BD Biosciences), anti-CD35 FITC (clone E11, BD Biosciences), anti-CD38 BV650 (clone HIT2, Biolegend), anti-CD138 PE (clone 44F9, Miltenyi Biotec), anti-CD138 VB515 (clone 44F9, Miltenyi Biotec), anti-CD27 APC Cy7 (clone O323, Tonbo Biosciences), anti-IgG Alexa Fluor 700 (clone G8-145, BD Biosciences), anti-IgM BV421 (clone MHM-88, Biolegend), anti-IgA APC (clone IS11-8E11, Miltenyi Biotec), anti-IgE BV480 (clone G7-26, BD Biosciences), anti-IgD BV510 (clone IA6-2, BD Biosciences), anti-CD69 PECy5 (clone FN50, Biolegend), anti-CD80 BV711 (clone 2D10, Biolegend), anti-CD86 Alexa Fluor 647 (clone IT2.2, Biolegend), anti-PD1 PE-Dazzle594 (clone EH12.2H7, Biolegend), anti-CD39 BV650 (clone TU66, BD Biosciences), anti-CD59 PE (clone H19, Biolegend), anti-CD269 PerCPCy5.5 (clone 19F2, Biolegend), anti-XBP1S PE (clone Q3-695, BD Biosciences), anti-IRF4 PerCPCy5 (clone IRF4.3E4, Biolegend), anti-BLIMP1 Alexa Fluor 647 (clone 6D3, BD Biosciences) and anti-Ki67 FITC (clone SolA15, Thermo Fisher Scientific).
For intracellular staining, cells were fixed and permeabilized using the Transcription Factor Staining Buffer Set (eBioscience) following the manufacturer’s instructions prior to staining. Cells were washed in cold PBS with 2% FCS, filtered through a 70-µm cell strainer and acquired using a BD LSRFortessa or Cytek Aurora flow cytometer. Data were analyzed using FCS Express 6 Research Edition (De Novo Software).
ELISpot assay
To assess the frequency of spontaneous antibody-secreting cells (ASCs) in the thymus of newborns or in cord blood, ELISpot was carried out following the manufacturer’s instructions with some modifications. Briefly, ELISpot plates (MSIPS4510, Millipore-Sigma) were coated with anti-human IgG (7.5 µg ml⁻¹, Mabtech), anti-human IgM (5 µg ml⁻¹, Mabtech), anti-human IgA (5 µg ml⁻¹, Mabtech) and anti-human IgE (5 µg ml⁻¹, Mabtech). After overnight coating and blocking with 1% FCS-PBS for 30 min, total thymocytes, PBMCs from cord blood or sorted thymic B cells (CD19+CD21+CD35+ and CD19+CD21-CD35-) were plated for detection of total ASCs. After overnight incubation at 37 °C, bound antibodies were detected using biotinylated anti-human IgG (1 µg ml⁻¹, Mabtech), anti-human IgM (1 µg ml⁻¹, Mabtech), anti-human IgA (1 µg ml⁻¹, Mabtech) and anti-human IgE (1 µg ml⁻¹, Mabtech). Spots were developed with the ELISPOT Blue Color Module (R&D Systems) using streptavidin-conjugated alkaline phosphatase and 5-bromo-4-chloro-3-indolyl phosphate (BCIP) / nitro blue tetrazolium (NBT) as substrates. Spots were quantified using an ELISpot Bioreader 6000 (BIOSYS).
Transmission electron microscopy
CD19+CD21+CD35+ and CD19+CD21-CD35- cells were sorted on a BD FACSAria cell sorter and placed in 2% paraformaldehyde/2.5% glutaraldehyde in 0.1 M sodium cacodylate fixative buffer, post-fixed with 1% osmium tetroxide followed by 2% uranyl acetate, dehydrated through a graded series of ethanol and embedded in LX112 resin (LADD Research Industries, Burlington, VT). Ultrathin sections were cut using a Leica Ultracut UC7 ultramicrotome (Leica Microsystems) and stained with uranyl acetate followed by lead citrate. Grids were examined on a JEOL 1400EX transmission electron microscope at 120 kV. Images were acquired using Gatan Microscopy Suite Software version 2 (Gatan) and analyzed using ImageJ software version 1.52a (NIH, USA, https://imagej.nih.gov/ij/).
RNA sequencing
Thymic B cells (CD19+CD21+CD35+ and CD19+CD21-CD35-) and cord blood and adult blood CD19+ B cells were sorted as described above and placed in lysis buffer (Qiagen). Total RNA extraction was performed using the RNeasy Micro Kit (Qiagen) according to the manufacturer’s instructions. RNA QC was done using TapeStation Analysis Software version A.02.02 (Agilent Technologies). Library preparation was performed using the NEBNext Ultra RNA Library Preparation kit with the PolyA selection workflow. Libraries were sequenced on an Illumina HiSeq 4000 using a 2 × 150 bp paired-end configuration. Raw sequence data were converted into fastq files and de-multiplexed using Illumina's bcl2fastq version 2.17. QC of raw reads was performed using FASTQC. Reads were mapped to the reference human genome using STAR aligner v2.5.2b. RNA sequencing data analysis was performed in R version 3.6.0 56 with the sva package for batch correction 57 and the DESeq2 package for differential expression analysis 58. Plots were generated using the ggplot2 package 59. Gene set enrichment pathway analysis was performed using GSEA software 60,61.
Proteomics
Sorted CD19+CD21+CD35+ and CD19+CD21-CD35- cells were analyzed using the in-StageTip (iST) method according to the manufacturer’s instructions 62. Cells were lysed, proteins were digested, and the resulting peptides were submitted to MS/MS. The peptides were identified and analyzed using the MaxQuant software package (available at http://maxquant.net/maxquant/), and the Perseus software platform was used for statistical analysis. Results are expressed as label-free quantification (LFQ) intensity. Differences were considered statistically significant when p<0.05.
Single cell RNA sequencing
Single cell RNA sequencing of sorted CD21-CD35-CD19+ thymic B cells was conducted using Chromium Single Cell 3′ Reagent Kits v2 (10X Genomics) according to the manufacturer’s instructions. The libraries were quantified using KAPA hgDNA Quantification and QC Kit (Kapa Biosystems) and sequenced on a NovaSeq 6000 (Illumina). Following sequencing, the raw data from each sample were demultiplexed and aligned to the GRCh38-1.2.0 human reference genome, and UMI counts were quantified using the 10X Genomics Cell Ranger pipeline (v2.1.1, 10X Genomics). Data analysis was then continued with the filtered barcode matrix files using the Seurat package version 3.0 63 in R version 3.6.0 56. For the initial QC step, we filtered out cells that expressed < 200 genes or > 4000 genes in both donor samples, as well as any cell with > 4% mitochondrial transcript content, leaving 2,410 cells from donor 1 (4 days old) and 2,494 cells from donor 2 (4 months old). Gene expression values for each cell were log-normalized using a scale factor of 10,000. The most variable genes in each donor sample were identified using the FindVariableFeatures function of the Seurat package and were then used to integrate the two samples as described elsewhere 63. The integrated dataset of 4,904 cells was scaled and centered, and principal component analysis (PCA) was run on the combined dataset. Based on the elbow plot (PCElbowPlot), we selected for clustering the number of principal components (PCs) at which the standard deviation of the PCs reached its baseline. The cells were clustered into sub-populations using Seurat’s implementation of a shared nearest neighbor modularity optimization-based clustering algorithm (Louvain’s original algorithm). Cell clusters were visualized using UMAP 64.
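The QC thresholds and log-normalization described above can be illustrated with a minimal Python sketch. This is not the Seurat code used in the study: the per-cell data layout (a dictionary of gene symbol to UMI count), the function names, and the use of the "MT-" prefix to identify mitochondrial genes are assumptions made for illustration only.

```python
import math

# Thresholds from the text; everything else here is illustrative.
MIN_GENES = 200            # cells expressing fewer genes are removed
MAX_GENES = 4000           # cells expressing more genes are removed
MAX_MITO_FRACTION = 0.04   # cells above 4% mitochondrial content are removed
SCALE_FACTOR = 10_000


def passes_qc(counts):
    """Return True if a cell (gene -> UMI count) passes both QC filters."""
    total = sum(counts.values())
    if total == 0:
        return False
    n_genes = sum(1 for c in counts.values() if c > 0)
    # Assumption: mitochondrial genes are identified by the "MT-" prefix.
    mito = sum(c for g, c in counts.items() if g.startswith("MT-"))
    return MIN_GENES <= n_genes <= MAX_GENES and mito / total <= MAX_MITO_FRACTION


def log_normalize(counts):
    """Scale counts to a total of 10,000 per cell, then apply log(1 + x)."""
    total = sum(counts.values())
    return {g: math.log1p(c / total * SCALE_FACTOR) for g, c in counts.items()}
```

In practice these steps are applied to the whole filtered barcode matrix at once; the per-cell form is used here only to make each threshold explicit.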
For differential gene expression, we used the model-based analysis of single-cell transcriptomics (MAST) test 65 (log fold change ≥ 0.25), and only genes with an adjusted p-value < 0.05 were used for further GSEA analysis (as described above). RNA velocity analysis was performed using the Velocyto pipeline 32 and integrated into the Seurat analysis.
BCR sequencing
CD21+CD35+, CD21-CD35- and CD21-CD35-CD138+ cell populations were sorted as described above. DNA extraction was performed using the DNeasy Blood and Tissue Kit (Qiagen) following the manufacturer’s instructions, and samples were eluted in TE buffer. IGHV sequencing was performed by Adaptive Biotechnologies using the ImmunoSEQ survey level assay. The assay uses 86 primers for the IGHV gene segment, 15 primers for the IGHD gene segment and 7 primers for the IGHJ gene segment 66. This generated a fragment capable of identifying the entire spectrum of unique VDJ combinations including functional genes, pseudogenes, and open reading frames. Next, amplicons were sequenced using the Illumina HiSeq platform. The resulting 130 bp sequences permitted inference of the corresponding germline sequences 66, and are denoted IGH–VDJ transcripts. A suite of custom algorithms has been developed by Adaptive Biotechnologies to verify, collapse, align and catalog the CDR3 sequences. To assess and remove PCR bias from the multiplex PCR assay, a synthetic immune system with all possible V–J combinations was precisely quantitated as described elsewhere 67. The data were subsequently analyzed and visualized using the ImmunoSEQ analyzer provided by Adaptive Biotechnologies.
Recombinant antibody cloning and expression
CD19+CD21+CD35+, CD19+CD21-CD35- and CD19+CD21-CD35-CD138+ thymic cells were sorted as described above into 384-well plates containing hypotonic lysis buffer 68,69 with 10 nM TRIS and 0.75 units ml⁻¹ of RNasin Plus (Qiagen) at 4 °C and stored at -80 °C. Next, cDNA was generated using the High-capacity cDNA generation kit (Applied Biosystems) following the manufacturer’s instructions. Multiplexed PCR was performed using primers specific for IGHV, IGKV, IGHC, IGKC, and IGLC gene segments. Resulting amplicons were used as templates for semi-nested PCR to isolate the heavy and light chain genes and to incorporate modifications to allow ligation-independent cloning into expression vectors 70. Resulting amplicons were inserted into mammalian expression plasmids containing the IGG1, IGKC, or IGLC gene sequence. Recombinant antibodies were then generated by co-transfection of plasmids encoding Ig heavy and light chain pairs into 293FS cells (Invitrogen) using standard polyethylenimine transfection methods 71.
Quantitation of Ig clone supernatants
The concentration of IgG in the supernatant of 293 cells transfected with plasmids encoding Ig heavy and light chain pairs was assessed by ELISA using the Human IgG ELISA Quantitation kit (E80-104, Bethyl Laboratories) following the manufacturer’s instructions. Optical density was read at 450 nm using a BioTek Synergy H1 plate reader. Concentration was calculated from the standard curve for human reference serum and expressed in ng per ml.
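As an illustration of reading a concentration off a standard curve, the following Python sketch interpolates between bracketing standards on a log-concentration scale. The text does not state which curve-fitting model was used, so this piecewise interpolation is a stand-in chosen for simplicity rather than the study's procedure; the function name and the standard values in the usage note are hypothetical.

```python
import math


def interpolate_concentration(od, standards):
    """Estimate IgG concentration (ng per ml) from an OD450 reading.

    standards: list of (concentration_ng_per_ml, od) pairs from the
    reference serum dilution series (hypothetical values in this sketch).
    Interpolates linearly in log10(concentration) between the two
    bracketing standards; returns None if the reading lies outside the curve.
    """
    pts = sorted(standards, key=lambda p: p[1])  # order by optical density
    for (c_lo, od_lo), (c_hi, od_hi) in zip(pts, pts[1:]):
        if od_lo <= od <= od_hi:
            if od_hi == od_lo:
                return float(c_lo)
            frac = (od - od_lo) / (od_hi - od_lo)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    return None
```

For example, with hypothetical standards `[(10, 0.1), (100, 0.5), (1000, 1.5)]`, a reading of 0.5 maps to 100 ng per ml, and a reading above 1.5 returns `None`, signalling that the sample should be diluted and re-assayed rather than extrapolated.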
Bacterial culture
Staphylococcus aureus GP22, Escherichia coli (ATCC 25922) and Klebsiella pneumoniae KP3572 were picked from plated colonies and grown overnight in 2 ml of TSB at 37 °C. Haemophilus influenzae (ATCC 19418) was grown in a lawn on chocolate plates overnight at 37 °C. Bacterial cultures were sub-cultured or re-suspended in TSB until the OD was approximately 0.35, and 10 ml of these suspensions were centrifuged at 4,000 × g at 4 °C. Pellets were re-suspended in TSB at 10⁹ CFU ml⁻¹.
Bacterial reactivity of IgG clone supernatants
IgG clone supernatants were assessed for reactivity to the four bacterial species described above. Briefly, each IgG clone supernatant was incubated with each bacterial culture for 30 min at 37 °C in 96-well cell culture plates (Corning Incorporated). After washing, bacteria were incubated with FITC-conjugated anti-human IgG (Thermo Fisher Scientific) for 30 min at RT. After washing, bacteria were fixed in 10% formalin and acquired on a BD LSRFortessa flow cytometer with high-throughput sampling capabilities. Flow cytometry data were analyzed as described above.
Immunofluorescence of paraffin-embedded thymus sections
Tissue sections were deparaffinized in xylene for 10 min, washed with 100% ethanol followed by 95%, 80%, 70% and 50% ethanol, and then rinsed in distilled water. Samples were processed for antigen retrieval, blocking and staining following the Opal Multiplex IHC protocol (PerkinElmer) as described elsewhere 73. Anti-CD19 (clone BT51E, NCL-L-CD19-163, Leica Biosystems), anti-CD31 (clone C31.3 + JC/70A, ab199012, Abcam), anti-cytokeratin (clone PCK-26, ab6401, Abcam) and anti-CD138 (clone MI15, PA0088, Leica Biosystems) were titrated and used as primary antibodies. Finally, after DAPI staining, slides were mounted with VECTASHIELD antifade mounting media (Vector Laboratories). Images were taken using the Vectra 3.0 Automated Quantitative Pathology Imaging System (PerkinElmer) and inForm cell analysis software (PerkinElmer). Images were evaluated and validated by an experienced pathologist.
Statistical analysis
The results are presented as mean ± SD and/or normalized z-score unless otherwise specified in the figure legends. Population size is described in the figure legends. All statistical analyses were performed using GraphPad Prism software version 7.0. Differences were considered statistically significant when p<0.05 using a paired or unpaired t-test. For RNA-Seq, differences were considered statistically significant when the Benjamini-Hochberg adjusted p-value was < 0.05 and the absolute log2 fold change was > 1. Results of statistical tests are listed in Supplementary Tables 1, 2, 3, 4, and 5.
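The RNA-Seq significance criterion can be sketched in Python as follows, reading it in the conventional direction (Benjamini-Hochberg adjusted p-value below 0.05 together with an absolute log2 fold change above 1). This is a generic implementation of the Benjamini-Hochberg procedure, not the internals of the differential expression package used in the study; the function names and the input layout are assumptions for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted
    # values never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def significant_genes(results, alpha=0.05, min_abs_log2fc=1.0):
    """Select genes meeting both thresholds.

    results: list of (gene, log2_fold_change, raw_p_value) tuples
    (a hypothetical layout chosen for this sketch).
    """
    padj = benjamini_hochberg([p for _, _, p in results])
    return [
        gene
        for (gene, lfc, _), q in zip(results, padj)
        if q < alpha and abs(lfc) > min_abs_log2fc
    ]
```

Applying both thresholds jointly keeps genes that are statistically supported and also change at least two-fold in either direction.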
Data availability
The processed data and transcriptome datasets for RNA-Seq and single-cell RNA-Seq generated during this study are available on NCBI GEO under accession numbers GSE152453 and GSE153117, respectively. DNA sequencing data used for BCR repertoire analysis can be accessed through the Adaptive Biotechnologies ImmunoSEQ Analyzer.
Code availability
Customized scripts generated for this study are available from the corresponding author on request.