Cell purity and viability assessments
The purity of the CD4+ cells from both tissues was ~95% (Supplementary Figures S1-3, Additional File 1). The CD8+ populations showed more variable purity. The thymic CD8+ T cells, isolated by negative enrichment, reached ~95% purity (Supplementary Figure S4, Additional File 1). The positive selection assay for CD8a used on peripheral blood performed better in adult than in infant blood, with purities of 95% and 75%, respectively (Supplementary Figures S6 and S5, Additional File 1). Staining the sorted CD8a+ cells for CD3 showed that >90% of the CD8 T cells were CD3+ (Supplementary Figure S7, Additional File 1), suggesting that a small portion of the CD8a+ cells could be NK cells, immature thymocytes or other CD8a+CD3- cells. CD3+ NKT cells may also be present, but presumably in small numbers, as NKT cells constitute 1% of all peripheral blood T cells (18). We detected suspected double-positive CD4+CD8+ thymocytes in the CD4+ thymocyte population (Supplementary Figure S1, Additional File 1), and vice versa (about 10%) (Supplementary Figure S4, Additional File 1). In infant blood, we observed 2% CD4+ cells in the CD8+ population (Supplementary Figure S5, Additional File 1), while in adult blood we observed 5% CD4+ cells in the CD8+ population (Supplementary Figure S6, Additional File 1). We also found traces of CD8+ T cells in the isolated CD4+ T cells. This was seen, to a lesser extent, in CD4+ adult blood (~2% CD8+ cells, Supplementary Figure S3, Additional File 1). The viability differed between sample subsets. For CD4+ T cells, the thymic samples had a higher average viability (88%) than blood (77%), while the average viability of CD8+ cells was 63% from thymus and 71% from blood (data not shown).
Descriptive statistics
Figure 1 provides a graphical overview of the experimental design and workflow. For the SP CD4+ and CD8+ T cells from infant thymus and blood, we used 3-5 biological replicates (ages 5 days – 15 months), while peripheral blood CD4+ and CD8+ T cells from adults were pooled from five individuals (23-45 years). Across all 18 transcriptome profiles generated, the sequencing depth ranged from 69 to 122 M reads (Supplementary Table S1, Additional File 2). However, the sequencing data, particularly from the CD8+ T cells, contained a considerable proportion of multimapping reads (28-86%). After excluding multimapping reads from further analysis, estimated library sizes considered satisfactory for detecting DE genes (>10 M) (19) remained for 14 of the 18 samples (range across all samples: 4-67 M, median: 49 M).
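The library-size check described above can be sketched as follows. This is a minimal illustration and not the pipeline used in the study: the function name, the sample labels and the read counts are hypothetical, and only the 10 M threshold is taken from the text. It also assumes, for simplicity, that every read that is not multimapping counts towards the usable library.

```python
MIN_LIBRARY_SIZE = 10_000_000  # threshold cited in the text (>10 M reads)


def usable_library_size(total_reads, multimap_fraction):
    """Reads remaining after multimapping reads are discarded.

    Simplifying assumption: all non-multimapping reads are usable.
    """
    if not 0.0 <= multimap_fraction <= 1.0:
        raise ValueError("multimap_fraction must be between 0 and 1")
    return total_reads * (1.0 - multimap_fraction)


# Hypothetical samples: (total reads, fraction of multimapping reads)
samples = {
    "sample_A": (100_000_000, 0.30),
    "sample_B": (70_000_000, 0.86),
}

# Flag each sample by whether its usable library exceeds the threshold
passing = {
    name: usable_library_size(total, fraction) > MIN_LIBRARY_SIZE
    for name, (total, fraction) in samples.items()
}

for name, ok in passing.items():
    print(name, "passes" if ok else "below threshold")
```

With these made-up numbers, a deeply sequenced sample can still fall below the threshold when most of its reads are multimapping, which is the situation the text describes for 4 of the 18 samples.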
The thymic and peripheral blood T cell transcriptome
RNA-seq of human CD4+ and CD8+ T cells, derived from infant thymus as well as from infant and adult peripheral blood, detected 44,282 known coding transcripts (Figure 2A). In addition, 19,116 potentially novel alternative transcripts, 242 novel long non-coding RNA (lncRNA) and 153 novel transcripts of uncertain coding potential (TUCP) were uncovered. The novel alternative transcripts displayed the largest range in number of exons, with 26.5% of the transcripts exceeding 20 exons (Supplementary Figure S1A, Additional File 3), showed a high coding probability (median 0.99, Supplementary Figure S1B, Additional File 3), and comprised the longest transcripts, with 30% exceeding 10 kb (Supplementary Figure S1C, Additional File 3). The median coding probability was also high for the generally shorter TUCP (0.67), while it was very low (0.004) for the novel lncRNA. Both TUCP and lncRNA had a median of two exons. When thymic SP T cells were investigated exclusively, 39,965 known transcripts, 20,764 potentially novel alternative transcripts, 252 potentially novel lncRNA and 171 transcripts of uncertain coding potential were detected (Supplementary Figure S1D, Additional File 3). Infant CD4+ T cells of blood and thymic origin presented similar numbers of detected transcripts, while for the CD8+ T cells, the infant blood-derived cells displayed ~30% fewer transcripts than the thymic T cells (Table 1). The adult blood-derived T cells consistently showed the fewest detected transcripts.
Genes expressed in T cells from human thymus and blood
RNA-seq of the primary T cell subsets from human thymus and blood identified transcripts from 18,218 known genes in total, after filtering out lowly expressed genes (<1 count per million) (Supplementary Figure S2, Additional File 3). Of these, 14,441 (79%) were protein coding (representing 61% of Ensembl protein coding genes), 2,501 were lncRNA, 944 pseudogenes and 332 non-coding RNA (ncRNA). A multidimensional scaling (MDS) plot of the transcriptomes (Figure 2B) revealed that the samples were separated by tissue in the first dimension and by cell type in the second dimension. Both thymic SP CD4+ (Figure 2C) and CD8+ T cells (Figure 2D) showed more uniquely expressed genes (average gene expression FPKM >2 for the replicates) than the blood-derived T cells from infants or adults. More expressed genes were shared between thymic CD4+ and thymic CD8+ T cells than between infant blood and thymic T cells of the same cell population (Supplementary Figure S3A, Additional File 3). This pattern also held for genes associated with autoimmune diseases (Supplementary Figure S3B, Additional File 3).
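The low-expression filter mentioned at the start of this section can be illustrated with a short sketch. The text only states the cutoff (1 count per million); the number of samples that must reach the cutoff, the function names, the gene labels and the counts below are all assumptions made for illustration.

```python
def counts_per_million(counts, library_sizes):
    """Scale raw read counts to counts per million (CPM), sample by sample."""
    return [c * 1_000_000 / size for c, size in zip(counts, library_sizes)]


def keep_gene(cpm_values, min_cpm=1.0, min_samples=1):
    """Keep a gene if at least `min_samples` samples reach `min_cpm`.

    `min_samples` is an assumed parameter; the text does not state it.
    """
    return sum(value >= min_cpm for value in cpm_values) >= min_samples


# Hypothetical library sizes (reads per sample) and raw counts per gene
library_sizes = [20_000_000, 10_000_000]
raw_counts = {
    "GENE_A": [50, 0],  # 2.5 and 0.0 CPM
    "GENE_B": [5, 8],   # 0.25 and 0.8 CPM
}

# Retain only the genes that pass the CPM filter
kept = [
    gene
    for gene, counts in raw_counts.items()
    if keep_gene(counts_per_million(counts, library_sizes))
]
print(kept)
```

Normalising by library size before applying the cutoff matters here because the usable library sizes in this dataset differ by more than tenfold between samples.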
Genes associated with autoimmune diseases
Of 555 loci associated with autoimmune diseases (AID; GWAS catalogue Nov 2015, P < 5 × 10^-8), the majority were expressed in our T cell datasets. Only 123 (22.2%) of the annotated genes were not detected (at FPKM >= 2) in either CD4+ or CD8+ T cells from any of the three origins, while more than half of the genes (N=285) were expressed in both T cell populations from all sample types (Supplementary Table S2, Additional File 2). The proportion of AID genes expressed varied across our T cell populations and between the diseases (Figure 3). For each of the AIDs we investigated, at least half of the identified risk genes were expressed. Considering the T cell populations separately, 378 AID-associated genes were expressed by CD4+ T cells of any origin and 421 by CD8+ T cells of any origin (Supplementary Figure S3C-D, Additional File 3). Interestingly, 49 of the 432 expressed AID genes were not expressed in T cells from adult blood (Supplementary Table S2, Additional File 2). Of these, 18 AID risk genes were expressed only in thymic SP T cells, while 20 AID risk genes were detected only in peripheral T cells from children. These 49 loci were mainly associated with inflammatory bowel disease (N=21), multiple sclerosis (N=18), rheumatoid arthritis (N=15) and type 1 diabetes (N=10).
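The proportions quoted in this section follow from simple arithmetic on the reported counts, which can be checked as below. The variable names are illustrative, and the sketch assumes one annotated gene per locus so that all counts share the same denominator, as the percentages in the text imply.

```python
# Counts quoted in the text above
total_aid_loci = 555        # AID-associated loci from the GWAS catalogue
not_detected = 123          # not detected at FPKM >= 2 in any T cell subset
expressed_everywhere = 285  # expressed in both populations, all sample types
absent_in_adult_blood = 49  # expressed, but not in adult blood T cells

# Assumption: one annotated gene per locus
expressed_somewhere = total_aid_loci - not_detected


def percent(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


pct_not_detected = percent(not_detected, total_aid_loci)
pct_everywhere = percent(expressed_everywhere, total_aid_loci)
pct_absent_adult = percent(absent_in_adult_blood, expressed_somewhere)

print(expressed_somewhere)  # 432
print(pct_not_detected)     # 22.2
print(pct_everywhere)       # 51.4
print(pct_absent_adult)     # 11.3
```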
Differential expression was most pronounced between thymus and blood
In both CD4+ and CD8+ T cells, the largest number of differentially expressed genes (DEGs) was found when comparing T cells from thymus with those from infant blood, followed by the comparison of thymus with adult blood (Table 2). Comparing infant with adult blood T cells yielded fewer DEGs. Similarly, when comparing the transcriptomes of CD4+ with CD8+ T cells from the different origins (Table 2), the highest number of DEGs was observed between the two T cell subpopulations in thymus, followed by infant blood and, lastly, adult blood. Volcano plots of DEGs for the pairwise comparisons are shown in Supplementary Figure S4 (Additional File 3), and complete lists of DEGs with expression values for all samples are found in Supplementary Tables S3-11 (Additional File 2).
Clustering all 5,925 DEGs from the comparisons revealed that the subsets clustered according to tissue of origin, then cell type and age, with one major clade for the thymic cells and one major clade for the blood-derived cells (Supplementary Figure S5, Additional File 3). Genes associated with V(D)J recombination and T cell commitment, including RAG2, HES1 and DNTT, were among the top 10 DEGs upregulated in thymic T cells (Figure 4A). In CD8+ infant and adult blood T cells, the top upregulated genes included genes involved in cell migration and lineage commitment (S1PR5, PLEKHG3 and TBX21), while the interleukin receptors IL6R and IL4R, among others, displayed high expression in CD4+ infant and adult peripheral blood T cells.
Differences in gene set enrichment profiles related to developmental stage
The DEGs upregulated in thymic SP CD4+ and CD8+ T cells, compared with infant blood CD4+ and CD8+ T cells, were mainly involved in cell division and proliferation (Figure 5A). The DEGs upregulated in infant blood CD4+ and CD8+ T cells, compared with the equivalent thymic subset, were enriched for multiple immune-related biological processes, such as defense response, cytokine production and intercellular signal transduction, as well as regulation of cell proliferation and differentiation. When comparing infant with adult blood T cells (Figure 5B), the infant blood T cells were enriched for genes involved in proliferation and cell death, in addition to regulation of gene expression and immune system processes. The genes upregulated in adult blood T cells were involved in response to stimulus, immune and defense response, cytokine production and biological adhesion. Comparing CD4+ with CD8+ T cells of the same tissue and age revealed that genes upregulated in thymic CD4+ T cells were heavily involved in chromosome organization and cell cycle, while the enriched GO terms in CD8+ T cells in infant blood were dominated by immune-related processes (Supplementary Figure S6, Additional File 3).
T cell markers for egress, differentiation and migration
Since we have unique material of primary T cells from both thymus and blood from infants, we looked specifically at the expression patterns of genes involved in T cell egress (Figure 6A), migration and differentiation. In general, the CD4+ T cells expressed a wider repertoire of PTPRC transcripts than the CD8+ T cells (Figure 6B). In peripheral blood, the adults showed higher expression of CD45RO transcripts (PTPRC-201) in their CD4+ T cells than the children, while the opposite was observed for the CD45RABC isoform (PTPRC-209). The isoform patterns of CD45 have been less well characterized in CD8+ T cells. In CD8+ T cells, we observed tentative novel isoforms (Figure 6C I and II) that share exons with CD45RABC and were not found to be expressed in CD4+ T cells. In the CD8+ cells, these novel PTPRC transcripts were expressed at levels similar to those of CD45RABC and CD45RO. We also observed that the CD45RB transcripts (PTPRC-203 and PTPRC-214) displayed higher expression in the peripheral blood CD4+ T cells than in the SP CD4+ T cells in the thymus, yet their overall expression was low compared with the RO and RABC isoforms.
We furthermore investigated the CD45RA/RO ratios of the CD4 T cells at the surface protein level using FACS, comparing a thymic sample and blood from the same child with blood samples from two adults aged 30 and 70 years (Supplementary Figure S8, Additional File 1). Like others (5, 20), we observed high amounts of CD45RO in the thymic sample, while the blood sample from the same individual displayed fewer CD45RO-positive and more CD45RA-positive cells. Both adult samples, regardless of age, showed extensive co-expression of CD45RA and CD45RO (43-51%, Supplementary Figure S8, Additional File 1), yet the overall expression of CD45RA was low compared with infant blood. The higher CD45RA expression in infants compared with adults is likely due to a higher proportion of naïve T cells.
Our data suggest that infant CD8+ T cells may express CD8B at a higher level than CD8A, while the opposite was seen in the adult pool of CD8+ T cells (Figure 6D), although the difference was not statistically significant. The expression levels of CD8A and CD8B in the SP thymic T cells were equivalent. We explored the distribution of CD8B isoforms and detected the highest expression of CD8B-201 (ENST00000331469) in SP thymic CD8+ T cells, followed by the blood CD8+ T cells from adults and infants (Supplementary Figure S7, Additional File 3). The most abundant isoform was CD8B-203 (ENST00000390655), mainly expressed by the CD8+ mature thymocytes, followed by the infant blood T cells and, to a lesser degree, the adult CD8+ T cells.
To further investigate differentially expressed genes involved in T cell differentiation and migration, we extracted DEGs associated with the GO terms “lymphocyte migration” (GO:0072676) and “T cell differentiation” (GO:0030217), as well as relevant genes from the literature (Figure 4B). The genes upregulated in thymic T cells included the recombination-activating genes RAG1 and RAG2; genes involved in adhesion and homing: ITGAE (CD103) and CCR9; T lineage commitment: SATB1; cell proliferation: MKI67; and transcriptional regulators involved in T cell development: ID2, SOX4, LEF1 and BCL6. In adult blood T cells, several chemokines, interleukins and their receptors were upregulated: CCL5 (RANTES), IL12RB1, IL10RA, IL32, CCR2 and CCR5; as were genes involved in cell adhesion and migration: ADAM8, ITGB7 and SELPLG; and in lymphocyte function and activation, including SLAMF6, PIK3CD, TXK and NFATC2. Several genes involved in cell adhesion and lymphocyte homing, migration, egress and maturation were upregulated in infant blood T cells: CD69, CD44, SELL (CD62L), CCR7, S1PR1, ITGA6, ITGA5, ITK and TESPA1.