Genetically balanced hESC lines differentiate into RPE cell populations with unique gene-expression signatures containing astrocytes and pigmented ciliary body cells
In this work, we differentiated in-house genetically balanced hESC lines into RPE using an unguided differentiation protocol30(Supplementary Table S1 and Supplementary Figure S1). After 4 to 8 weeks, pigmented areas appeared in the dish, indicating clusters of cells that differentiated spontaneously into RPE. These clusters were picked and passaged to yield a pure RPE cell culture (Fig. 1a and 1b). For this study, we were first interested in establishing if these clusters are of clonal origin, as this would generate a genetic bottleneck, the size of which we could modulate depending on the number of clones picked to establish the final RPE culture. For this, we differentiated an hESC line labelled with the RGB lineage tracing system31. Eight weeks into differentiation, we found that the pigmented clusters of cells were predominantly of clonal origin, as indicated by the expression of the same color within the cluster (Fig. 1c). We next differentiated five genetically balanced hESC lines to RPE, and purified the cultures through colony picking and subsequent passaging. Following two passages, the cultures consisted of homogeneous layers of RPE cells, characterized by their typical cobblestone morphology (Fig. 1d). The cultures were stained for markers of neuroectoderm (PAX6), pluripotency (NANOG) and RPE (PMEL, ZO-1 and BEST1 in Fig. 1e, all lines in Supplementary Figure S2). All RPE cultures were fully negative for NANOG while ZO-1 and PMEL were expressed by all cells. All cultures presented two cell populations, PAX6high/ BEST1low and PAX6low/BEST1high, reflecting different levels of maturation, in line with other studies32,33.
We carried out single-cell RNA sequencing (scRNAseq) for the five hESC-derived RPE cultures, which yielded good quality results for 87168 individual cells, and utilized previously generated scRNAseq data of one hESC line as a reference for the undifferentiated state (N = 1996). The UMAP of all RPE cells shows that each line predominantly remained in its own cluster, with four small clusters originating from multiple cell lines (Fig. 1f). We first assessed the gene expression of RPE cells relative to that of undifferentiated hESC, to test the induction of RPE-related genes and residual pluripotency (Supplementary Figure S3a). We found that the five RPE cultures expressed pigment synthesis, visual cycle, phagocytosis, ion channel and tight junction-associated gene sets (Fig. 1g, Supplementary Table S234). We next integrated our dataset into that of Senabouth et al, that includes scRNAseq of RPE derived from 79 different iPSC lines (Fig. 1h32). The results show that our cells express RPE markers at similar levels to those reported in this large study. In addition, and consistent with the PAX6/BEST1 immunostaining (Fig. 1e), the UMAPs show the presence of two distinct populations of RPE based on their maturation state in both datasets. Importantly, no RPE cells exhibited residual undifferentiated state gene expression (Fig. 1g and Fig. 1h). Taken together, cell morphology, immunostaining results and gene-expression profiling all indicate that our cells were efficiently differentiated into RPE cells.
In the next step of transcriptomic analysis, we analyzed the differences in gene-expression between the clusters of RPE cells originating from different hESC lines. We carried out differential gene expression analysis of each cluster against the other four and identified a set of 1206 genes that drive the differences between lines. The Venn diagram in Fig. 1i shows that the RPE derived from each hESC line expresses a specific set of genes, with few overlaps and each line having between 12 and 32% unique deregulated genes. The heatmap of differentially expressed genes in Fig. 1j illustrates that each cell line has its own transcriptomic signature, with some sets of genes being specifically higher or lower expressed per cell line. Gene set enrichment analysis for each of these gene sets revealed no enrichments for pathways related to RPE functionality or stem cell differentiation (Supplementary Table S3). Plotting the cells in a UMAP after excluding the genes driving the cell-line specific differences reduced the distance between the line-specific clusters, while the small clusters composed of cells from different cell lines remained apparent (Fig. 1k). The observed differences between the RPE are likely of minor significance and may be due to differences in the genetic and epigenetic landscape of individual cell lines35,36.
Next, we focused on determining the identity of the four clusters composed of cells from different lines and that had transcriptomic signatures that differed from RPE cells (Fig. 1l). Cluster 2 represents 1.32% of the cells and contains RPE cells originating from all 5 hESC cell lines. This cluster shows a high expression of proliferation marker MKI67, and cell cycle state analysis showed that 47% of these cells were in G2/M and 53% in S phase, indicating that they are proliferating cells (Fig. 1m, Supplementary Figure S3b and S3c). Using UCell scoring of Seurat, we assigned to each cluster a score related to different neural and ocular cell types (Supplementary Table S437). The RPE clusters of VUB02, VUB04, VUB07, VUB14 and VUB32, as well as cluster 2, show the highest scores for the RPE gene set. Clusters 1 (1.40% of cells, originating from the 5 lines), 3 (0.63% of cells, originating from 3 lines) and 4 (0.29% of cells, originating from the 5 lines) also express RPE related genes, but at lower levels than the main RPE clusters. Conversely, clusters 1 and 3 show high expression of astrocyte markers, and cluster 3 also shows high scores for Schwann and Mueller glia cells. Finally, cluster 4 shows expression of genes marking pigmented ciliary body cells, a cell type related to the RPE lineage (Fig. 1n, Supplementary Figure S3d,).
RPE differentiation is a genetic bottleneck eliminating aneuploid cells except for cells with gains of 1q
We next investigated the genetic composition of the cell cultures at the single-cell level both before and following RPE differentiation. We performed single-cell DNA sequencing (scDNAseq) on three of the five hESC lines, at the start of the differentiation process. We were unable to perform scDNAseq on the two remaining lines and on the differentiated cells due to the discontinuation of the essential 10x Genomics products required for this specific analysis. Therefore, we employed the inferCNV algorithm on the scRNAseq data obtained from RPE cells derived from all five hESC lines to assess the genetic content at the endpoint of differentiation.
The analysis yielded high-quality scDNAseq results for a total of 1668 cells, including 954 cells from VUB02, 274 cells from VUB04, and 440 cells from VUB07. Their karyotypes are illustrated in Fig. 2a-c. Overall, single cells could be broadly categorized into different groups based on their genetic content. These groups included cells with genetically balanced content, cells with chaotic genetic content (defined as multiple abnormalities spanning more than 10% of the genome), and cells with gains or losses of other sizes. We further subdivided the abnormal non-chaotic cells based on whether the CNVs were of a size detectable by inferCNV on scRNAseq, with a detection threshold of 20 Mb. This subdivision was introduced to facilitate comparisons between the outcomes of these two distinct analyses. Figure 2d provides an illustrative representation of diverse cell types, including chaotic cells, cells with gains in chromosome 1q, cells with isochromosome 20, and cells with segmental gains in chromosomes 15q and 20q.
On average, 95% of cells were genetically balanced, with only minor variations observed between the three hESC lines (Fig. 2e). Interestingly, cells with chaotic genetic content were found in all three lines. These chaotic cells were similar to those found in human preimplantation embryos, where they originate from abnormal mitotic events with multipolar spindles38–40. To investigate whether similar abnormal cell divisions occurred in hESC, we conducted live-cell imaging for 24 hours to visualize DNA and microtubules (Supplementary Video 1). Figure 2f shows two still images depicting the progression of a cell through tripolar spindle mitosis, with additional examples available in the supplementary video. This evidence confirmed that hESC undergo abnormal multipolar mitosis, resulting in daughter cells with chaotic genetic content as seen in scDNAseq.
The distribution and location of all gains and losses found in the cells (excluding chaotic cells) are depicted in Fig. 2g, and a comprehensive list of breakpoints is provided in Supplementary Table S5. While cells with a single loss were twice as common as cells with a single gain, overall, gains outnumbered losses (68 gains vs. 40 losses). Notably, all three hESC lines contained single cells with genetic abnormalities known to confer a growth advantage, potentially leading to culture dominance by cell competition. VUB02, for instance, included cells with a trisomy 12, a trisomy 20, a dup(1)(q32.1q44), and a del(18)(q21.2q22.1), whereas VUB04 had cells with a dup(1)(q21.3q44) and an isochromosome 20. VUB07 contained cells with a dup(20)(q11.21q13.2).
Remarkably, the inferCNV analysis at the endpoint of RPE differentiation revealed that, in contrast to undifferentiated hESC, the only genetic imbalance we identified in RPE cells was represented by different-sized gains of chromosome 1q, which were present in VUB04 and VUB07 and distributed across the cell clusters (Figs. 2h-l). VUB04 exhibited 42.8% of RPE cells with a 77.8 Mb gain of 1q21.3q44 and 2.95% of cells with a 20.4 Mb gain of 1q21.2q24.2. For VUB07, 2.66% of the cells displayed an 82.0 Mb gain of 1q21.3q44. Additionally, the clusters representing cells mis-specified as astrocyte and ciliary body-like cells contained 12.8% of cells with the 77.8 and 82.0Mb 1q gain and 28.7% of cells with the shorter 20.4 Mb 1q gain. Notably, cells with 1q gain showed no preference for allocation to the mis-specified cells, suggesting that this aneuploidy may not be the primary driver for their appearance (Fig. 2m).
Our results revealed a remarkable absence of any other aneuploidies in the RPE, which was widely different from undifferentiated hESC. Consequently, we conducted a probability calculation to determine the likelihood that this observation is the result of randomly picking only genetically balanced RPE colonies. To obtain pure RPE cultures, we pick approximately 60 individual colonies per line, which appear to be predominantly clonal in origin (Fig. 1c). Assuming that all aneuploid cells have equal probabilities of successfully differentiating to RPE and have no competitive advantage over genetically balanced cells, and based on the aneuploidies we found in the undifferentiated cells by scDNAseq, the chances of picking at least one aneuploid clone that would have shown in the inferCNV analysis are 47.8% for VUB02, 96% for VUB04 and 89.7% for VUB07. Considering these probabilities, the absence of aneuploidies other than the gain of 1q in RPE cells suggests that not only the differentiation to RPE represents a bottleneck with a strong selective pressure against most aneuploid cells, but that the gain of 1q may be conferring a culture advantage to the cells during this differentiation process.
Gain of 1q results in discreet transcriptional changes in RPE affecting the apoptosis pathway and eye disease-related genes
Since a large proportion of the RPE cells carried a large gain of 1q (RPE1q), we aimed to investigate the potential differences in the transcriptome of individual RPE1q in comparison to their euploid counterparts (RPEwt). Cells with a smaller gain of 1q were excluded from this analysis due to their limited representation in the dataset.
We started by assessing the expression of a comprehensive set of genes typically associated with RPE cells. This initial analysis shows that RPE1q consistently express all expected marker genes (Fig. 3a). The extent of their expression varied, and while all genes except MITF showed statistically significant differences between the two groups, some genes were expressed at higher levels while others at lower levels in RPE1q cells. This suggests that these cells differentiated to bona fide RPE, similarly to the genetically balanced cells.
To gain further insights into the potential impact of the 1q gain on RPE cells, we conducted single cell Gene Set Variation Analysis (scGSVA) focused on the KEGG pathways. Four illustrative examples of pathway outcomes are presented in Fig. 3b (the remaining data is available in the Supplementary Datafile 1). It is noteworthy that although many of these pathways reached high statistical significance, the differences in normalized enrichment scores’ mean were minimal. We sought to explore whether the observed differences in gene expression were consistently linked to the presence of the 1q gain, or if they were influenced by inherent variations between cell lines. For this, we conducted scGSVA analyses on individual cell lines, as well as a comparative analysis of RPE1q and RPEwt cells in VUB04 and VUB07, as shown in Fig. 3c (the remaining data is available in the Supplementary Datafile 1).
In this more detailed comparison, we looked for consistent differences between the two groups, particularly in the context of KEGG pathways. However, it became evident that except for the apoptosis library their profiles did not consistently diverge. These differences were not consistently statistically significant, and the enrichment scores frequently shifted in opposite directions. The cumulative view encompassing all cell lines suggested that the observed variations were primarily attributable to line-to-line variation rather than being directly associated with the presence of the 1q gain.
The apoptosis pathway appeared as an exception, with consistent enrichment scores in the two isogenic pairs of VUB04 and VUB07. This consistency prompted us to look further into the roles of the differentially expressed genes within the apoptosis pathway (Fig. 3d). Our analysis revealed that only a subset of genes in the apoptosis pathway were expressed in RPE cells, and these genes did not all consistently differ between RPE1q and RPEwt cells. Interestingly, three genes involved in the execution of the apoptotic process, PARP1, MCL1 and LMNA, were significantly upregulated in RPE1q cells. These three genes are all located within the region affected by the 1q gain, implying that their increased copy number likely contributed to their elevated expression and therefore to the deregulation of the broader apoptosis pathway. Given that these effectors often have opposing roles, predicting their overall biological impact remains challenging.
Lastly, we extended our analysis by exploring the disease ontology of our gene set in the context of human diseases relevant to RPE cells. Four sets of diseases were selected for analysis (Fig. 3e). The results indicated that RPE cells with a gain of 1q exhibited differential gene expression patterns related to eye diseases, retinal degeneration, eye degenerative diseases, and degeneration of the macula and posterior pole. Upon closer examination, we identified six genes that consistently and significantly differed between RPE1q and RPEwt (Fig. 3f and supplementary table 8). COL18A1 and CYBA exhibited lower expression in RPE1q, while TMCO1, ATP6AP2, RPE65, and HTRA1 displayed higher expression. We explored the published roles of these genes in RPE function and found that their expression pattern did not suggest dysfunction or poor differentiation in RPE1q cells. While downregulation of COL18A1 would be detrimental for RPE 41, downregulation of CYBA42 and upregulation of TMCO1 43, ATP6AP2 44, RPE6545 and HTRA146 would be beneficial for the RPE function and survival.
Co-culture with genetically balanced cells cannot rescue the impaired RPE differentiation of aneuploid hESC except in cells with gain of 1q
In the subsequent experiments, we aimed to test the hypothesis that low-grade mosaic aneuploid hESC observed in genetically normal cultures do not contribute to the differentiated progeny due to their inability to undergo RPE differentiation. For this, we used three wt control lines and nine hESC lines with different aneuploidies recurrently present in hPSCs to undergo RPE differentiation. The differentiations were conducted in 1–7 independent differentiation experiments (the list of lines and replicates can be found in Supplementary Table S6).
After a 12-week differentiation period, we found that only the line with the smallest 1q gain (VUB031q32.21) had pigmented areas characteristic of RPE differentiation, while two hESC lines with a larger 1q gain (VUB031q21.1qter and VUB011q21.1qter) showed no visible pigmentation. Similarly, the lines with gain of 20q11.21, isochromosome 20q, gain of 17q, and derivative chromosomes involving the loss of 18q consistently failed to differentiate into RPE (Fig. 4a shows a representative example, additional images can be found in Supplementary figure S4). At the colony picking stage for RPE purification, we analyzed the mRNA expression levels of RPE markers RP65, BEST1, PAX6, and the undifferentiated state marker POU5F1. The results indicated that genetically balanced control lines induced the RPE gene-expression signature, whereas aneuploid cell lines either did not initiate the expression of these genes or did so significantly less than control cells (Fig. 4b). Collectively, these findings demonstrated that, except for the small focal gain of 1q32.21, the tested genetic abnormalities impeded the correct differentiation of hESC into RPE cells.
Given that in our first experiments we had identified correctly differentiated RPE harboring large gains of the 1q arm mixed into the genetically normal cells, we decided to carry out the differentiation of the 1q cells in co-culture with genetically balanced cells, which we hypothesized would promote the correct differentiation of the 1q cells. In the co-culture experiments, we introduced between 0.2 to 0.5% of two fluorescently labeled hESC lines with a gain of 1q alongside genetically balanced isogenic cells (N = 6 replicates). The differentiation progression was monitored by regular live fluorescent confocal imaging, which showed that 1q cells rapidly started outcompeting their genetically balanced counterparts in the dish (full dish images are shown in Fig. 4c). Pigmented colonies were picked at 12 weeks after initiation of differentiation, expanded, and passaged, and we used flow cytometry to quantify and to isolate the RPE1q cells. The results show that at the endpoint of differentiation, the cell population with a gain of 1q had increased between 45 to 124-fold, accounting for 9%-24.9% of the differentiated cells. Moreover, these RPE cells expressed the RPE markers ZO-1, PMEL, PAX6 and BEST1 similarly to control cells (Fig. 4f), and were harvested for bulk RNA sequencing, along with the RPE obtained from VUB031q32.21 (shown in Figure S4).
Finally, we tested whether genetically balanced cells could rescue the differentiation of aneuploid hESC other than those with 1q gain. We mixed 1% of GFP-expressing hESC with a gain of 20q11.21 or with an isochromosome 20q with its genetically balanced and unlabelled counterpart and subjected them to RPE differentiation. We regularly imaged the cells and observed a steady increase in fluorescent signal until the pigmented colonies started appearing, after which the signal decreased (Fig. 4d). This suggested that the fraction of 20q11.21 and isochromosome 20q cells initially increased during the first part of spontaneous differentiation to then decrease upon RPE specification. We collected an entire dish, prior to RPE colony picking and tested for the presence of cells with a gain of 20q using a copy number assay. This revealed that in both cell competitions the mutant cells had taken over most of the dish, while silencing the fluorescent protein construct. We purified the RPE by colony picking and cultured until obtaining a homogeneous population. The cells were stained for ZO-1 and PMEL, and co-imaged for GFP to identify the 20q11.21 or isochromosome 20q cells (Fig. 4e). The results show that, respectively, 3 and 5% of the RPE cells consists of GFP+ aneuploid cells and that they do not express PMEL in the same manner as their genetically balanced counterparts, and that they represent a contamination of the pure RPE cell population derived from the control cells. Taken together, this suggests that both 20q11.21 and isochromosome 20q cells overgrow the genetically balanced cells during spontaneous differentiation, but do not specify to RPE but to an alternate cell fate.
RPE1q show transcriptomic signatures of aneuploidy-related stress
We next studied by bulk RNA sequencing the transcriptome of the three RPE1q cell lines we isolated in the previous experiments and compared it to that of RPE from control cells. We included triplicates for each RPE1q line (N = 9 samples) and ten control samples (5 hESC lines, 2 replicates per line). The principal component analysis (PCA) (Fig. 5a) revealed a closer clustering pattern among RPE1q samples, suggesting potential differences in their transcriptomic profiles. We first focused on examining the induction patterns of RPE markers in both groups compared to undifferentiated hESC. The results, shown in Fig. 5c, showed similar induction for the RPE differentiation genes, along with the simultaneous downregulation of pluripotent-state markers, in both control and 1q cells. This was in line with their similar RPE morphology and the stainings shown in Fig. 4d and confirmed that hESC1q do have the ability to generate RPE cells similarly to their genetically balanced counterparts, though only when in co-culture with them.
We further explored potential differences between the two groups by carrying out differential gene expression. The results show that RPE1q have higher expression of 372 genes and lower expression of 1183 genes, as compared to their RPEwt counterparts (Fig. 5b). Gene set enrichment analysis (GSEA) showed that the RPE1q cells had, in line with the findings in the scRNAseq, deregulation of the apoptosis pathway (Fig. 5d, Supplementary Datafile 2 and 3). Other pathways that were enriched were the unfolded protein response, DNA repair and p53 pathways. These pathways are known to associate to intracellular stress response due to aneuploidy47–50, and may contributing to the deregulation of apoptosis-related genes we see here and in the scRNAseq data. It is interesting to note the negative enrichment score for the epithelial-to-mesenchymal transition (EMT) genes, an important process in RPE maturation33,51. RPE1q also showed an overall diminished expression of collagen-related genes, including genes not only the collagen genes themselves but also genes involved in their biosynthesis, trimerization and degradation (Fig. 5e, Supplementary Datafile 3).
Co-culture of 1q and wt cells rescues the differentiation trajectory within two days after growth factor removal
We hypothesized that the impact of the genetically balanced cells on differentiation of 1q cells when in co-culture would be detectable in the first days after growth factor withdrawal, as cell fate decisions are taken very early in the differentiation process. To test our hypothesis, we carried out scRNA sequencing of undifferentiated and spontaneously differentiating cells (day 1 and day 2), including a wt hESC line and its 1q-mutant counterpart, on their own and in a 1:9 1q:wt co-culture. We first controlled the karyotypes of the cells by inferCNV, and found that a small fraction of the wt cells carried a trisomy 20, and consequently the co-culture contained wt, 1q and trisomy 20 cells (Supplementary figure S5). Seurat analysis yielded 7 clusters (Fig. 6a). Differential gene expression analysis showed that this early after FGF2 and TGFBeta withdrawal there was only a very modest induction of differentiation-associated genes in two clusters, most of them of the ectodermal lineage (Fig. 6b). All cells still expressed high levels of pluripotency-associated markers, with none showing expression of canonical early differentiation markers such as TBXT, NES, or PAX6, while some cells did express ectodermal markers such as OTX2 and MAP2 (Fig. 6c). The location of the cells by day of differentiation and by karyotype are shown in Figs. 6d and e, and the composition of each cluster is plotted in Fig. 6f. Overall, one cluster was composed mostly of undifferentiated cells, both wt alone and the co-cultured cells, and another cluster of the undifferentiated trisomy 20 cells. The undifferentiated 1q cells resided in both clusters 1 and 2. The two clusters that induced predominantly ectoderm-associated genes were composed exclusively of day 1 and day 2 cells and were considered the standard route towards neuroectodermal differentiation. Two other clusters were composed of cells from the three days and had expression profiles that were predominantly of undifferentiated cells and with induction of genes that could not be associated to any specific lineage. The 1q cells, when cultured alone, remained in clusters 1 and 2 over the two days after growth factor withdrawal, and trisomy 20 cells transitioned from their undifferentiated cluster to cluster 1. This was in strong contrast to the trajectory of the wt cells, that transitioned from the undifferentiated cell cluster, and to day 1 and then day 2 of ectoderm induction. Remarkably, the 1q cells, when co-cultured, had one part of the cells following the same ectodermal trajectory as wt cells, and the other part transitioning to cluster 2 (Fig. 6g). These results prove that the change in differentiation trajectory induced by the co-culture of 1q cells with wt cells is occurring very early in differentiation, and further supports the findings that the differentiation of other mutant cells (in this case trisomy 20) are not rescued by the co-culture.
Finally, we analyzed ligand-receptor expression patterns in the two scRNAseq data sets. Figure 6h shows the cell-to-cell communication between wt cells and 1q and T20 cells, from the undifferentiated state to day 2, as well as between wt and 1q cells in the purified RPE state (Fig. 6i, full lists can be found in the Supplementary Datafile 4 and 5). We filtered for ligand-receptor pairs in which the ligands are provided exclusively by the wt cells, and the receptors are located on the 1q cells, as we hypothesized that wt cells secrete ligands that the 1q cells do not, and that are key to the differentiation process. From this analysis, two candidate networks emerge as the most interesting candidates: MFGE8 and collagens secreted by the wt cells, binding to the integrin receptors of the 1q cells, and the interaction between NLGN1, predominantly secreted by the wt, and NRXN1 and NRXN3 in the 1q cells. Figure 6j illustrates the progression of the expression of these genes in 1q cells differentiating alone or in co-culture, as well as of wt cells, and show how co-cultured 1q cells closely follow the expression pattern of wt cells, in contrast as when differentiating alone. Figure 6i shows two examples of the lower expression of collagen genes in RPE1q as compared to RPEwt. MFGE8 is known to regulate EMT in different cell types, which in turn is a key process in hPSC differentiation52,53, and collagen is essential for the formation of an extracellular matrix promoting correct RPE differentiation54,55. This is in line with the observation that both the EMT and the collagen biosynthesis appeared negatively enriched in the bulk RNA sequencing of the RPE (Fig. 5d, e). NLGN1, NRXN1 and NRXN3 are genes known to be involved in late neuroectodermal development 56, but in our dataset also appear to be key regulators of the early spontaneous differentiation process. NRXN1 is for instance one of the top induced genes in the ectodermal clusters of day 1 and day 2 of differentiation (Fig. 6b). Together, this suggests these gene networks may be part of the mechanisms behind the impairment of 1q differentiation, as well as the rescue by the wt cells.