ESC-derived primNSCs show tumorigenicity
The ‘neural default state’ of ESCs prompted us to investigate whether ESC-derived NSCs exhibit tumorigenicity. As reported, mouse ESCs (mESCs) differentiated into neuroectodermal cells, the primNSCs [24, 25, 28], after six days of culture in serum-free medium, which formed floating neurospheres (Fig. 1A). Immunofluorescence (IF) did not reveal significant expression of ESC markers Myc and Oct4 (Fig. 1B), whereas NSC markers Msi1, Sox1 and Pax3 were detected in neurospheres (Fig. 1B), confirming the gain of NSC fate in mESCs. We then tested whether primNSCs were also tumorigenic. When 1×10^6 cells of each type were injected, both mESCs and primNSCs formed tumors in all injected nude mice (Additional file 1: Fig. S1A and Table S1), but the growth rate and weight of mESC-derived tumors were lower than those of primNSC-derived tumors (Additional file 1: Fig. S1B and C). To confirm whether primNSCs have stronger tumor-initiating capacity, graded numbers of cells were injected. Indeed, far fewer primNSCs than ESCs were required for tumor initiation (Additional file 1: Table S2). The results suggest that tumor formation by primNSCs is unlikely to be due to residual undifferentiated mESCs, and that primNSCs are more tumorigenic than mESCs. PrimNSC-derived tumors (Fig. 1C) showed a wide spectrum of phenotypically identifiable tissue types (Fig. 1D), e.g. the neural tissue and keratinized structures derived from ectoderm (Fig. 1D), the endodermal tissues including glandular and gut-like epithelia (Fig. 1D), and mesodermal tissues such as adipocytes, cartilage and muscle (Fig. 1D). Quantitative RT-PCR (RT-qPCR) revealed that genes representing mesodermal (T, Kdr), endodermal (Foxa2, Gata4/6, Sox17) and ectodermal (Fgf5, Sox11 for primitive ectoderm) differentiation were generally activated in tumors (Fig. 1E). Neuronal differentiation also occurred, as shown by activation of neuronal genes in tumors (Fig. 1F) and the presence of neuropil structure adjacent to immature neuroepithelium (Fig. 1D).
Besides, a series of genes representing neural stemness were significantly upregulated in tumor compared with primNSCs (Fig. 1G). Neural stemness genes, which are responsible for self-renewal and differentiation of NSCs/NPCs, play cancer-promoting roles or are upregulated during tumorigenesis [10, 12]. Upregulation of these genes suggests a continuing promotion of tumor growth. This fashion of tissue differentiation in primNSC-derived tumor is reminiscent of ESC-derived teratoma/teratocarcinoma. Hence, primNSCs have tumorigenicity and pluripotent differentiation potential.
Embryonic NSCs/NPCs also show tumorigenicity and differentiation potential for non-neural tissues
We tested whether NSCs/NPCs at later stages of embryonic neural development are still tumorigenic and display differentiation potential. NE-4C is an NSC line derived from cerebral vesicles of mouse E9 embryos. The cells injected subcutaneously also formed tumors in nude mice (Fig. 2A; Additional file 1: Table S1). Moreover, injection of the cells via the tail vein caused tumor formation in different areas of the animal body, e.g. abdominal cavity, legs, etc. (Additional file 1: Fig. S2A and B), suggesting that the cells have both tumorigenic and metastatic potential, and that the tumorigenicity is not restricted to a specific environment. Histological examination showed that the tumors contained large areas of blue-staining, immature neuroepithelium-like cells, and a variety of other tissue types (Fig. 2B). There was differentiated nerve tissue (neuropil), mesodermal derivatives such as blood vessels and osteoid tissue, as well as endodermal derivatives including gut-like epithelium (Fig. 2B). Compared with NE-4C cells, the tumor showed a general tendency of upregulation of a series of genes representing neural stemness (Fig. 2C), genes representing neuronal differentiation (Fig. 2D), and genes representing mesodermal (Acta2, T, Desmin, Kdr) and endodermal tissue (Afp, Foxa2) differentiation (Fig. 2E). Immunohistochemistry (IHC) demonstrated strong expression of neural stemness markers Pax6, Sox1 and Sox9 in cells characteristic of undifferentiated neuroepithelial cells, and the neuronal marker Map2 in areas with obvious features of differentiated neural tissue (Fig. 2F). The tumor also expressed the osteoblast marker Bglap and the osteoclast markers Acp5 and Ctsk (Fig. 2F). Ctsk is also a marker for macrophage differentiation. The endodermal marker Afp was clearly detected (Fig. 2F). Although tissue types of all germ layers could be observed, tissue differentiation and diversity appeared less abundant than in tumors generated from primNSCs.
These results demonstrate that NSCs derived from early brain vesicle have differentiation potential for non-neural lineages. Besides the general roles of neural stemness genes in promoting cancer, expression of Acta2, Desmin or Kdr is frequently observed during cancer progression, and Afp is used as a marker for cancers of the liver, testicles and ovaries. Therefore, this fashion of tissue differentiation and gene expression change is similar to what occurs in cancers.
Next we examined the NPCs from mouse embryonic cortices at E13.5, at which stage the NPCs are more differentiated than those at E9. Disaggregated cortical cells formed neurospheres in serum-free medium, which expressed neural stemness markers Sox1 and Pax3 (Fig. 3A). These cells formed tumors in nude mice (Fig. 3B; Additional file 1: Table S1). Tumor growth of cortical NPCs was much slower than that of NE-4C cells (Fig. 3C), even though more cortical NPCs than NE-4C cells were injected (Additional file 1: Table S1). The tumor was dominantly composed of eosinophilic tissues, in particular tissues that contain large bundles of collagen fibers (Fig. 3D). In this background, other tissues could be identified, including blood vessels, undifferentiated neuroepithelial cells, and differentiated nerve tissue (Fig. 3D). A piece of osseous tissue, which was adjacent to muscle tissue, was present in the tumor (Fig. 3D). The tumor displayed a strong activation of genes for mesodermal (Desmin, Myog, Myh1 and Myh3) and endodermal (Afp, Krt20) tissue differentiation (Fig. 3E). Transcriptional upregulation was also observed for the neural stemness genes Pdgfra, Stat3 and Vim (Fig. 3F). Nevertheless, genes for neuronal differentiation (Map2, Neun, Robo2, Tubb3) and more genes for neural stemness (Ascl1, Cdh2, Msi1, Nestin, Notch1, Pax5/6, Sox1, Sox2, Sox9, Zic2, Zeb2) were reduced (Fig. 3F). In tumor sections, expression of Sox1, Sox9, and Map2 was obvious in undifferentiated cells and differentiated neural tissue (Fig. 3G). The smooth muscle marker Acta2 and the osteoblast and osteoclast markers Bglap, Acp5 and Ctsk were detected in restricted areas or scattered cells (Fig. 3G). These data show that NPCs, which are undergoing neuronal differentiation at E13.5 during normal neural development, differentiate into different tissues in a non-native environment.
The general tendency of decreased expression of neural stemness genes was correlated with the weak capability of tumor growth. Thus, like primNSCs and NE-4C cells, NPCs from developing brains also exhibit differentiation potential for non-neural tissues.
As a control, MEFs isolated from E13.5 embryos did not form tumors in nude mice, even in an extended period after injection (Additional file 1: Table S1). Tumorigenicity was not observed for MSCs after 114 days post injection (Additional file 1: Fig. S3 and Table S1). Myoblasts did not cause tumor formation, either (see text below). Therefore, tumorigenicity and differentiation potential are properties of NSCs/NPCs.
Cancer cells display the ability of neurosphere formation and differentiation potential, similar to NSCs/NPCs
Free-floating neurosphere formation in NSC serum-free medium is a feature of NSCs, as shown by primNSCs (Fig. 1A). NE-4C cells grew in a monolayer when cultured in their regular medium. The cells also formed neurospheres in serum-free medium (Fig. 4A). However, MEFs did not form any spherical structures, either in their regular medium or in serum-free medium (Fig. 4A). Cancer cells, such as melanoma cell line A375, colorectal carcinoma cell line HCT 116 or glioblastoma cell line U-118 MG, grew in monolayers in their respective regular media. In serum-free medium, nevertheless, A375 and HCT 116 formed free-floating neurosphere-like structures (Fig. 4A). U-118 MG also formed a spherical structure, but not a free-floating one (Fig. 4A). IF showed the expression of NSC marker Sox1 in NE-4C neurospheres but not in MEFs (Fig. 4B). This marker was detected in the spherical structures formed by cancer cells (Fig. 4B), supporting the similarity between cancer cells and NSCs/NPCs. Many types of tissue stem/progenitor cells form spheres in serum-free media. The media are usually cell-specific and not interchangeable. For example, rat bone marrow mesenchymal stem cells (MSCs) form spheres in their specific medium, but did not in the serum-free medium used in the current study (Fig. 4C). Therefore, sphere formation in NSC serum-free medium indicates neural stemness.
We investigated whether these cancer cells could show differentiation potential. These cells formed tumors in nude mice (Additional file 1: Table S1). Compared with A375 cells, A375 xenograft tumor displayed an upregulation for glial and neuronal genes (GFAP, BDNF, MAP2, MTURN, ROBO2, SYT1 and TUBB3) and neural stemness genes (ASCL1, MSI1, PAX6, PDGFRA, SOX9, VIM, ZIC1/2) (Fig. 5A). Pluripotency genes MYC, OCT4 and SOX2 were also upregulated (Fig. 5A). Actually, these genes are neural specific during embryonic development in Xenopus and mouse [30] besides their earlier expression in blastocysts. Moreover, genes representing mesodermal or/and endodermal tissue differentiation (ACP5, ACTA2, BGLAP, CTSK, DESMIN, FOXA2) were activated in tumor (Fig. 5B). IHC also identified strong expression of neural stemness markers SOX1 and PAX6 in cells rich in nuclei, similar to immature neuroepithelial cells (Fig. 5C). Neuronal marker MAP2 was detected among neuroepithelial-like cells (Fig. 5C). Intense staining was also observed for ACTA2, BGLAP and ACP5 (Fig. 5C). The endodermal marker AFP was detected in scattered cells (Fig. 5C).
Xenograft tumor of HCT 116 showed an increased expression of neuronal genes MAP2 and SYN1, neural stemness genes CDH2, MSI1, NCAM1, SOX2/9, VIM and ZEB2 (Fig. 6A), and genes for mesodermal and endodermal tissues (ACP5, AFP, DESMIN, HNF4A and KRT8) (Fig. 6B). IHC showed more detailed information on tissue types in the tumor. Large areas of intense staining of SOX1, SOX9 and MAP2 (Fig. 6C) suggested that cells with neural stemness and neuronal features were among the major cell types in the tumor. Strong expression of BGLAP was present. Besides, KRT5 and AFP expression was seen in many cells (Fig. 6C). Gene expression comparison between U-118 MG cells and the tumor showed an increased expression of neuronal and neural stemness genes MAP2, NEUN, ROBO2, PAX6, ZIC1, ZEB2, as well as MYC and OCT4 in the tumor (Fig. 6D). Likewise, expression of genes for mesodermal and endodermal tissues, ACTA2, BGLAP, CTSK, DESMIN and KRT8/20, was enhanced (Fig. 6E). In agreement, expression of PAX6, SOX1, SOX9, MAP2, BGLAP, CTSK and ACTA2, which represent undifferentiated and differentiated neural cells and different mesodermal cell types, was observed with IHC (Fig. 6F). We also stained the tumor sections with an antibody specific for human cells. Staining signal was universal in sections of tumors of cancer cells (Additional file 1: Fig. S4A), but not present in the NE-4C xenograft tumor section (Additional file 1: Fig. S4B). Taken together with the fact that transcription was detected specifically for human genes, this indicates that the different cell or tissue types detected in tumors were derived from the injected cancer cells and not from host mouse tissues. Hence, these cancer cells display the property of NSCs/NPCs in their capacity of neurosphere formation and differentiation potential.
Different cancer cells differed in their ability to form neurospheres. Among lung cancer cell line NCI-H460, colorectal cancer cell line SW480, osteosarcoma cell line U-2OS and hepatocellular carcinoma cell line HepG2, NCI-H460 formed neurospheres in serum-free medium, but the other cells did not (Fig. 7A). SW480 showed a change in cell morphology. U-2OS and HepG2 did not even display a change in cell morphology, with only an increase in cell density (Fig. 7A). This difference in neurosphere formation indicates different degrees of neural stemness in these cells. When the same numbers of cells were injected into nude mice, NCI-H460 and SW480 formed tumors (Fig. 7B; Additional file 1: Table S1). Nevertheless, tumors of NCI-H460 grew faster and larger than those of SW480. U-2OS and HepG2 did not form tumors in this assay (Fig. 7B-D). The results might suggest that the degree of neural stemness in cancer cells is proportional to their capacity of tumor initiation. The absence of tumor formation could mean that these cancer cells are not sufficient for tumor initiation but still possess malignant features like fast proliferation, clonogenicity, etc., or that they might exhibit tumorigenicity under a more sensitive condition, such as in more severely immunodeficient mice.
Myod1 knockout in myoblast cells leads to a neural phenotype and confers tumorigenicity in cells
Now that NSCs/NPCs display tumorigenicity and can act as an initial state for non-neural tissue differentiation, we explored whether the process could be reversed in somatic cells. C2C12 is a mouse myoblast cell line used for investigating muscle differentiation. We employed CRISPR/Cas9 to knock out Myod1, which encodes a transcription factor critical for myogenesis, in C2C12 cells. After lentiviral transfection of the vector containing the sgRNA, which corresponds to 338-357 bp in the first exon of the Myod1 gene (Additional file 1: Fig. S5A), cells were selected with puromycin. After serial passaging and genotyping, we identified a clone showing homozygous deletion in the Myod1 gene. PCR with a pair of primers F0/R0 generated a correct product (550 bp) from the wild-type gene (Additional file 1: Fig. S5A and B), but generated a shorter product in the selected clone (Additional file 1: Fig. S5B). A pair of nested primers that should generate a product of 323 bp in wild-type cells (Additional file 1: Fig. S5A) produced no product from the genomic DNA of the selected clone (Additional file 1: Fig. S5C). Sequencing showed a 209 bp deletion in the Myod1 gene, ranging from 134-342 bp downstream of the start codon, with the sgRNA target sequence located at the beginning of the missing sequence (Additional file 1: Fig. S5D). This loss caused a frameshift and a premature translational termination, leading to a conceptual translation of a peptide of 56 aa, with the first 45 aa being homologous to Myod1 (Additional file 1: Fig. S5E). This region contained no functional domains or motifs, suggesting that the peptide is nonfunctional. RT-PCR detection of transcription with primer pair F0/R0 revealed normal expression of Myod1 in wild-type cells, whereas no significant transcription was detected in knockout cells (Additional file 1: Fig. S5F). Correspondingly, Myod1 was detected in nuclei of wild-type cells but not in knockout cells (Additional file 1: Fig. S5G).
These results confirmed a successful knockout in C2C12 cells. The wild-type cells are hereafter termed C2C12WT, and the knockout cells C2C12Myod1-/-. Knockout caused a change in cell morphology. C2C12Myod1-/- cells grew long processes, which were not present in C2C12WT (Additional file 1: Fig. S5H).
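The deletion coordinates reported above can be checked with simple arithmetic. The following is a minimal sketch, not the authors' analysis: it assumes the coordinates 134-342 are 1-based and inclusive, and that both F0/R0 primers lie outside the deleted region (consistent with the shorter PCR product). The function name `deletion_summary` is hypothetical.

```python
def deletion_summary(start, end, wt_amplicon_bp):
    """Summarise a deletion given 1-based, inclusive coordinates (assumption)."""
    length = end - start + 1
    return {
        "deleted_bp": length,
        # A deletion whose length is not a multiple of 3 shifts the reading frame.
        "frameshift": length % 3 != 0,
        # Expected PCR product size if both primers flank the deletion (assumption).
        "expected_ko_amplicon_bp": wt_amplicon_bp - length,
    }


# Values taken from the text: deletion at 134-342 bp, 550 bp wild-type product.
summary = deletion_summary(start=134, end=342, wt_amplicon_bp=550)
print(summary)
```

Under these assumptions, the coordinates give a 209 bp deletion whose length is not divisible by three, in line with the reported frameshift.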
C2C12WT cells underwent muscle differentiation, as shown by myotube formation in medium with low concentration of serum, typically 2% horse serum; whereas C2C12Myod1-/- cells did not (Fig. 8A). The myotubes displayed staining of muscle-specific markers, Myoglobin and Myosin heavy chain (Mhc), which were absent in C2C12Myod1-/- cells (Fig. 8B). Therefore, loss of Myod1 led to the loss of muscle differentiation potential. The cells exhibited the ability of muscle differentiation again after introduction of a plasmid coding for MYOD1-eGFP fusion protein, as shown by expression of Mhc and Myoglobin (Additional file 1: Fig. S6A), suggesting a successful rescue by compensating the MYOD1 DNA in KO cells. When cultured in serum-free medium, C2C12WT cells still formed finer myotube-like structures (Fig. 8C). Intriguingly, C2C12Myod1-/- cells formed floating spherical structures, resembling neurospheres (Fig. 8C). IF detection of Mhc confirmed that C2C12WT cells indeed formed myotubes (Fig. 8D). But Mhc was not present in the spherical structures (Fig. 8D). Sox1 was not detected in myotube-like structures, but present in spherical structures of C2C12Myod1-/- cells (Fig. 8D), an indication of NSCs/NPCs. Immunoblotting (IB) revealed an upregulation of additional neural stemness markers, Fgfr1, Hes1, Msi1, and Sox2/9, the cell cycle protein Pcna and the neural crest regulator Myc in C2C12Myod1-/- cells (Fig. 8E). A transcriptome assay showed extensive transcriptional reprogramming after Myod1 knockout (Fig. 8F), with 1801 genes being downregulated and 1400 genes upregulated (Additional file 8: Table S10). Gene ontology (GO) enrichment analysis indicated that the differentially expressed genes (DEGs) were mostly associated with muscle structure development, followed by other GO terms that are associated with muscle cells and their physiological functions (Fig. 8G). This agrees with the phenotypic change and marker expression assays. 
Moreover, the largest number of DEGs are neural related genes (sensory plus nervous system) according to KEGG gene classification (Fig. 8H). Interestingly, the human disease corresponding with the biggest number of DEGs is cancer (cancers in general plus cancers of specific types) (Fig. 8H), suggesting a correlation between the neural phenotype and tumorigenicity in knockout cells.
Treatment with retinoic acid (RA), a reagent inducing neuronal differentiation of NSCs/NPCs, did not alter the myotube phenotype (Additional file 1: Fig. S6B), but the disaggregated neurosphere cells underwent neuron-like differentiation (Additional file 1: Fig. S6B). Mhc was detected in myotubes but not in cells with neuronal morphology (Additional file 1: Fig. S6C). In contrast, neuronal markers Map2 and Synapsin-1 were not detected in RA-treated myotubes, but detected in RA-treated C2C12Myod1-/- cells (Additional file 1: Fig. S6D and E). The results confirmed neuronal differentiation potential of C2C12Myod1-/- cells.
Next we explored whether the gain of NSC/NPC phenotype would mean the gain of tumorigenicity in C2C12Myod1-/- cells. In soft agar, C2C12WT cells did not form colonies, whereas C2C12Myod1-/- cells did (Fig. 9A and B). Moreover, C2C12Myod1-/- cells showed stronger abilities in invasion and migration than C2C12WT cells (Fig. 9C). These data indicated the gain of malignant features in C2C12Myod1-/- cells. Importantly, C2C12Myod1-/- cells formed tumors in nude mice, but C2C12WT cells did not (Fig. 9D-F; Additional file 1: Table S1). We examined whether there was cell/tissue differentiation in the tumors. RT-qPCR showed a higher expression level of neuronal genes (Map2, Mturn, Neun, Robo2, Synj1) and neural stemness genes (Ascl1, Msi1, Pdgfra, Stat3, Zic2, Zeb2, Sox2, Notch1) in tumors than in C2C12Myod1-/- cells (Fig. 9G). Moreover, genes representing endodermal and epidermal tissue differentiation (Krt8, Krt20, Sox17 and Cdh1) and genes for mesodermal differentiation (Desmin, Kdr, Myog, Myh1/3/4) were upregulated in tumors (Fig. 9H). IHC revealed expression of Sox1, Sox9, Map2, Acta2, Bglap and Afp in sections, indicating the presence of various types of neural and non-neural tissues or cells in the tumor (Fig. 9I). There was spindle-cell rhabdomyosarcoma (RMS)-like tissue that contained elongated spindle-shaped cell nuclei, for example, the cells in HE-stained sections corresponding with Map2 or Acta2 expression (Fig. 9I). This feature is characteristic of spindle cell RMS, which shows a recurrent MYOD1 p.L122R mutation [31, 32]. This mutation blocks the wild-type MYOD1 function [31], causing a loss-of-function effect that mimics Myod1 knockout. Indeed, MYOD1 has been reported to be a tumor suppressor. In summary, loss of Myod1 in myoblast cells leads to the loss of their identity and the gain of the properties of NSCs/NPCs, tumorigenicity and the potential for re-differentiation.
Neural stemness is required for tumorigenicity
The evidence above suggests that the property of neural stemness contributes to cell tumorigenicity. We next tested whether tumorigenic cells changed their tumorigenicity when their neural stemness was reduced. NE-4C cells treated with RA at 1 mM for 72 h showed a strong neuronal phenotype (Fig. 10A). The treatment caused a significant repression of expression of neural stemness proteins or proteins enriched in embryonic neural cells, Msi1, Sox1/2/9, Hes1, Myc and Fgfr1, which all play promoting roles during tumorigenesis, and upregulation of neuronal proteins Syn1 and Map2 (Fig. 10B), indicating neuronal differentiation and inhibition of neural stemness in NE-4C cells. The differentiated cells showed a dramatic decrease in the ability of colony formation in soft agar (Fig. 10C, D) and compromised tumorigenicity in nude mice (Fig. 10E-G; Additional file 1: Table S1). Similar experiments were carried out for C2C12Myod1-/- cells. The cells assumed a neuronal phenotype after treatment with RA (Fig. 10I), and showed a decreased expression of Msi1, Sox1/2/9, Hes1, Myc and Fgfr1 and an increased expression of Syn1 and Map2 (Fig. 10J). This means that neuronal differentiation occurred in the cells, accompanied by a reduction of the neural stemness property. Likewise, the treated cells displayed decreased ability in colony formation (Fig. 10K, L) and could not form tumors in nude mice, in contrast to vehicle-treated cells (Fig. 10M-O; Additional file 1: Table S1). In addition, we examined the effect of re-introduction of MYOD1 on tumorigenicity of the C2C12Myod1-/- cells. As shown in Additional file 1: Fig. S6A, re-introduction of MYOD1 successfully rescued the KO cells, leading to a muscle cell phenotype in the cells. The rescued cells also demonstrated a reduced capability in colony formation (Additional file 1: Fig. S7B and C) and did not form tumors, in contrast to the control KO cells (Additional file 1: Fig. S7E-G and Table S1).
These results show that loss of neural stemness via differentiation causes the repression of cell tumorigenicity, indicating that neural stemness is required for tumorigenicity.
Conservation of neural genes and their association with cancer
Since neural stemness and tumorigenicity are cell properties that are difficult to interpret in terms of individual molecular events, e.g. a gene or signaling pathway, we analyzed neural and non-neural genes in general and their association with these cell properties, to explain why neural stemness can contribute to tumorigenicity and differentiation potential. During evolution, ectoderm originated the earliest, followed by endoderm and mesoderm. Moreover, the founders of neural genes increased abruptly first in the last common ancestor of eukaryotes and then at the time of emergence of eumetazoa [33]. We asked whether there was a bias for neural gene evolution during the phase from the unicellular ancestor to metazoan. Three organisms were used for this analysis: M. brevicollis, A. queenslandica and T. adhaerens. M. brevicollis is a species of choanoflagellate protists, the closest single-celled relatives of metazoans [34]. A. queenslandica represents the oldest surviving metazoan [35] and an evolutionary intermediary between unicellular choanoflagellate protists and eumetazoans, and T. adhaerens is the basal eumetazoan species [35, 36]. These organisms shared a last common unicellular ancestor more than 600 million years ago [34]. By searching tissue expression patterns of genes in typical vertebrate animal models, and comparing protein sequence homology between mammals and these three lower species (Fig. 11A), we found that, among a total of 5283 genes with neural specific expression or at least expression enriched in neural tissues (collectively termed ‘neural genes’) in Xenopus, zebrafish, mouse or human (Additional file 2: Table S4), founder genes were found for 3608 genes (68.3%) in any of the three organisms (Fig. 11B; Additional file 2: Table S4). Of the neural genes, 2088 (39.5%) are present in M. brevicollis (Fig. 11B; Additional file 2: Table S4).
Therefore, most genes for neural cell formation had already originated at the starting stage of multicellularity. We also analyzed 738 genes that function in human neurons according to Gene Ontology, but whose expression is not specific to or enriched in neural tissues (collectively termed ‘non-specific genes’) (Additional file 3: Table S5). Only 235 (31.8%) genes are present in any of the three organisms, and 86 (11.6%) genes in M. brevicollis. No significant homologs for the remaining genes (68.2%) were identified in these organisms (Fig. 11B; Additional file 3: Table S5). Thus, the genes not specific for neural cells emerged later than neural specific genes.
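The proportions quoted above follow directly from the reported gene counts. As a minimal sketch (not the authors' pipeline), the fractions can be recomputed as below; the dictionary layout and the helper name `percent` are illustrative assumptions, and values may differ from the reported ones in the last digit depending on rounding.

```python
def percent(part, total):
    """Return part/total as a percentage rounded to one decimal place."""
    return round(100 * part / total, 1)


# Counts taken from the text; the key names are illustrative only.
counts = {
    "neural genes": {"total": 5283, "any of three": 3608, "M. brevicollis": 2088},
    "non-specific genes": {"total": 738, "any of three": 235, "M. brevicollis": 86},
}

# Percentage of each gene set with a founder gene in the lower organisms.
table = {
    gene_set: {
        organism: percent(n, c["total"])
        for organism, n in c.items()
        if organism != "total"
    }
    for gene_set, c in counts.items()
}

for gene_set, row in table.items():
    print(gene_set, row)
```

Laying the two gene sets side by side makes the contrast explicit: roughly two thirds of neural genes have founders in at least one of the three organisms, against roughly one third of non-specific genes.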
Neural genes are mostly involved in biological processes for neural development/differentiation and associated with cellular components of neural cells (Fig. 11C). They are involved principally in signal transduction, axon guidance, developmental biology and pathways in cancers (Fig. 11D). These genes are mostly involved, of course, in nervous system diseases, followed by congenital disorders and then by cancers in general and cancers of specific tissues/organs (Fig. 11E). These analyses support an intrinsic link between neural state, embryonic development and cancer.
The 738 non-specific genes are primarily associated with neural development and neuron parts according to GO (Fig. 11F) and with signal transduction, neuronal system, axon guidance and developmental biology according to pathway terms (Fig. 11G). They are linked with nervous system diseases and other non-cancer diseases (Fig. 11H). As an additional comparison, we analyzed 7238 genes that are expressed only in non-neural tissues or at least not mainly in neural tissues, or for which the expression patterns are not clear (Additional file 6: Table S8) (collectively termed ‘non-neural genes’). The GO and pathway terms enriched for these genes are entirely different from those for neural genes (Additional file 1: Fig. S8A and B). These genes are associated predominantly with non-neural, non-cancer and non-congenital diseases (Additional file 1: Fig. S8C).
Long genes are overrepresented among neuronal genes [37, 38]. Long genes are also enriched in cancer pathways [29]. We thus asked whether there was a length bias in the three sets of genes above. A brief calculation showed an average length of 92765 nucleotides (nt) for 11270 transcript variants of 5283 neural genes, an average length of 90154 nt for 1688 transcripts of 738 genes with both neural and non-neural expression, and 42911 nt for 12874 transcripts of 7238 non-neural genes (Additional file 1: Fig. S9; Additional file 7: Table S9). Additionally, neural genes have more exons/introns than non-neural genes (Additional file 7: Table S9). Therefore, genes involved in neural development are overall much longer and more complex in gene structure than genes in non-neural cells. In summary, most genes for neural development and differentiation had emerged in the closest species representing the evolution from unicellular organisms to metazoan. Neural genes are tightly connected with cancer and rich in longer genes, whereas non-neural genes are the opposite.
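The length bias can be expressed as a fold difference relative to non-neural genes. The sketch below is only an illustration built on the averages and counts reported in the text; the variable names are hypothetical, and no per-gene data from Additional file 7 are used.

```python
# Reported mean lengths (nt) and transcript/gene counts for the three gene sets.
gene_sets = {
    "neural": {"mean_nt": 92765, "transcripts": 11270, "genes": 5283},
    "non-specific": {"mean_nt": 90154, "transcripts": 1688, "genes": 738},
    "non-neural": {"mean_nt": 42911, "transcripts": 12874, "genes": 7238},
}

reference = gene_sets["non-neural"]["mean_nt"]

# Fold difference in mean length relative to non-neural genes.
fold_vs_non_neural = {
    name: s["mean_nt"] / reference for name, s in gene_sets.items()
}

# Average number of transcript variants per gene in each set.
variants_per_gene = {
    name: s["transcripts"] / s["genes"] for name, s in gene_sets.items()
}

for name in gene_sets:
    print(
        f"{name}: {fold_vs_non_neural[name]:.2f}x length, "
        f"{variants_per_gene[name]:.2f} transcripts per gene"
    )
```

On these reported averages, both the neural and the non-specific gene sets come out at a little over twice the mean length of the non-neural set.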
A neural bias of genes in M. brevicollis
The presence of many neural genes in M. brevicollis led us to analyze whether M. brevicollis genes homologous to mammalian genes show an expression bias in vertebrates. Among 9275 proteins (Additional file 5: Table S7) encoded by about 9200 genes [34] that were compared with Xenopus, mouse, and/or human proteins (Fig. 12A), 4700 (50.7%) protein sequences in M. brevicollis share significant similarity with those in mouse and/or human, indicating a close relationship between M. brevicollis and higher animals (Fig. 12B; Additional file 5: Table S7). Among the 4700 mammalian homologous proteins, the mammalian homologous genes for 2985 proteins, accounting for 63.5%, are neural specific or at least enriched in neural tissues during embryogenesis, whereas spatial expression patterns of the remaining homologous genes are not clear or they are not predominantly expressed in neural tissues (Fig. 12B; Additional file 5: Table S7). Hence, the founder genes in the unicellular organism closest to metazoan might have been biased towards a neural state. This suggests that multicellularity originated from a unicellular neural-biased state.
The 4700 mammalian homologous genes are mainly associated with catalytic activity, subcellular structures and nucleoside/nucleotide/small molecule binding (Fig. 12C). Correspondingly, these genes are correlated with metabolism/metabolic pathways, gene expression, rRNA processing, etc. (Fig. 12D). Logically, these genes are not related with features typical of multicellular animals. For diseases, they are associated with congenital disorders of metabolism and its subcategories (Fig. 12E), in agreement with their functional link with metabolism. The genes are also connected with congenital disorders of development and nervous system diseases. Thus, these genes are mostly related with regulation of cell metabolism, embryonic development and nervous system. There are seven pathway terms (as indicated with circles in Fig. 12D) for M. brevicollis genes that are shared with those for neural genes (Fig. 11D). But almost no terms are in common with those for non-specific genes (Fig. 11G) and non-neural genes (Additional file 1: Fig. S8B), an additional indication that mammalian homologous genes in M. brevicollis are related to neural genes but not the genes for other cell types.
The 2985 mammalian homologous genes with specifically or dominantly neural expression are also mainly associated with subcellular structures, nucleoside/nucleotide/RNA/small molecule binding and catalytic activity (Fig. 12F), and with metabolism, gene expression, rRNA processing, protein translation/metabolism, cell cycle, axon guidance, etc. (Fig. 12G). This subset of neural genes is mostly associated with various congenital disorders, especially the congenital disorders of metabolism, followed by nervous system diseases and cancers (Fig. 12H), implying an involvement of these genes in development, neural development and cancer. The mammalian homologous genes in M. brevicollis are mainly correlated with metabolism, indicating a close relationship in the regulatory network for metabolism between higher animals and their unicellular ancestor.