In this study, we identified a region significantly associated with CSW through artificial selection analysis and association analysis. Sequence and expression profile analyses suggested that the genes within this region function together. The expression levels of BMgn002067 and BMgn002071 differed significantly between high-yield and low-yield strains, while mutations in the exons of BMgn002066 and BMgn002073 led to changes in the encoded protein domains.
Among the genes with domain mutations, BMgn002066 encodes adenylate and guanylate cyclase, an important signal transduction enzyme. Once activated, it can not only activate the NO-sGC-cGMP signaling pathway but also inhibit the TGF-β signaling pathway [20-23]. cGMP is an important second messenger in vivo and regulates downstream effector molecules such as protein kinase G, cGMP-dependent phosphodiesterases, and cGMP-gated ion channels, which participate in a series of physiological or pathological processes, including vasodilation, inhibition of platelet aggregation, and inhibition of cell proliferation [21]. Inhibition of the TGF-β signaling pathway can in turn suppress tissue fibrosis and cell proliferation. Research on rectal and thyroid cancer has found that adenylate and guanylate cyclase plays an important role in cancer cell proliferation [21]. Because the silk gland undergoes cell division during the embryonic period and the number of silk gland cells is significantly related to silk yield, we speculate that BMgn002066 affects the development of the silk gland by affecting cell proliferation. BMgn002073 encodes an SLC4-like anion exchanger. SLC4 gene products play an important role in CO₂ transport in red blood cells, in H⁺ and HCO₃⁻ absorption or secretion by various epithelial cells, and in regulating cell volume and intracellular pH [24-26]. Studies have shown that AE2, a member of the SLC4 family, can increase cell volume through Cl⁻ uptake [26]. In the silkworm, silk gland cells enlarge during the larval stage, after silk gland cell division has been completed in the embryo. Given the role of SLC4 in regulating cell volume, we speculate that this transporter may affect silk gland development by regulating cell volume.
The genes BMgn002067 and BMgn002071 encode α-tubulin N-acetyltransferase 1 (ATAT1) and 5-formyltetrahydrofolate cyclo-ligase (MTHFs), respectively, and their expression levels differed significantly between high- and low-yield strains. ATAT1 catalyzes the acetylation of tubulin and is concentrated in the nucleus from G1 to G2 phase. In late mitosis, ATAT1 co-localizes with chromatids and spindles and eventually migrates to daughter nuclei, newly synthesized centrioles, and midbodies [27]. This cell cycle-specific distribution suggests that ATAT1 has multiple functions, including microtubule acetylation, RNA transcription activity, microtubule severing, and completion of cytokinesis [28-33]. In a study of the function of ATAT1 in cell division, knocking out ATAT1 in Tetrahymena slowed its growth rate [34]. In the silkworm, silk gland development is closely related to cell division. Investigations have shown that high-yield strains have more silk gland cells than low-yield strains, and that the number of silk gland cells does not change once silk gland cell division is completed in the embryonic stage. Functional studies of ATAT1 in the regulation of cell division therefore suggest that this gene may regulate silk gland development by affecting embryonic mitosis. MTHFs are important enzymes in the folate metabolism pathway and are involved in the metabolism of folate, purines, vitamins, and cofactors. MTHFs regulate carbon flow through the folate-dependent one-carbon metabolic network [35, 36]. This network provides carbon for the biosynthesis of purines, thymine, and amino acids and influences DNA synthesis in vivo, thereby influencing cell division and maturation [37-39].
MTHFs have been demonstrated to inhibit cell growth in human MCF-7 breast cancer cells, and work in mice has confirmed that MTHFs are an essential part of the purinosome and provide the conditions for cancer cells to rapidly synthesize purine nucleotides [40, 41]. These studies show that MTHFs are closely related to cell division. In the silkworm, the proliferation and growth of silk gland cells directly affect silk gland development. Given the function of MTHFs in cell growth and division in other species, genes that regulate silk yield are very likely to participate in energy metabolism and DNA synthesis. It is therefore possible that MTHFs influence the energy supply for silk gland development, provide carbon for DNA synthesis, and affect the division and maturation of silk gland cells, thereby affecting CSW.
In this study, multiple genes within a single region were shown to coordinately regulate the same trait. Generally, there are two types of gene clustering. In the first and most common type, members of a gene family are clustered in a region. In the second type, genes with similar functions are clustered together, although they do not necessarily belong to the same gene family [42]. Gene clustering has been studied in other species, for example, the conserved supergene loci affecting butterfly diversity [43], the supergene loci affecting butterfly mimicry [44], and a group of supergenes influencing the shape of the neck plumage of male water birds [45]. Studies have shown that gene clustering is an effective mode of gene regulation in biological evolution. Although some clustered genes belong to different gene families, they consistently show similar expression profiles. This may be due to shared regulation of the initiation of gene expression or to structural changes of the chromosome. When more than one gene lies within the same open region, the genes can be activated and transcribed at the same time, which greatly improves the efficiency of transcription. Pooled sequencing studies in the silkworm have also found that most of the genes related to silk gland development are clustered in the genome and have similar expression profiles [46]. The genes identified here, encoding ATAT1, MTHFs, adenylate and guanylate cyclase, and the SLC4-like anion exchanger, cluster together with BmAbl1, a silk yield-regulating gene detected in previous studies, and their expression profiles are highly similar. Based on this, we speculate that they may share the same enhancer, and that regulating the expression of the whole cluster at the same time can greatly improve the efficiency of multi-gene action.
The combined effects of these genes may jointly regulate cell division, protein synthesis, and energy supply, and thereby regulate silk yield traits more efficiently. This finding implies that regulation by gene clusters applies not only to morphological traits but also to quantitative traits, which provides a new reference for the study of quantitative traits.
The results of the domestication analysis in this study were unexpected. In the selection pressure analysis, we found that the gene cluster region was under stronger selection during domestication than during breeding. This was surprising because CSW increased only from 0.06 g to about 0.15 g from wild to local strains, whereas the CSW of improved strains reached up to 0.4 g through breeding [12]. Because silk yield improved so greatly during breeding, researchers had speculated that the breeding process entailed stronger selection on silk yield-related genes. In addition, a principal component analysis (PCA) of the silk gland transcriptome in a previous study showed that most of the differentially expressed genes differed little in their principal components between wild and local strains, while the difference between local and improved strains was greater [10]. The PCA results supported this speculation. In this study, however, the loci of the gene cluster were more strongly selected during domestication. A search of the literature revealed a similar pattern in a study of cotton yield: the genetic divergence (Fst) between local and improved strains was only 0.04, while the Fst between wild and local strains was as high as 0.10. This suggests that breeding has had little effect on the genetic diversity of cotton, while domestication greatly reduced it [47]. We therefore hypothesize that our result arose because some important yield-regulating genes were selected and fixed early, during domestication, to ensure a base yield.
This study has identified the first gene cluster region mapped by forward genetics, providing evidence that gene clusters regulate silk yield. However, the mode of regulation of the gene cluster in this region and the mechanism by which each gene regulates silk yield require further study.