3.1. Construction of weighted co-expression network
After the data preprocessing, a total of 5414 genes were selected for WGCNA(Supplementary Figure S1). First, it’s an essential step to choose the appropriate soft thresholding power to ensure a scale-free network. By the network topology analysis function, the power value 6 was selected (Supplementary Figure S2) and used to construct the co-expression module. Seventeen distinct gene co-expression modules were constructed and shown in different colors (Fig. 2A). The number of genes in the 17 modules was shown in Supplementary Table 1. Figure 2B demonstrated the topological overlap matrix (TOM) of the 5,414 genes, indicating that each module and the genes expression in each module were relatively independent. Furthermore, we also plotted the clustering dendrogram, according to the module correlation and the heatmap according to adjacencies (Fig. 2C), meaning that those modules were largely divided into two clusters.
3.2. Construction of Consensus Weighted Co-expression Network
Since the overall connectivity index generally drops sharply with the increase of the soft-thresholding power, it is advantageous to select the lowest power that meets the approximate scale-free topology standard. Supplementary Figure S3 shows that approximate scale-free topology is obtained around the soft-thresholding power of 6 for both sets.
Nineteen different gene co-expression modules were constructed, shown in different colors in SSS 1C, and related to external microarray sample information, sJIA patients and healthy people. In each of the two sets, consensus module eigengenes would be related to the traits, respectively. To summarizes the two sets into one measure of module-trait relationships, we took the correlation that had the lower absolute value in the two sets if the two correlations had the same sign, and zero relationship if the two correlations had opposite signs (Fig. 3A). We checked these genes in the modules related to clinical features, which were basically in line with our previous results
Then we compared the correspondence among individual dataset modules, merged dataset modules, and consensus modules. Figure 3B1-B6 demonstrates the two data sets are indeed very similar. The preservation heatmap and barplots indicate that most relationships are very highly preserved and the overall preservation of the two eigengene networks is 0.89. Figure 3B7 showed the number of genes overlapping between the merged dataset modules (our previous method) and consensus modules was very high, and the hub genes obtained in the previous WGCNA modules were all in overlapping genes.
3.3. Identification of Co-expression Modules related to Normal or sJIA
The module-trait correlation coefficients in Fig. 2D illuminated that the red module and the green-yellow module revealed a high correlation with disease status. The red module was best positively correlated with the sJIA-related module (r = 0.8, p = 3e − 29), while the green-yellow module was negatively related to the sJIA (r = 0.62, p = 1e − 14). The scatterplots in Fig. 4A and 4B also showed that the gene significance (GS) and module membership (MM) value were of high correlation in the red module (cor = 0.85, p = 8.8e − 86) and the green-yellow module (cor = -0.59, p = 5.3e − 16). The results suggested that the genes in the two modules were probably related to the disease status, which were suitable for further analyses and mining of the hub gene.
3.4. Function enrichment analysis
Function enrichment analysis conducted by DAVID was performed on the genes in the two constructed modules. There was a significant difference in the biological processes of genes in the sJIA and non-sJIA modules. The detailed information was displayed in Fig. 4G and 4H.
For the red module, GO biological process (BP) annotation showed that the gene products mainly enriched in activation of immune response, infection, nucleosome and erythrocyte. As to GO molecular function (MF) annotation, protein heterodimerization and oxygen transporter were the most enriched terms. And enriched GO-CC terms were mainly involved in extracellular exosome, nucleosome, hemoglobin complex, and extracellular space. The results of KEGG enrichment showed the module was similar to Systemic lupus erythematosus (gene count = 11, p = 3.1e − 5). For the green-yellow module, the GO-BP annotation was mainly enriched in the immune response and inflammation. And receptor activity was the top enriched GO-MF terms, with plasma membrane enriched in GO-CC terms. Similarly, the KEGG terms were mainly about “Antigen processing and presentation” and “Natural killer cell mediated cytotoxicity” (Fig. 4C)..
We also used the packages anRichment and anRichmentMethods to do GO enrichment analysis in the whole modules and select the top GO term in each module to draw a bar graph (Fig. 5A& Fig. 5F). We further analyzed the functional enrichment of genes in several other relatively important modules: yellow, salmon, purple, cyan. As the Fig. 5B-E showed, the cyan module is mainly related to response to external stimulus, the purple module is mostly about the function of platelet alpha granule, involving some pathways such as wound healing, coagulation, and hemostasis etc, the salmon module may play an important role in the cell cycle process, and the yellow module is associated to transcriptional regulation.
3.5. Identification of Hub Genes
The 30 top-ranked hub genes in the two modules were shown in Cytoscape (Fig. 4D and 4E). By enrichment analysis shown in Fig. 4F, the hub genes from the red module were largely related to erythrocyte differentiation (ALAS2, AHSP, KLF1, TRIM10, TRIM58), and the hub genes from the green-yellow module were largely involved to immune responses, exemplified by genes such as KLRB1, KLRF1, CD160, KIR2DL1, KIR2DL2, KIR2DL3, KIR3DL1, SH2D1B, GZMA, TGFBR3.
Only one GWA study was previously conducted in sJIA[22], and we obtain the disease susceptibility genes from this manuscript and the GWAS catalogue. In order to further compare the hub genes with the GWAS genes, we adopted the PPI protein network analysis and functional enrichment. The Fig. 6A showed, the sJIA-susceptible genes (HLA-DRA, TRIM58, LDB2, TAPT1) may be related to the hub genes from the red module, with TRIM58 was also a hub gene of the red module, and those (LGMN, JPH3) may be related to the hub genes from the greenyellow module. The Fig. 6B-C demonstrates that the comparative analyses of the functional enrichment, such as tissue morphogenesis related to GWAS genes and the hub gene s of the red module, endothelial cell migration associated to GWAS genes and the hub genes of the greenyellow module.