3.1 Data acquisition and differential analysis
Clinical information, mutation data and the gene expression dataset (HTSeq-FPKM) for LGG were obtained from TCGA. To get an an external validation set, we also downloaded datasets and clinical data for mRNAseq_693 and mRNAseq_325 from the CGGA website. For the acquisition of m6A-regulated genes we drew on prior publication[71]. Immune-related genes were extracted from the GSEA ( Gene Set Enrichment Analysis ) website. All samples were obtained from tissues obtained by clinical surgical excision from primary glioma patients. All data have been pre-processed by the sva and limma packages.
To obtain m6A-associated immune genes, we calculated correlations between immune genes and m6A-regulated genes and further selected the resulting results, which calculated a |correlation coefficient| > 0.4 (p-value < 0.05). The Kaplan-Meier method and univariate Cox analysis were then used to complete the survival analysis manipulation of the m6A regulated genes, which in turn led to the desired hazard ratio (HR) value as well as its P value. A package(the igraph package), with a correlation coefficient greater than 0.8, was used to plot the co-expression network between these two. Then to obtain a heat map that expresses the differences between the tumour and normal groups, we used the limma package, as others have done, to identify m6A-related immune genes(false discovery rate [FDR] < 0.05 and |log fold change [FC]| > 0.5).
3.2 Prognostic modelling, mutation analysis and dynamic nomograms
We divide all the samples of LGG in the TCGA database into two groups, the training group and the internal validation group, while looking for the corresponding external validation group in the CGGA database. To obtain statistically significant m6A-related immune genes for prognosis, we analysed the training group correlation data using multivariate Cox stepwise regression analysis, univariate Cox regression and the LASSO algorithm. In addition to this, highly correlated genes were excluded in order to optimise the prognostic model. In order to distinguish between the high-risk and low-risk groups, we found the median risk scores of the training group and divided them accordingly, and next we plotted receiver operating characteristic (ROC) curves and survival curves that test the predictive power of the model.To determine whether the riskScore obtained from the model could be used as an independent prognostic factor, we conducted further univariate and multifactorial prognostic analyses on the training group (p<0.05).
After constructing the prognostic model, we then visualised and analysed the data from the maftools package by producing mutation waterfall plots. We next analysed the effect of mutational load on survival in both groups. In order to make our prognostic model more clinically useful, we have downloaded the DynNom software package on the web and created the corresponding web-based dynamic histogram application, which will help us to determine the prognosis of our patients more accurately.
3.3 ssGSEA analysis and cluster typing
The samples from both TCGA and CGGA databases were analysed by ssGSEA using t he GSVA software package to find the number of immune cells contained in each sample. To distinguish between high and low expression of m6A-regulated genes, we performed a cluster analysis of the obtained ones separately using the ConsensusClusterPlus package. We also plotted the violin, heat map and survival curve of the tumour microenvironment for the above three groups in order to find the correlation between m6A and immunity.
We then performed a cluster analysis of m6A-associated immune genes affecting prognosis (P<0.05, top 50), using the same method, to obtain groups with high, medium and low expression of immune genes. We then also plotted violin plots, heat maps and survival curves of the tumour microenvironment capable of expressing correlations between m6A-related immune genes and immunity. Finally, we mapped multigene enrichment curves by using the org.Hs.eg.db R package. Our results benefit from a significant enrichment of five Gene Ontology (GO) terms in the immune gene high expression group (compared to the immune gene low expression group) and the Kyoto Encyclopedia of Genomes (KEGG) pathway (p < 0.05).
3.4 m6Ascore calculation and Sankey plotting
We used the limma package to locate differentially expressed genes in the low, medium and high expression groups of m6A-regulated genes, and then took their intersections using the VennDiagram package to obtain the final list of differential genes. Prognosis-related genes were then screened again by the one-way COX method (p<0.05). After obtaining information on the expression of prognosis-related genes, we calculated the m6Ascore for each sample using the PCA method. The resulting m6Ascore is then sorted into high and low groups using the surv_cutpoint function, and the corresponding survival curves are plotted.
We plotted Sankey diagrams to visualize more the relationship between immune gene type, immune gene type, m6Ascore and riskScore.Correlation matrices for m6Ascore and immune cells were also generated using the corrplot package.Finally, we determined the correlation between them.
3.5 Comprehensive analysis for model genes
Further, we have also collated correlations between individual model genes and parameters such as stem cell indices, immune subtypes, tumour microenvironment and clinical stage. In addition, to validate the prognostic role of the model genes, the Gene Expression Profile Interaction Analysis (GEPIA) database was selected for analysis. Finally, to validate the accuracy of the prognostic model, we used a 5-year ROC curve, which enables comparison of risk values, tumour inflammatory characteristics (TIS) scores and TIDE, and also assesses the prognostic predictive efficacy between them.
3.6 Cell Lines and siRNA Transfection
The human astrocyte cell line HA1800 and the glioma cell lines U87, U118 and U251 are all from the regular source - the American Type Culture Collection (ATCC). The culture temperature we set up was 37°C and the concentration of CO2 was 5%. Our culture conditions were in Dulbecco's modified Eagle medium (DMEM) (HyClone, USA) with 10% fetal bovine serum (Gibco, USA). Our siRNA (100nM) was purchased from RIBO Biotechnology and cells were transfected with it after 24-48 hours of incubation using InvitroRNATM transfection reagent (InvivoGene Biotechnology, China).
3.7 qRT-PCR
We used siRNA-FBXO4 and negative control (NC) siRNA to compare with each other, followed by RNA extraction of four cell lines, HA1800, U87, U118 and U251. After reversal, cDNA was obtained and subjected to real-time PCR analysis. The primers involved in the manipulation were as follows:
FBXO4-Forward: GTCTTCGGGAAGACCATTGTTGG;
FBXO4-Reverse: GAGTTTCAGCCTCTGTATCCTGG;
β-actin-Forward: GACCACAGTCCATGCCATCA;
β-actin-Reverse: GTCAAAGGTGGAGGAGTGGG.
3.8 Cell proliferation
CCK-8 can be used to detect the proliferation of cells. We cultured U87 and U251 cells transfected with siRNA FBXO4 for 48 hours in a 96-well plate. In 96-well plates, 2000 cells were cultured in each well with 100 μl DMEM containing 10% FBS. We examined the value-added intensity of the treated cells every 24 hours and obtained data for 5 days. As in the instructions, we added equal amounts of CCK8 reagent (Yeasen, Shanghai, China) to each well of the plate and after 1 hour obtained absorbance values at 450 nm using a microplate reader (BioTek, USA).
3.9 Migration assay
The Transwell assay was used to examine cell migration. For the Transwell assay, we inoculated 40,000 cells into the upper chamber of a Matrigel transwell plate (BD Biosciences, USA). After 24 hours of incubation, the inserts were cleaned with 1× phosphate buffer solution and the cells were fixed in 95% ethanol and finally stained with 1% crystal violet for 20 minutes at room temperature. Finally, we photographed the perforated cells in their microscopic state and then used imageJ to calculate the cells crossing the membrane.
3.10 Statistical analyses
We use R software (version 4.0.3) for routine statistical analyses. Some of the analysis data was also obtained from publicly available information bases and special software. Graph Pad (version 8.0.1) was also used for some of the graphs. The construction of the prognostic models relied on univariate Cox regression, multi-factor Cox stepwise regression and the LASSO algorithm. When p<0.05 was considered statistically significant in terms of.