1. Bioinformatics analyses
Based on our criteria of the microarray study described in the method section, GSE29431, GSE 10810, and GSE42568 studies were identified, and differential expression analysis revealed a total of 200 genes that had significant changes greater than two and Adj.p.Value less than 0.01 were selected from per dataset (Figure 1). Using a Venn diagram, 35 common genes were extracted from the mentioned datasets (Figure 2). These genes had significant expression changes in all three studies, therefore very likely that changes in their expression will be helpful in the pathogenesis of breast cancer for the detection of some genes whose expression levels have not been studied yet in detail. In the process of breast cancer, literature mining has been done, and 6 out of 35 genes, including MYZAP, BHMT2, PPP1R1A, PLAC9, LOC284825, and SPX, was obtained which their role in breast cancer has not been studied yet. Following to evaluation of assays that they support a potential role of the gene in breast cancer, according to the function of the protein expression of BMHT2 gene in the process of Homocysteine to Methyonin conversation and it is a potential role in the process of the Hyperhomocysteinemia that its importance in the breast cancer progression was confirmed before, investigation of the expression of this gene was selected to continue the study process (Figure 3, 4).
Functional enrichment analysis related to the 35 common genes were obtained, revealed the involvement of these genes mainly in response to peptide hormones and pathway enrichment analysis, introduced Glucagon signaling pathway as the top pathway (Figure 6).
Through the dpSNP web tool based on MAF between 0.1-0.5, the SNPs which had an effect on 3'UTR of the BHMT2 gene were obtained. RS10944 demonstrated the highest MAF score. Following the investigation of seed region and nucleotides around the SNP, hsa-miR-542-3p in the case of occurrence of SNP, it seems the nucleotide alteration affects the pattern of the miR function. Also, using the RNAhybrid database, the minimum free energy of miRNA and target hybridization was measured, and the results showed that in the case of occurrence of mutation led to the free energy of the miRNA became more negative, and the probability of binding increased (Figure 7). Investigation of the expression of miRNAs in GSE40525 was performed using the limma package in the R environment, and the expression level of hsa-miR-542-3p was reported to be significantly increased.
Using the Co-lncRNA database, based on TCGA information, analysis has been done to identify the relationship among genes and lncRNAs in breast tumor tissue compared to normal; the complete list of lncRNAs that had an effect on the BHMT2 gene was collected. Totally 130 lncRNAs were obtained, which among them, lncRNA called LINC00922 was selected based on the confirmation of its role in the process of breast cancer pathogenesis due to previous articles (according to the results of this study, the p-Value of this lncRNA in tumor samples is very acceptable and it was 0.0004). It is noticeable that this lncRNA was measured only by the RNAseq method in breast cancer tissue samples, and it was reported as increase expression. For the first time in the Isfahan BC population, the expression level in breast tumor tissue samples was examined by the real-time method in this study.
miRNA-mRNA interaction analysis by miRWalk online software revealed that BHMT2 could have a remarkable interaction with 9 hub miRNAs shown in (Figure 8).
Obtained results from in silico database predicted the association of rs10944 in BHMT2 3'UTR within has-mir542-3p and T allele is minor allele. Therefore, G to T replacement single nucleotide in the BHMT2 transcript could alter binding reaction (affinity), which might change BHMT2 regulation in transcription level. To prove this, based on the HRM method, we concerned allelic frequencies for rs10944. However, no significant differences were found between case and control.
Survival analysis of the TCGA RNA-seq data was performed by GEPIA2 online software. This analysis revealed that higher expression of LINC00922 is correlated with the lower survival rate of this lncRNA (Figure 9). This analysis can confirm the hypothesis that the gene is oncogenic in breast cancer.
2. Real time PCR expression data analysis
According to real-time PCR data analysis, BHMT2 had a significant down-regulation in the tumor samples compared to normal samples (P-Value: 0.0060). Also, this analysis revealed that lncRNA LINC00922 had a significantly high expression in Breast cancer samples (P-Value: 0.0264, Figure 10). Correlation analysis of the tumor expression data of BHMT2 and LINC00922 revealed that these two RNAs have no significant correlation in the Isfahan Breast cancer population (r: -0.1581, P. Value: 0.4605).
ROC analysis revealed that LINC00922 could be a prognosis biomarker for distinguishing Breast cancer samples from normal samples (AUC: 0.7205, P-Value: 0.0088, Figure 11).
3. Genotype frequency analysis for rs10944
Analysis of three different CC, CT, and TT genotypes of rs10944 SNP revealed that there is no significant difference between the frequency of these three genotypes in control and tumor samples (Table 3, Figure 12). P-Value was calculated by the Pearson chi-square test. The clinicopathological characteristics of the patients is provided in the Table 2.