Results of Linear Models and Principal Component Analyses
Table S1 (supplementary materials) gives details of all six comparisons made among the four subject groups (HAD vs. control, HIV vs. control, HIVE vs. control, HAD vs. HIV, HIVE vs. HAD, HIVE vs. HIV) across the three brain sectors (basal ganglia, frontal cortex, and white matter), as identified using linear models. In Table S1, genes up-regulated in a subject group relative to the comparison group are denoted by “1”, down-regulated genes by “-1”, and genes with no difference by “0”.
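The 1/-1/0 coding described for Table S1 lends itself to a simple tally per comparison. The following is a minimal sketch in Python, assuming the table has been loaded as a mapping from gene symbol to one code per comparison. The function names (`tally`, `shared_up`) are hypothetical, and the gene names and codes are placeholders, not values from Table S1.

```python
from collections import Counter

# Hypothetical excerpt of a decision table: one entry per gene, one
# code per comparison (1 = up, -1 = down, 0 = no difference).
# Placeholder values for illustration only, not taken from Table S1.
decisions = {
    "GENE_A": {"HIVE vs. control": 1, "HIVE vs. HAD": 1},
    "GENE_B": {"HIVE vs. control": -1, "HIVE vs. HAD": 0},
    "GENE_C": {"HIVE vs. control": 0, "HIVE vs. HAD": -1},
}

def tally(decisions, comparison):
    """Count up-, down-, and non-regulated genes for one comparison."""
    counts = Counter(codes[comparison] for codes in decisions.values())
    return {"up": counts[1], "down": counts[-1], "none": counts[0]}

def shared_up(decisions, comparison_a, comparison_b):
    """Return genes coded 1 (up-regulated) in both comparisons."""
    return sorted(
        gene for gene, codes in decisions.items()
        if codes[comparison_a] == 1 and codes[comparison_b] == 1
    )

summary = tally(decisions, "HIVE vs. control")
common = shared_up(decisions, "HIVE vs. control", "HIVE vs. HAD")
```

The same intersection logic is one way to obtain a list of genes that are up-regulated in two comparisons at once.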
Figure 2 shows a representation of the DE genes among the HIV, HAD, and HIVE groups compared to controls within the three brain sectors. According to Fig. 2, subjects with HIVE had the most DE genes in all three brain sectors. Many of the DE genes in the HIVE group were down-regulated within the frontal cortex, while many were up-regulated in the white matter and basal ganglia. Subjects with HIV, but without either HAD or HIVE, had 11 up-regulated genes in common with HIVE subjects in the frontal cortex. The details of these 11 genes can be found in Table S1. They are IFI44L*, DTX3L, GBP1*, IFIT3*, AQP1, MX1*, PLSCR1, RARRES3, SPARC, TIMP1, and SERPING1. Genes marked with * are interferon response genes (Boehm and others, 1997), which are related to cell functions in the immune system and autoimmunity. Under the above-mentioned DE gene selection criterion, subjects with HAD did not exhibit any DE genes relative to control subjects in any of the three brain sectors. This makes it harder to detect genomic differences between HAD patients and healthy subjects.
As per Table S1, CIRBP was the only gene that was down-regulated among HIVE compared to HAD subjects within the basal ganglia, while BTN3A3*, BTN3A2, IFIT1, IFIT2, IFIT3*, MX1, PARP14, HERC5, STAT1, and ISG15* were up-regulated among HIVE compared to HAD within the same brain region (refer to Table S1 for a full list of up-regulated genes). There were no down-regulated genes among HIVE compared to HIV in the basal ganglia, while BTN3A3*, GPNMB, BTN3A2, HERC5, CKLF, PARP14, IFIH1, and GUSBP11 were up-regulated. There were no up- or down-regulated genes among HAD compared to HIV in the basal ganglia.
In the frontal cortex, CDH13, CDK7, CITED2, TBR1, CIRBP, GCA, CMAS, RBM3, CTXN3, and SST were among the 145 down-regulated genes, while IRF9, BTN3A3*, CPQ, GBP1, IFIT2, IFIT3*, MX1, HERC5, PARP14, and ISG15* were among the 63 up-regulated genes, in HIVE subjects compared to HAD. CDH13, CITED2, TBR1, ORC6, PARM1, C3orf80, RBM3, CTXN3, SNRPB, and SST are some of the genes that were down-regulated, and RFT2, GAREM2, ATP10B, IGKC, LINC00320, FOXO4, and LIN28A are some of the genes that were up-regulated, among HIVE compared to HIV in the same brain region. No genes were up- or down-regulated among HAD compared to HIV.
The genes BTN3A3*, GPNMB, BTN3A2*, IFIT3*, LY6E, and ISG15* were up-regulated, whereas LRRCC1 and OPALIN were down-regulated, in HIVE subjects compared to HAD within the white matter. Surprisingly, there were no up- or down-regulated genes among HIVE or HAD compared to HIV within the white matter.
Figure 3(a) shows the distribution of subjects within each brain region with respect to the corresponding top three principal components. No clear groupings were observed among the four subject groups. Varimax rotation of the PCA (Kaiser, 1958) did not substantially change the original PCA patterns among subjects, as seen in Fig. 3(b).
Results of SAM Analysis
Table S2 (supplementary materials) contains DE genes identified through the SAM analysis. Similar to the linear model analysis, six comparisons were made among four subject groups within each brain sector, and the corresponding SAM score (d), fold change, and the q-value of each DE gene were reported. Here, we focused on the DE genes among HIVE compared to both HAD and HIV and HAD compared to HIV subjects with the intention of finding possible biomarkers to detect HAD (and hence HAND) in its early stages.
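For readers unfamiliar with the quantities reported in Table S2, the sketch below shows how a SAM-style score d and a fold change can be computed for a single gene. It follows the standard two-class SAM statistic, d = (mean difference) / (s + s0), where s is the pooled standard error and s0 is a small stabilizing constant. The expression values are placeholders, the log2 scale is an assumption, and s0 is passed in by hand here, whereas SAM software chooses it from the data. The q-values in Table S2 additionally require permutations across all genes, which this sketch does not attempt.

```python
import math

def sam_score(group_a, group_b, s0=0.0):
    """SAM-style relative difference d for one gene (a versus b)."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a = sum(group_a) / n_a
    mean_b = sum(group_b) / n_b
    # Pooled sum of squared deviations over both groups.
    ss = (sum((x - mean_a) ** 2 for x in group_a)
          + sum((x - mean_b) ** 2 for x in group_b))
    # Gene-specific standard error of the mean difference.
    s = math.sqrt((1 / n_a + 1 / n_b) * ss / (n_a + n_b - 2))
    return (mean_a - mean_b) / (s + s0)

def fold_change(group_a, group_b):
    """Fold change of a over b, assuming log2-scale expression values."""
    diff = sum(group_a) / len(group_a) - sum(group_b) / len(group_b)
    return 2 ** diff

# Placeholder log2 expression values for one gene in two subject groups.
hive = [8.0, 9.0, 10.0]
had = [5.0, 6.0, 7.0]

d_plain = sam_score(hive, had)             # no stabilizing constant
d_stabilized = sam_score(hive, had, s0=1.0)  # s0 shrinks the score
fc = fold_change(hive, had)
```

A larger s0 pulls d toward zero, which guards against genes whose large scores come only from a very small estimated variance.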
There were no down-regulated genes among HIVE compared to HAD subjects within the basal ganglia. In contrast, 58 genes were up-regulated among HIVE compared to HAD within the same brain sector. Some of those genes are ISG15, BTN3A2, BTN3A3, GPNMB, IFIT3, IFIT1, IFI44, IFIH1, IFI44L, GBP1, STAT1, IFIT2, HERC5, HERC6, GBP2, MX1, GBP3, and IRF9. The only down-regulated gene in HIVE compared to HIV was ZNF808. Some of the up-regulated genes in the same comparison were IGKC, GUSBP11, GPNMB, RGS1, RCC2, GBP3, BTN3A3, BTN3A2, IFIH1, IFI44, IFIT1, ISG15, CD14, STAT1, HERC5, and GBP1. The genes ZNF808, IFI6, IFIT3, and ISG15 were up-regulated in HAD compared to HIV within the same brain region, while no down-regulated genes were found.
Within the frontal cortex, 143 genes were down-regulated while 45 genes were up-regulated among HIVE compared to HAD subjects. LOC105370687, PCYOX1L, WDR54, PCSK1, CDH13, DHRS11, ENPP5, NPTX2, ORC6, STAT4, SST, PWAR5, CIRBP, CITED2, and RBM3 were some of the down-regulated genes, and IFIT2, SALL1, LGALS3BP, CPQ, GBP1, IFI44L, DTX3L, CYP2J2, PARP14, BST2, ISG15, BTN3A3, IFIT3, IRF9, STAT1, GBP3, FOXO4, IFIH1, MX1, and GPNMB were some of the up-regulated genes. UBE2T, PWAR5, and SST were the only down-regulated genes, and SALL1 and CDK18 the only up-regulated genes, among HIVE compared to HIV. BTN3A3, MX1, IFIT3, IFI44L, PLSCR1, STAT1, GBP1, IRF9, IFI6, GADD45A, GPNMB, BST2, and ISG15 are some of the down-regulated genes among HAD compared to HIV, and no genes were up-regulated between those two subject groups.
Within the white matter, 48 genes were down-regulated and 26 genes were up-regulated among HIVE compared to HAD subjects. OPALIN, CDR1, CIRBP, LOC105377335, GSTZ1, ALDH1A1, LINC01102, SEPT10, LRRCC1, and ZNF808 were among the down-regulated genes in HIVE subjects, and GPNMB, BTN3A3, ISG15, LY6E, IFIT3, BTN3A2, CMPK2, STAT1, IFIT1, IFIH1, GBP2, MX1, and HERC6 were among the up-regulated genes. OPALIN, LRRCC1, SEPT10, RBP1, CDR1, ALDH1A1, CNN3, and CIRBP were some of the down-regulated genes, while GPNMB, ISG15, SEPHS2, BTN3A3, IFIT1, IFIH1, RCC2, GATB, and BTN3A2 were some of the up-regulated genes, among HIVE compared to HIV within the white matter. HAPLN2, LOC101930085, PDE1A, LOC100128079, CHADL, ENPP6, GATB, and RSRP1 are some of the up-regulated genes among HAD compared to HIV, while no genes were down-regulated between them. The full list of genes that were up- and down-regulated among the different subject groups across the three brain sectors can be found in Table S2.
Results of RF Method 1
The three best RF models created using RF method 1 for the basal ganglia, frontal neocortex, and white matter had OOB error rates of 47.36%, 42.34%, and 47.36%, respectively. Because of the limited number of subjects in our sample, it was not possible to reduce the OOB error rates further. Genes within each brain sector were ranked according to the mean decrease in accuracy and are given in Table S3 (supplementary materials). The higher the mean decrease in accuracy associated with a gene, the more important that gene is in correctly predicting the subject group.
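The idea behind ranking genes by mean decrease in accuracy can be illustrated with a small permutation experiment: shuffle one gene's values across subjects, re-score the classifier, and record how much accuracy drops. The sketch below is a simplified stand-in, not the procedure used for Table S3. In a random forest the permutation is applied per tree on its out-of-bag samples, whereas here a fixed hypothetical rule (`predict`) is scored on a toy data set with placeholder values.

```python
import random

def accuracy(predict, rows, labels):
    """Fraction of rows whose predicted label matches the true label."""
    correct = sum(predict(row) == label for row, label in zip(rows, labels))
    return correct / len(labels)

def mean_decrease_accuracy(predict, rows, labels, feature, repeats=20, seed=0):
    """Average drop in accuracy after shuffling one feature column."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    drops = []
    for _ in range(repeats):
        column = [row[feature] for row in rows]
        rng.shuffle(column)  # break the link between feature and label
        permuted = [row[:feature] + [value] + row[feature + 1:]
                    for row, value in zip(rows, column)]
        drops.append(baseline - accuracy(predict, permuted, labels))
    return sum(drops) / repeats

# Toy data: feature 0 separates the classes, feature 1 is noise.
rows = [[0.1, 5.0], [0.2, 1.0], [0.3, 4.0], [0.4, 2.0],
        [0.9, 3.0], [0.8, 5.0], [0.7, 1.0], [0.6, 2.0]]
labels = ["A", "A", "A", "A", "B", "B", "B", "B"]

def predict(row):
    """Hypothetical classifier that only looks at feature 0."""
    return "A" if row[0] < 0.5 else "B"

informative = mean_decrease_accuracy(predict, rows, labels, feature=0)
irrelevant = mean_decrease_accuracy(predict, rows, labels, feature=1)
```

Shuffling the informative feature lowers accuracy, while shuffling the noise feature leaves it unchanged, which is why a larger mean decrease signals a more important gene.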
Figure 4 depicts the top 25 genes identified within the basal ganglia using RF method 1. Table S3 contains the genes ranked by mean decrease in accuracy, along with the corresponding fold changes and q values from the SAM analyses. This helps to assess the relative importance of each gene, based on RF method 1 and SAM, between every pair of subject groups. Although our discussion here is focused on identifying DE genes for the comparisons HIVE vs. HAD and HAD vs. HIV, readers can refer to Table S3 for other important relationships among the subject groups.
As mentioned in Sec 3.2, there were no down-regulated genes within the basal ganglia among HIVE compared to HAD. Among the top 25 genes identified by RF method 1, the genes IFIT2, IFIH1, LGALS3BP, HERC5, PSMB8, STAT1, BTN3A3, BTN3A2, MX1, and IRF9 were up-regulated at least 3-fold and had q values < 0.03 in HIVE compared to HAD subjects. Among those, BTN3A2, IFIH1, and STAT1 had at least a 6-fold change.
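The selection described above combines three criteria: an RF rank within the top 25, a minimum fold change, and a q-value cutoff. A minimal sketch of that filter is given below, assuming each gene has been reduced to one record per comparison. The record layout, the function name `select_genes`, and all values are hypothetical placeholders rather than entries from Table S3.

```python
# Placeholder records for one comparison; values are illustrative only.
sam_results = [
    {"gene": "GENE_A", "fold_change": 6.4, "q_value": 0.01, "rf_rank": 3},
    {"gene": "GENE_B", "fold_change": 3.2, "q_value": 0.02, "rf_rank": 12},
    {"gene": "GENE_C", "fold_change": 4.0, "q_value": 0.20, "rf_rank": 8},
    {"gene": "GENE_D", "fold_change": 9.1, "q_value": 0.00, "rf_rank": 69},
]

def select_genes(results, min_fold, max_q, top_n):
    """Keep genes ranked within top_n that pass both SAM thresholds."""
    return [
        record["gene"] for record in results
        if record["rf_rank"] <= top_n
        and record["fold_change"] >= min_fold
        and record["q_value"] < max_q
    ]

# At least 3-fold change, q < 0.03, within the top 25 RF ranks.
selected = select_genes(sam_results, min_fold=3.0, max_q=0.03, top_n=25)
# Stricter fold-change cutoff of 6.
strong = select_genes(sam_results, min_fold=6.0, max_q=0.03, top_n=25)
```

A gene with a large fold change but a rank outside the top 25 (GENE_D here) is excluded by this filter, so such genes have to be examined separately.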
IFIT3, ISG15, IFI6, and ZNF808 were the only four genes that were down-regulated in HAD compared to HIV, and none of them ranked within the top 25 of RF method 1. Among these four, IFI6 and ZNF808 were significantly down-regulated, with q values of 0.00. IFI6 is a protein-coding gene related to cytokine signaling in the immune system pathway, and it is associated with many diseases, including Hepatitis C (Chen, Li and Chen, 2016, Qi and others, 2017) and Dengue virus-2 (Qi and others, 2015). This gene was not DE between HIVE and HIV subjects. Nevertheless, it was significantly up-regulated (fold change: 9.12, q value: 0.00) among HIVE compared to HAD and hence needs further investigation as a potential biomarker for detecting HAD in its early stages (this gene was ranked #69 by RF method 1).
The top 25 genes identified by RF method 1 within the frontal cortex are given in Fig. 5. SST, which was ranked #7 by RF method 1, was significantly down-regulated, with a q-value of 0.00, among HIVE compared to HAD subjects. The genes MX1, MAP4K4, CPQ, SALL1, LGALS3BP, PARP14, and IFIT2 were significantly up-regulated, with q-values < 0.04, in the same comparison. In addition, MX1 and GADD45A were significantly down-regulated, with q-values of 0.00, among HAD compared to HIV. Since GADD45A was down-regulated only between HAD and HIV, it needs to be investigated further as a potential biomarker.
The top 25 genes identified within the white matter using RF method 1 are given in Fig. 6. BTN3A3, ISG15, IFIH1, MX1, and IFIT1, which were among the top 25 ranked genes, were significantly up-regulated by at least 4-fold, with q-values < 0.03, among HIVE compared to HAD subjects. Although not ranked within the top 25, RBP1, LRRCC1, OPALIN, CPAMD8, LOC100506114, CIRBP, and RBM3 were among the 48 down-regulated genes in HIVE compared to HAD. None of the other top 25 genes were significantly up- or down-regulated (q-value < 0.05) within the white matter for HIVE vs. HIV or HAD vs. HIV. A set of genes within the top 100 rankings, LOC101930085, LOC100128079, HAPLN2, and ENPP6, had at least a 2-fold change among HAD compared to HIV subjects, but with q values below 0.09 rather than below 0.05. Even though they were not significant at the 5% level, they may be worth examining further, as they were not significantly up- or down-regulated in any other subject group comparison. For example, HAPLN2 has been identified as playing a pivotal role in the formation of the hyaluronan-associated matrix in the central nervous system, which facilitates neuronal conduction and general structural stabilization within the white matter of mutant mice (Bekku and others, 2010). However, we did not find any detailed research on the other three genes.
Results of RF Method 2
Here, we present the results of our second random forest model, based on the most informative clusters identified using the GXNA tool. The corresponding rankings of genes within each cluster are also given in Table S3. Note that we present the results of two clusters for each brain sector.
Figure 7 depicts the gene expression patterns of the basal ganglia given by RF method 2. The two graphs relate to the top two ranking gene subnetworks, or clusters, identified through GXNA. One cluster had eight genes, and the other had just four. STAT1 and IRF9 from the first cluster and LGALS3BP from the second cluster were significantly up-regulated in HIVE compared to HAD, with q-values < 0.03. None of the genes within the two clusters were DE among either HAD or HIVE compared to HIV at q-value < 0.05.
Figure 8 depicts the gene expression patterns of the frontal cortex given by RF method 2. FOXO4 within the first cluster was significantly up-regulated, with a q-value of 0.04, and COL5A2 within the second cluster was significantly down-regulated, among HIVE compared to HAD within the frontal cortex. GADD45A within the first cluster was the only gene that was significantly down-regulated among HAD compared to HIV, with a q-value of 0.00. None of the genes within the two clusters were DE among HIVE compared to HIV.
Figure 9 depicts the gene expression patterns of the white matter given by RF method 2. RBP1 was significantly down-regulated, and MX1 was significantly up-regulated, within the white matter, with q-values < 0.04, among HIVE compared to HAD. RBP1 was also significantly down-regulated among HIVE compared to HIV subjects. Surprisingly, the bottom cluster did not show any DE genes in any of the subject group comparisons.
Model Validation
We evaluated all the random forest models that we developed on a testing data set, based on model accuracy, macro precision, and macro recall. The testing data set consists of one control, one HIV, two HAD, and one HIVE subject, for a total of five. Table 1 summarizes the model evaluations in terms of the OOB error rates on the training data set and the accuracy, macro precision, and macro recall on the testing data set. The resulting RF models have reasonable OOB error rates given the small sample size.
Table 1
Random Forest Model Evaluations
Brain Region | Model | OOB Error Rate (%) | Macro Precision | Macro Recall | Accuracy |
Basal Ganglia | RF Method 1 | 47.38 | 0.63 | 0.63 | 0.60 |
Basal Ganglia | RF Method 2 (Cluster 1) | 57.89 | 0.52 | 0.50 | 0.40 |
Basal Ganglia | RF Method 2 (Cluster 2) | 47.38 | 0.52 | 0.50 | 0.40 |
Frontal Cortex | RF Method 1 | 57.89 | 0.56 | 0.50 | 0.60 |
Frontal Cortex | RF Method 2 (Cluster 1) | 47.38 | 0.75 | 0.50 | 0.60 |
Frontal Cortex | RF Method 2 (Cluster 2) | 52.63 | 0.75 | 0.50 | 0.60 |
White Matter | RF Method 1 | 47.38 | 0.67 | 0.63 | 0.60 |
White Matter | RF Method 2 (Cluster 1) | 57.89 | 0.89 | 0.75 | 0.80 |
White Matter | RF Method 2 (Cluster 2) | 57.89 | 0.25 | 0.25 | 0.40 |
The other model evaluations are less convincing, with low accuracy, macro precision, and macro recall on the testing data set. It is important to note, however, that a single misclassification causes a large reduction in accuracy, precision, and recall because the testing sample is so small. In fact, one misclassification of the control, HIV, or HIVE subject can result in zero precision or recall for that class. Hence, it is important to test these RF models on a larger sample. Moreover, we have noted that RF methods 1 and 2 are likely to have resulted in incomparable accuracy, precision, and recall rates.
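The sensitivity of these metrics to a single error can be checked directly on a test set of the same composition (one control, one HIV, two HAD, one HIVE). The sketch below computes accuracy, macro precision, and macro recall for a hypothetical prediction vector in which only the HIVE subject is misclassified. The predictions are illustrative, not model output, and treating an undefined precision (a class that is never predicted) as zero is an assumption of this sketch, since other conventions exist.

```python
def macro_metrics(true_labels, predicted_labels):
    """Return (accuracy, macro precision, macro recall)."""
    classes = sorted(set(true_labels))
    pairs = list(zip(true_labels, predicted_labels))
    precisions, recalls = [], []
    for cls in classes:
        true_pos = sum(t == cls and p == cls for t, p in pairs)
        n_predicted = sum(p == cls for p in predicted_labels)
        n_actual = sum(t == cls for t in true_labels)
        # Assumption: precision is taken as 0 when the class is never predicted.
        precisions.append(true_pos / n_predicted if n_predicted else 0.0)
        recalls.append(true_pos / n_actual)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    macro_precision = sum(precisions) / len(classes)
    macro_recall = sum(recalls) / len(classes)
    return accuracy, macro_precision, macro_recall

# Test set composition from the text; predictions are hypothetical.
truth = ["control", "HIV", "HAD", "HAD", "HIVE"]
one_error = ["control", "HIV", "HAD", "HAD", "HAD"]  # HIVE called HAD

acc, prec, rec = macro_metrics(truth, one_error)
```

With four of five subjects classified correctly, accuracy is 0.80, yet macro recall falls to 0.75 and macro precision to about 0.67, because the single HIVE subject contributes a recall and precision of zero for its class.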
Discussion
Patients with HIV-1 infection display different levels of neurocognitive impairment; some show mild impairment, while others appear to be more severely impaired (Dufour and others, 2018, Levine and others, 2013). If the impairment is severe enough, it might trigger an inflammatory response in the brain that leads to cell death; with this come difficulties in performing fine motor tasks such as finger tapping. Our results provide insight into potential biomarkers for HIV patients with different severity levels of HAND.
Although we used gene expression data from the same brain specimens as Gelman et al. (Gelman et al., 2012), we applied a non-specific gene filter and hence ended up with a different number of DE genes within each brain sector. In contrast to our analysis, they saw a clustering pattern among the four subject groups after performing principal component analysis. This difference may be due to the effect of gene filtering.
Based on our analysis, the two brain regions that show the highest gene expression activity are the frontal neocortex and the basal ganglia. The frontal neocortex is involved in higher functions such as sensory perception, generation of motor commands, spatial reasoning, and conscious thought. The basal ganglia are responsible for motor function and other functions such as motor learning, executive functions, and emotions.
As per the linear model analyses, CIRBP was significantly down-regulated, while BTN3A3, IFIT3, MX1, HERC5, PARP14, and ISG15 were significantly up-regulated, among HIVE compared to HAD within both the basal ganglia and the frontal cortex. RBM3 was found to be down-regulated among the same subjects within the frontal cortex.
SAM analyses reveal that the genes CIRBP, RBM3, ZNF808, CDR1, THAP9-AS1, and LOC1001928307 were down-regulated among HIVE subjects compared to HAD within the frontal cortex and white matter, whereas IFI44L, BST2, ISG15, BTN3A3, IFIT3, STAT1, IFIH1, MX1, and GPNMB were up-regulated among HIVE subjects compared to HAD across all three brain regions. There were no common down-regulated genes across all three brain regions among HIVE and HIV, while GPNMB, ISG15, BTN3A3, LOC10041958, IFIT1, IFIH1, RCC2, and BTN3A2 were up-regulated within the basal ganglia and white matter. IFI6, IFIT3, and ISG15 were found to be down-regulated among HAD and HIV across the basal ganglia and frontal cortex. As the CIRBP, RBM3, GPNMB, ISG15, and IFIT6 genes were DE among the different subject groups with HAND, we see potential for them to be used as biomarkers to detect HAND, subject to further investigation.
Both CIRBP and RBM3 appear to play a role in the inflammatory response that leads to cell death in these patients and are associated with Alzheimer’s disease. They are very similar RNA-binding proteins that are up-regulated in response to hypothermia (low temperatures), and both appear to play similar roles in the regulation of numerous cellular events (Lanciego, Luquin and Obeso, 2012). CIRBP appears to activate the Akt and Erk pathways in neurons, which block mitochondrial apoptosis and prevent cell death during hypothermic conditions (Li and others, 2012, Zhang and others, 2015). The Akt pathway is activated through a kinase, phosphoinositide-3-kinase (PI3K); once activated, it enhances cell proliferation and survival (Hemmings and Restuccia, 2015). The Erk pathway is activated by the binding of a ligand to a receptor tyrosine kinase (RTK), which induces cellular proliferation and the activation of transcription factors that aid cell survival under certain conditions such as stress (McCain, 2013). Both pathways play a role in the protection of cells. Thus, we can conclude that CIRBP expression in the brain prevents neuron cell death during both hypothermic and oxidative stress conditions (Liu and others, 2015).
Similarly, RBM3 plays a neuroprotective role under mild hypothermic conditions; studies have shown that RBM3 expression is inversely related to neuronal apoptosis (Chip and others, 2011, Peretti and others, 2015). ISG15 activity is tightly regulated by specific signaling pathways that have a role in innate immunity. ISG15 has been identified as an interferon-stimulated gene, since its expression is induced in response to type I interferons or lipopolysaccharide treatment (Malakhova and others, 2002). GPNMB has been reported to be expressed in various cell types, including melanocytes, osteoclasts, osteoblasts, and dendritic cells, and it is overexpressed in various cancer types (Zhou and others, 2012). IFIT3 encodes an IFN-induced antiviral protein that acts as an inhibitor of cellular as well as viral processes, including cell migration, proliferation, signaling, and viral replication.
The gene GADD45A was consistently ranked among the top genes by both RF methods 1 and 2 within the frontal cortex. It was also found to be significantly down-regulated among HAD compared to HIV within the frontal cortex. This is a protein-coding gene whose transcript levels increase under stressful growth arrest conditions (Li and others, 2018). Hence, researchers should give priority to investigating its potential as a biomarker.
Through our analyses, we were able to identify potential biomarkers in patients with HAND that could help detect the development of neurocognitive impairment before it occurs. Hyperthermia has been shown to enhance HIV replication within the brain (Roesch and others, 2012); this is seen in patients with fever ranging from 38–40 ℃, and with it comes the inhibition of the RBM3 and CIRBP genes, both of which appear to play a neuroprotective role. The expression of RBM3, CIRBP, and GADD45A in HIVE patients differed significantly from that of patients with HAND, which is why these genes can serve as potential biomarkers for the early diagnosis of neurocognitive impairments associated with HIV.