Dataset of candidate genes associated with immune response to bovine mastitis
Our initial search on PubMed using the keywords “bovine,” “mastitis,” “immunity,” and “genes” generated 276 articles. Following filtering, a total of 109 articles were selected and added to the pipeline for candidate genes identification. These articles, including experimental, meta-analysis, and reviews mentioned 919 individual genes associated with host immune response to bovine mastitis. Of importance to our analysis are genes with multiple mentions, including CXCL8 and IL-6 referenced in 41 and 29 articles, respectively (Fig 1b). Comparing the 919 genes found during our search with the 20 genes generated from Genomatix, yielded 16 genes that were common to both search methods (Supplementary Fig 1). These genes are MYD88, CD4, IL-10, IFNγ, IL-4, ICAM1, CXCL8, TLR4, TNFα, IL-18, TLR2, CD86, CCL2, IL-6, CSF2, and CD14. The interactions between these candidate genes reveal TLR4 and TLR2 are considerably more linked to other genes (Fig 2), while CD4 on the other hand had the least interaction.
miRNAs bind to the candidate genes in bovine mastitis
Our analyses explored the complementarily binding of miRNAs to the 3’UTR, CDS, and 5’UTR regions of the candidate genes. We report a total of 768 miRNAs binding either the 3’ UTR, CDS, or 5’ UTR of the 16 target genes (Supplementary Table 2). With BEG, five miRNAs were uncovered to bind all three regions within the same gene: bta-miR-2338 and bta-miR-2433 to CD14, bta-miR-2373-3p to ICAM1, bta-miR-2328-3p and bta-miR-2356 to TLR4 (Fig. 3). In addition, a substantial number of miRNAs bound to two or more regions within the same gene (Fig. 3), Displaying all the miRNA predicted to bind to three or more target genes is Fig. 4, which also depicts whether the miRNA binds one (green), two (blue), or three (red) of the possible regions (3’UTR, 5’UTR, and CDS). The complete list of miRNAs binding three or more target genes is depicted in Supplementary Table 3. Likewise, we found six miRNAs; bta-miR-24-3p, bta-miR-149-5p, bta-miR-223, bta-miR-185, bta-miR-874, and bta-miR-328, commonly predicted by the three software as shown (Supplementary Fig 2; Supplementary Table 2).
lncRNAs interact with mRNA target inferring their functionalities
NONCODE database search revealed 22 bovine lncRNAs that matched our prediction criteria. Genomic location and strand sense for each lncRNAs are shown in Supplementary Table 4. From the analysis, we identified 20 lncRNAs that bind to our target mRNAs with a ndG < -0.05 (Fig 5 and Supplementary Table 4), with 8 of them targeting more than 13 of disease associated genes. Of the 8 lncRNA, six (NONBTAT001181.2, XR_003029725.1, NONBTAT010129.2, XR_003030515.1, XR_003033296.1, NONBTAT027932.1) targeted all of the16 genes, while NONBTAT027932.1, XR_003033296.1, and XR_003029725.1 could bind at least one target gene with an average ndG greater than -0.08. NONBTAT001181.2-IL-18 and XR_003029725.1 showed a high affinity for IL-18 with thresholds less than -0.1. Additionally, NONBTAT027932.1 could bind MYD88, CD4, ICAM1, CD14, TLR2, and TNFα with binding free energy less than -0.1 (Supplementary Table 4). Eight lncRNAs that bind 13 or more candidate genes with lower ndGs were selected for network and conservational analysis (Table 3). Most of the identified lncRNAs are not yet annotated, therefore, we inferred their gene ontology and pathways based on their potential mRNA targets/genes. Identified ontologies and associated pathways are shown in Table 4 and Figure 6a and 6b.
Biological, molecular and cellular function of target genes and pathway analysis
The pathway analysis of 16 target genes identified 25 biological processes (Fig 6a). Myeloid leukocyte activation with a p-value of 4.92E-13 vascular endothelial growth factor (GF) production, regulation of chemokine (C-X-C motif) ligand 2 production and lipopolysaccharide (LPS)-mediated signaling pathway (p-value; 2.58E-13) (Table 4). Of importance are the three significant biological processes concomitantly identified from the three databases including LPS-mediated signaling pathway, regulation of IL-23 production, and positive regulation of chemokine production. DAVID analysis identified eight biological processes, three molecular functions and one cellular component, while 11 disease pathways were identified including malaria and rheumatoid arthritis that featured most candidate genes during bovine mastitis with p-values of 9.20E-19 and 4.60E-16, respectively (Fig 6b; Supplementary Table 5).
Catalog of transcription factors, biological processes and molecular function
Seventeen TFs were identified as significant by at least two software (Table 5; Supplementary Table 6). All 17 TFs appear to have the potential to regulate the transcription of the 16 genes including CD4, MYD88, ICAM1, TLR4, TLR2, IL-18, and CD86.The genomic location and other details of the 17 TFs are presented in Table 5. In addition, functional analysis identified 3 biological processes, 2 molecular functions, and 1 cellular component involving two to three TFs, with a p-value less than 0.0005 (Table 6).
Sequence conservation of bovine miRNA and lncRNAs in other species
Sequence analysis was performed to assess the evolutionary conservation of bovine lncRNAs and miRNAs across various species (Supplementary Table 1). We examined the significant miRNA (6) and lncRNA (8) identified previously with their homologous sequence from 14 other species. Our analysis of the six miRNAs reveal bta-miR-223 to be highly conserved in all 15 species, without a single deviant polymorphism (Supplementary Fig 3). The evolutionary closeness of the cow and wild yak is also seen in bta-miR-149-5p, bta-miR-328, and bta-miR-185, but are distantly related in bta-miR-24-3p and bta-miR-874 (Supplementary Fig. 4a-c). The phylogenetic analysis of bta-miR-223 defines cow having the closest identity with wild yak and the remaining ruminants (goat and sheep) as orthologs (Supplementary Fig 4b). The alignment of bta-miR-874 showed high conservation between cow, mouse, and megabat, with phylogenetic analysis having these three sharing a common node, while no other tree had these three species this close in proximity.
The length of lncRNA allow for greater variation within the gene throughout evolution. The evolutionary alignment analysis revealed that each lncRNA was not completely conserved in the 15 species; however, the majority have similar base pair sequences in conserved regions. Out of the 8 lncRNA, 7 have bovine lncRNA to be most closely related to wild yak, followed by other ruminants (Supplementary Fig. 5a-d). The tree of NONBTAT027932.1 show wild yak to be relatively distant from cow, sheep, and goat compared to the species relation in the other lncRNA analysis. Five out of eight lncRNA have human, gorilla, and chimpanzee clustered with the same origin.
Dysregulation of lncRNA-miRNA interaction on immune response in bovine mastitis
Our analysis showed several miRNA potential binding sites on lncRNA, based on the complementary base pairing and normalized binding free energies (ndG) values. Of interest, we found 8 out of 22 lncRNAs possessing binding sites for the entire length of these miRNAs, with the exception of 4 out of the 48 pairings (Fig 7). The ndG values range from -0.1774 to -2.875. lncRNA XR_003033296.1 and bta-miR-223 had the greatest ndG while lncRNA XR_234647.4 and bta-miR-328 had the lowest ndG. The lower the ndG, the greater the possibility that the two RNA will bind. The vast majority of miRNA bind their entire length to the lncRNA except four, the shortest binding length is 10 (out of 22 base pairs) between bta-miR-223 and XR_234647.4. Additional results on lncRNA-miRNA binding are presented in Supplementary Table 7.
Combined network and molecular interactome of lncRNAs, miRNAs, TFs and immune genes
Figures 8 through 11 show molecular interactions between miRNAs, lncRNAs, TFs, target genes and pathways. Figure 8 in particular show the connections between the bovine mastitis target genes, miRNA, and the pathways/biological processes of the genes. CD4 (36) and TLR4 (36) have the most miRNA targeting them, as opposed to IL-4, CSF2, and IL-18 (one, two and two miRNAs respectively). Some miRNAs are seen targeting more than one gene such as bta-miR-2294 (targeting CD4 and CD86) and bta-miR-2374 (targeting TLR4 and CD4). Interestingly, we found some miRNAs that bind to multiple genes involved in the same pathway, such as bta-miR-671 targeting TLR4 and IFNγ, both of which are involved in macrophage activation; bta-miR-2413 targeting IFNγ, IL-10, and TLR4, all of which are involved in the regulation of nitric oxide biosynthetic process.
The functional annotation of each lncRNA was inferred from the target genes using GO Enrichment Analysis and the DAVID KEGG algorithm program. Several of these lncRNAs were involved in multiple ontologies (Figure 9). Specifically, NONBTAT001181.2 could bind to IL-6 to regulate its expression, thereby perturbing functional pathways like vascular endothelial growth factor, regulation of cytokine production, which are involved in inflammatory response, and regulation of tyrosine phosphorylation of STAT protein pathways. LncRNA XR_003030515.1 targets TLR4 and is connected to cellular response to lipoteichoic acid, macrophage activation, NAD+ nucleosidase activity, and regulation of interleukin-23 production. NONBTAT027932.1 lncRNA targets CD14 and is involved in positive regulation of NIK/NF-κB signaling, regulation of interleukin-8 production, regulation of cytokine secretion, LPS-mediated signaling pathway, lipopeptide binding, and LPS receptor complex (Figure 9). Figure 10 shows the interconnections between the 17 TFs, 16 target genes, and the pathways they are involved in. Both CREB1 and JUN have numerous connections directed toward the target genes and pathways.
The final immunoregulatory integrated network in our analysis contains miRNAs, lncRNAs, TFs, target genes, and biological pathways and their extensive interconnections (Fig 11). We showed an important connectivity between lncRNA, miRNA, bovine mastitis immune genes and that their biological functions have all been mapped on the network. The outer circle contains mostly miRNAs, with the exception of 3 TFs, that bind only to a single target gene. The inner circle consists of miRNAs binding to more than one target gene, lncRNAs, TFs, target genes, and biological pathways. LncRNA are found densely in the top inner circle and can be sporadically located in the outer circle. Our results indicate that the majority of lncRNAs, miRNAs, and TFs interactions overlapped and were involved in several pathways including biological processes, cellular components, and molecular functions.