The purpose of this study was to determine the codon usage bias (CUB) of the mitochondrial genes, to assess the similarities between the genes, and to carry out a phylogenetic analysis of the genes in two freshwater fish species, Channa striata and Channa punctata.
3.1. Analysis of CUB
To investigate the degree of CUB, we determined the effective number of codons (ENC) values for the mitochondrial genes of Channa striata and Channa punctata. In Channa striata, the values ranged from 39 to 61, with an average of 48.46, and in Channa punctata, the values ranged from 42 to 57, with an average of 48.23. The average ENC value in both species is therefore greater than 35, indicating a low CUB (Butt et al., 2014).
Deb et al. (2020) likewise found a mean ENC value > 35 in the different genomes of hepadnaviruses, which is consistent with our result for the two fishes.
3.2. Relative Synonymous Codon Usage (RSCU)
Fig. 1 shows the RSCU value of each codon and compares the two mt-genomes. In Channa striata, 8 codons were over-represented (RSCU value > 1.6) and 14 codons were under-represented (RSCU value < 0.6), whereas in Channa punctata, 9 codons were over-represented and 18 codons were under-represented (Table 1). Table 2 shows the preferred codons (RSCU value > 1) in the mt-genomes of Channa striata and Channa punctata.
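The RSCU thresholds used above can be illustrated with a minimal Python sketch. It assumes the usual definition of RSCU (the observed count of a codon divided by the mean count of all synonymous codons for the same amino acid). The function names, the toy sequence and the restriction to a single alanine codon family are hypothetical and are not taken from the analysis pipeline of this study.

```python
from collections import Counter


def count_codons(cds):
    """Count in-frame codons; a trailing partial codon is ignored."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


def rscu(counts, family):
    """RSCU of each codon in one synonymous family.

    RSCU = observed count / (total count of the family / family size).
    """
    total = sum(counts.get(codon, 0) for codon in family)
    if total == 0:
        return {codon: 0.0 for codon in family}
    expected = total / len(family)
    return {codon: counts.get(codon, 0) / expected for codon in family}


def classify(value):
    """Apply the thresholds quoted in the text (> 1.6 and < 0.6)."""
    if value > 1.6:
        return "over-represented"
    if value < 0.6:
        return "under-represented"
    return "neutral"


# Hypothetical toy data: alanine (GCN) codons only.
ALA = ["GCA", "GCC", "GCG", "GCT"]
toy_cds = "GCC" * 6 + "GCA" * 3 + "GCT" * 1
values = rscu(count_codons(toy_cds), ALA)
labels = {codon: classify(v) for codon, v in values.items()}
```

For a full analysis, the same function would be applied to every synonymous family of the vertebrate mitochondrial genetic code, which is not reproduced here.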
Table 1
Over-represented and under-represented codons of Channa striata and Channa punctata
Channa striata | | Channa punctata | |
Over-represented codons | Under-represented codons | Over-represented codons | Under-represented codons |
TCC | TCG | TCC | TCG |
CTA | AGT | CTC | AGT |
CCC | TTG | CCC | TTG |
CGA | CTG | CAA | CTG |
ACC | CCG | CGA | TGT |
AAA | CAG | GTC | CCG |
GCC | CGG | GCC | CAG |
GAA | CGT | GAC | CGG |
| TGG | GGC | CGT |
| ACG | | TGG |
| AAG | | ACG |
| GCG | | AAT |
| GAG | | AAG |
| GGT | | GTG |
| | | GCG |
| | | GAT |
| | | GAG |
| | | GGT |
Barbhuiya et al. (2020) previously reported in their study that 29 synonymous codons were frequently used in both sets of obesity genes and housekeeping genes, and these codons mostly ended with base C or G (Chakraborty et al., 2020).
Table 2
Preferred codons of Channa striata and Channa punctata
Preferred codons | |
Channa striata | Channa punctata |
TCA | TCA |
TCC | TCC |
TCT | TCT |
TTT | AGC |
TTA | TTC |
CTA | CTA |
CTC | CTC |
CTT | CTT |
CCA | TAC |
CCC | CCC |
CAC | CCT |
CAA | CAC |
CGA | CAA |
TGA | CGA |
ATA | CGC |
ATT | TGA |
ACA | ATA |
ACC | ATT |
AAC | ACA |
AAA | ACC |
GTA | AAC |
GCA | AAA |
GCC | GTC |
GAC | GCA |
GAA | GCC |
GGA | GAC |
GGC | GAA |
| GGC |
3.3. Compositional properties
Previous CUB studies indicated that the overall nucleotide composition of a gene influences synonymous codon usage (Jenkins and Holmes, 2003). In this study, we determined the overall nucleotide composition and the base composition at the third codon position of the genes in both species (Fig. 2). The proportions of T (29%) and G (16%) were nearly identical in Channa striata and Channa punctata, whereas the proportions of A (25% in Channa striata and 24% in Channa punctata) and C (30% in Channa striata and 32% in Channa punctata) differed slightly. The overall GC contents of Channa striata and Channa punctata were 45.82% and 48.02%, respectively, while the overall AT contents were 54.18% and 51.98%, indicating that the mitochondrial genes of these two fish species are AT-rich.
At the third codon position, C (34.19% in Channa striata and 38.68% in Channa punctata) was the most frequent base, followed by A (33.37% and 29.27%), T (23.46% and 22.86%), and G (8.98% and 9.18%). The overall GC3 contents were 43.18% and 47.88% for Channa striata and Channa punctata, respectively, and the AT3 contents were 56.82% and 52.12%, indicating that the third codon position is AT-rich in both genomes. GC content across codon positions followed the trend GC1 > GC3 > GC2 in the mitochondrial genes of the two genomes (Fig. 3). We analysed the correlation of ENC with the nucleotide contents, and the results are presented in Table 3. These correlation values suggest that base composition had an impact on CUB.
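The position-specific GC contents (GC1, GC2 and GC3) discussed above can be computed along the following lines. This is a minimal sketch that assumes an in-frame coding sequence given as a plain string; the function name and the toy sequence are hypothetical.

```python
def gc_by_position(cds):
    """Return overall GC% and GC% at codon positions 1, 2 and 3."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # drop a trailing partial codon
    result = {}
    for pos in range(3):
        bases = cds[pos:usable:3]  # every third base starting at pos
        gc = sum(1 for b in bases if b in "GC")
        result[f"GC{pos + 1}"] = 100.0 * gc / len(bases) if bases else 0.0
    coding = cds[:usable]
    gc_all = sum(1 for b in coding if b in "GC")
    result["GC"] = 100.0 * gc_all / len(coding) if coding else 0.0
    return result


# Hypothetical toy sequence: ATG GCC GCC TAA
composition = gc_by_position("ATGGCCGCCTAA")
```

The AT3 content follows as 100 minus GC3, provided that the sequence contains only the four unambiguous bases.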
Table 3
Correlation between ENC and base content of genes in Channa striata and Channa punctata
| | A% | T% | G% | C% | A3% | T3% | G3% | C3% | GC% | GC3% |
ENC | Channa striata | 0.06 | 0.04 | -0.22 | 0.18 | -0.21 | 0.33 | 0.06 | -0.14 | -0.08 | -0.54 |
ENC | Channa punctata | -0.01 | -0.04 | -0.22 | 0.24 | 0.09 | -0.14 | 0.19 | -0.07 | 0.02 | -0.09 |
**Significant at p < 0.01, *Significant at p < 0.05 |
In a study on the CO mitochondrial genes of amphibian orders, Barbhuiya et al. (2021b) reported that the base frequency followed the order A > T > C > G in Caudata and Gymnophiona, but A > C > T > G in Anura. They also found that the total AT content of these genes was higher than the GC content, implying AT richness of the CO genes. The GC content was highest in Anura and lowest in Caudata. Accordingly, Caudata had the highest AT content and Anura the lowest, while Gymnophiona had intermediate GC and AT contents among the orders.
3.4. Correspondence analysis (COA)
We performed a correspondence analysis (COA) of the sense codons for the mt genes of Channa striata and Channa punctata (Fig. 4). Both axes contribute substantially to the overall variation. In Channa striata, axis 1 accounted for 38.68% of the overall variation and axis 2 for 17.51%. In Channa punctata, axis 1 accounted for 43.05% of the overall variation and axis 2 for 16.70%. The red dots in Fig. 4 represent AT-ending codons, while the blue dots represent GC-ending codons. The close distribution of the codons around the axes suggests that mutational pressure might have played a role in shaping the CUB of the genes in the two mt-genomes (Wei et al., 2014b).
Parvin et al. (2020), in their study on the mitochondrial ND genes of amphibians, found that almost all codons lay very close to both axes. This indicated that mutation-induced compositional characteristics may have played a role in determining the CUB, which supports the results of our study on the mt genes of the two fish species (Barbhuiya et al., 2021a).
3.5. Parity rule 2 plot (PR2)
In PR2 analysis, an even distribution of points across the plot indicates that mutation alone shapes CUB, whereas an uneven distribution indicates the involvement of both natural selection and mutational pressure. We carried out this analysis to investigate the impact of evolutionary determinants on CUB (Fig. 5). We plotted AT-bias [A3/(A3 + T3)] on the vertical axis and GC-bias [G3/(G3 + C3)] on the horizontal axis and found an uneven distribution of points across the plot, which suggests that CUB was influenced by both evolutionary forces (Deb et al., 2018).
Barbhuiya et al. (2020) found an unequal distribution of GC and AT contents in their parity plot analysis of both obesity and housekeeping genes. They reported that not only mutational pressure but also natural selection, as an evolutionary force, might have had an impact in determining the CUB of the genes studied (Chakraborty et al., 2020).
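The two PR2 coordinates defined above, A3/(A3 + T3) and G3/(G3 + C3), can be computed per gene as in the following minimal sketch. It assumes that the third position of every in-frame codon is counted; restricting the counts to particular codon families would need an additional filter. The function name and the toy sequence are hypothetical.

```python
def pr2_bias(cds):
    """Return (GC-bias, AT-bias) from third codon positions.

    GC-bias = G3 / (G3 + C3)  (horizontal axis)
    AT-bias = A3 / (A3 + T3)  (vertical axis)
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # drop a trailing partial codon
    third = cds[2:usable:3]           # third base of every codon
    a3, t3, g3, c3 = (third.count(b) for b in "ATGC")
    at_bias = a3 / (a3 + t3) if (a3 + t3) else float("nan")
    gc_bias = g3 / (g3 + c3) if (g3 + c3) else float("nan")
    return gc_bias, at_bias


# Hypothetical toy gene: GCA GCT GCT GCG GCC GCC GCC
gc_bias, at_bias = pr2_bias("GCAGCTGCTGCGGCCGCCGCC")
```

A gene with no bias at the third position would lie at the centre of the plot (0.5, 0.5); the toy gene above lies well away from it.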
3.6. Interrelationships among the nucleotide compositions
The two evolutionary forces primarily responsible for differences in codon usage bias are natural selection and mutation pressure (Mazumder et al., 2018a, b). We used Karl Pearson’s method for the correlation analysis of the nucleotide compositional properties to identify the major forces influencing the CUB of the genes (Chen, 2013). We correlated the overall nucleotide contents with the nucleotide contents at the third codon position (Table 4) and observed significant correlations (p < 0.01 or p < 0.05) for most nucleotide pairs, implying that mutation pressure may have played a role in the CUB of the mt genes in the two fish species (Zhang et al., 2013a; Zhao et al., 2007).
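As an illustration of the correlation step, a Pearson coefficient between an overall base content and a third-position base content across genes can be computed as sketched below. The function name and the per-gene values are hypothetical and serve only to show the calculation; significance testing (the p-values reported in Table 4) is not covered by this sketch.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient of two samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-gene values (one entry per gene), for illustration only.
a_overall = [24.0, 26.0, 28.0, 30.0]
a_third = [30.0, 33.0, 36.0, 39.0]
r_a_a3 = pearson_r(a_overall, a_third)
```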
Table 4
Correlation study in Channa striata and Channa punctata between overall nucleotide contents and nucleotide contents in the third position of codons
| Channa striata | | | | Channa punctata | | | |
| A3% | T3% | C3% | G3% | A3% | T3% | C3% | G3% |
A | 0.85** | -0.67* | -0.82** | 0.70** | 0.85** | -0.46 | -0.72** | 0.49 |
T | -0.85** | 0.85** | 0.80** | -0.78** | -0.81** | 0.79** | 0.92** | -0.81** |
G | -0.90** | 0.73** | 0.96** | -0.85** | -0.70** | 0.76** | 0.94** | -0.85** |
C | 0.89** | -0.83** | -0.95** | 0.90** | 0.57* | -0.84** | -0.93** | 0.92** |
GC | -0.38 | -0.08 | 0.41 | -0.15 | -0.38 | -0.34 | -0.03 | 0.307 |
**Significant at p < 0.01, *Significant at p < 0.05 |
In an earlier study on codon usage trends in genes associated with obesity, Parvin et al. (2020) found highly significant positive correlations (p < 0.001) between A3 and A, T3 and T, G3 and G, and GC3 and GC, whereas most of the other nucleotide composition pairs showed significant negative correlations. This suggested that the CUB was most likely influenced by mutation pressure. However, there were also significant positive correlations (p < 0.001) between G and C3 and between GC and T3, which indicated that natural selection might have played a prominent role in determining the CUB of the obesity-associated genes. Furthermore, in the housekeeping genes, highly significant positive as well as negative correlations were observed between homogeneous and heterogeneous nucleotides, indicating that both evolutionary forces played a role in determining the codon usage pattern of the housekeeping genes (Chakraborty et al., 2020).
3.7. Neutrality plot
The neutrality plot is used to evaluate the effect of evolutionary forces as well as the association of GC3 with GC12 in the mt genes. A negative correlation between GC3 and GC12 (r = -0.19 for Channa striata and r = -0.16 for Channa punctata; p < 0.01) demonstrated the impact of directional mutation across all codons in our study (Sueoka, 1988). Furthermore, for the mt genes of both species, we plotted GC3 on the abscissa and GC12 on the ordinate and fitted a regression line (Fig. 6). In general, natural selection is considered more important than mutational pressure if the regression coefficient (RC) is less than 0.5, while mutational pressure is more important when the RC is greater than 0.5. We obtained RC values of -0.094 for Channa striata and -0.075 for Channa punctata, implying that natural selection played a greater role than mutational pressure in determining the CUB of the mt genes in the two fish species.
In a study on the molecular phylogeny of five silkworm mitochondrial genomes, Abdoli et al. (2022) found an RC value < 0.5 for GC12 on GC3 in all 13 mitochondrial genes, indicating that natural selection dominated over mutational pressure in shaping their CUB.
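The regression coefficient of the neutrality plot is the slope of an ordinary least-squares line of GC12 on GC3. A minimal sketch is given below; the function name and the per-gene values are hypothetical and are not the data of this study.

```python
def least_squares_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("slope is undefined when all x values are equal")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical per-gene values: GC3 on the abscissa, GC12 on the ordinate.
gc3 = [30.0, 40.0, 50.0, 60.0]
gc12 = [45.0, 44.0, 46.0, 45.0]
slope, intercept = least_squares_line(gc3, gc12)

# Interpretation rule quoted in the text: RC < 0.5 favours natural selection.
dominant_force = "natural selection" if slope < 0.5 else "mutational pressure"
```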
3.8. Nucleotide skewness
In this analysis, the average AT skew for Channa striata was -0.065 and the average GC skew was -0.308, whereas in Channa punctata the average AT skew was 0.092 and the average GC skew was -0.33. The coding sequences of both mt genomes indicated that A and G nucleotides were used less frequently than T and C (Wei et al., 2014a). Previous research on CUB suggested that nucleotide skewness played an important role in determining the CUB of genes or genomes (Deb et al., 2018). We calculated the correlation of ENC with nucleotide skews in Channa striata and found a positive correlation of ENC with AT skew (0.002), purine (PU) skew (0.213), and keto skew (0.426), but a negative correlation with GC skew (-0.220), pyrimidine (PY) skew (-0.094), and amino skew (-0.191). In Channa punctata, ENC was positively correlated with PU skew (0.212) and keto skew (0.434), but negatively correlated with GC skew (-0.237), AT skew (-0.001), PY skew (-0.148), and amino skew (-0.251). From these findings we could conclude that nucleotide skewness might have affected the CUB of the mt genes in the two fish species.
Deb et al. (2020) found greater usage of C over T and of G over A in their study of hepadnavirus. From the correlation analysis of ENC with nucleotide skews, they found a negative relationship of CUB with AT skew, GC skew, PY skew, PU-PY skew, PU skew and amino skew, while a positive correlation was observed with keto skew. Butt et al. (2016) recorded higher use of A over G and of C over T in the skew analysis of retroviral genomes. Chakraborty et al. (2019) found a significant correlation of codon usage with nucleotide skews in Nipah virus genes.
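For reference, AT skew and GC skew are conventionally defined as (A - T)/(A + T) and (G - C)/(G + C). The following minimal sketch assumes these standard definitions; the function names and the toy sequence are hypothetical, and the remaining skews (purine, pyrimidine, keto, amino) are not covered.

```python
def skew(x, y):
    """Generic skew (x - y) / (x + y); NaN when both counts are zero."""
    return (x - y) / (x + y) if (x + y) else float("nan")


def nucleotide_skews(seq):
    """AT skew and GC skew of a nucleotide sequence."""
    seq = seq.upper()
    a, t, g, c = (seq.count(b) for b in "ATGC")
    return {"AT_skew": skew(a, t), "GC_skew": skew(g, c)}


# Hypothetical toy sequence with A=2, T=3, G=1, C=3.
toy = nucleotide_skews("AATTTGCCC")

# Rounded overall percentages reported for Channa striata in Section 3.3
# (A 25%, T 29%, G 16%, C 30%).
striata_at_skew = skew(25, 29)
striata_gc_skew = skew(16, 30)
```

Applied to the rounded overall base percentages of Channa striata, the formulas give about -0.07 for AT skew and about -0.30 for GC skew, which is close to the per-gene averages reported above.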
3.9. Role of translational selection (P2)
P2 values are used to determine whether the genes are influenced by translational selection. The mean P2 values of the genes were 0.40 in Channa striata and 0.47 in Channa punctata. A mean P2 value of less than 0.5 indicated that translational selection was of minor importance in the two genomes (Chakraborty et al., 2019). Correlation analysis of the ENC and P2 values revealed a negative correlation (-0.207) in Channa striata and a positive correlation (0.017) in Channa punctata, implying a positive relationship between CUB and translational selection in Channa punctata (Deb et al., 2020).
Chakraborty et al. (2019) calculated the MRI and P2 values in their CUB study on Nipah virus and found a significant role of translational selection and mutational pressure. Deka et al. (2019) found a lower level of translational efficiency in the matrix protein genes (M1 and M2) of the influenza A virus. In their study of the latent-stage genes of Epstein-Barr virus, Karlin et al. (1990) reported that de-optimized codons compete less with the host cell's translation machinery.
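The P2 index is commonly defined as P2 = (WWC + SSU)/(WWY + SSY), where W is A or U, S is C or G, and Y is C or U. The source text does not state the formula, so this definition is an assumption of the following minimal sketch, as are the function name and the toy sequence.

```python
def p2_index(cds):
    """P2 = (WWC + SSU) / (WWY + SSY) over in-frame codons.

    W = A or U(T), S = C or G, Y = C or U(T).
    Sequences may be given in the DNA or the RNA alphabet.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # drop a trailing partial codon
    numerator = 0
    denominator = 0
    for i in range(0, usable, 3):
        first_two, third = cds[i:i + 2], cds[i + 2]
        is_ww = all(b in "AT" for b in first_two)
        is_ss = all(b in "CG" for b in first_two)
        if third in "CT" and (is_ww or is_ss):
            denominator += 1  # codon belongs to WWY or SSY
            if (is_ww and third == "C") or (is_ss and third == "T"):
                numerator += 1  # codon is WWC or SSU
    return numerator / denominator if denominator else float("nan")


# Hypothetical toy gene: AAC AAC AAT GCT GCC ATG
p2 = p2_index("AACAACAATGCTGCCATG")
```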
3.10. The mutation responsive index (MRI)
MRI is used to assess the effects of translational selection and mutational pressure on CUB (Chakraborty et al., 2019). A positive MRI value indicates the presence of mutational pressure, whereas a negative MRI value indicates the presence of translational selection. In our study, Channa striata had a mean MRI value of 1.14, while Channa punctata had a mean MRI value of -1.10. This suggests that mutation pressure acted on the mt-genome of Channa striata, while translational selection influenced the mt-genome of Channa punctata.
Chakraborty et al. (2017) found that translational selection dominated over mutational pressure in four Bungarus species, with average MRI values of -2.58 (B. candidus), -8.74 (B. flaviceps), -0.19 (B. fasciatus), and -1.44 (B. multicinctus). In contrast, Deb et al. (2020) found an average MRI value of 0.47, which showed a strong effect of mutation pressure on the CUB of the hepadnavirus genome.
3.11. Grand average of hydropathy (GRAVY)
In our study, the proteins encoded by the two mt-genomes had positive GRAVY (grand average of hydropathy) scores (0.725 in Channa striata and 0.735 in Channa punctata), which indicates that the mitochondrial proteins of the two fish species are predominantly hydrophobic in nature. This could be related to the maintenance of biological functions in the two fish species (Abdoli et al., 2022).
Previous research also found that amino acids such as cysteine, threonine and alanine are more abundant in the proteins encoded by the ATP6 and ATP8 genes in mammals, fishes and birds (Uddin et al., 2020).
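A GRAVY score is the mean hydropathy of all residues in a protein. The following minimal sketch assumes the Kyte-Doolittle hydropathy scale, which is the scale usually used for GRAVY; the source text does not name the scale, and the function name and the toy peptide are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(protein):
    """Mean Kyte-Doolittle hydropathy; non-standard symbols are skipped."""
    scores = [KYTE_DOOLITTLE[aa] for aa in protein.upper()
              if aa in KYTE_DOOLITTLE]
    if not scores:
        raise ValueError("no standard amino acids found")
    return sum(scores) / len(scores)


# Hypothetical toy peptide made of hydrophobic residues.
toy_score = gravy("MAIL")
```

A positive value indicates a hydrophobic protein on average and a negative value a hydrophilic one.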
3.12. Phylogenetic analysis of the mitochondrial genes of the two fish species
The results of the phylogenetic clustering of the mitochondrial genes of Channa striata and Channa punctata are presented in Fig. 7. In this cluster analysis, each gene of one species fell in the same cluster as the corresponding gene of the other species. We could therefore conclude that any particular mt gene in both fishes might have originated from the same ancestral gene.