Mitogenome structure and organization
The mitogenome (17,164 bp) of the endangered Nicobar treeshrew, T. nicobarica was determined in the present study (GenBank accession no. MW751815). The mitogenome contained 37 genes, comprising 13 PCGs, 22 tRNAs, 2 rRNAs, and a major non-coding CR. Among them, nine genes (nad6 and eight tRNAs) were placed on the negative strand, while the remaining 28 genes were placed on the positive strand (Table 1, Fig. 1). In the order Scandentia, the length of the Tupaiidae mitogenome varied from 16,183 bp (T. montana) to 17,164 bp (T. nicobarica). All Tupaiidae species showed the same gene arrangement as observed in typical vertebrate’s mitogenome48. The nucleotide composition of the T. nicobarica mitogenome was A+T biased (58.3%), as in all Tupaiidae species ranging from 58.3% (T. nicobarica) to 59.72% (T. tana) (Table 2). The AT skew and GC skew were 0.11 and -0.30 in the mitogenome of T. nicobarica. The comparative analysis showed that the AT skew ranged from 0.08 (T. minor and T. tana) to 0.11 (T. nicobarica) and the GC skew from -0.28 (T. minor, T. montana, T. splendidula, T. tana) to -0.30 (T. nicobarica) (Table 2). A total of 14 overlapping regions with a total length of 87 bp were identified in T. nicobarica mitogenome. The longest overlapping region (43 bp) was observed between the ATP synthase F0 subunit 8 (atp8) and ATP synthase F0 subunit 6 (atp6). Further, a total of 14 intergenic spacer regions with a total length of 68 bp were observed in T. nicobarica mitogenome with the longest region (33 bp) between tRNA-Asparagine (trnN) and tRNA-Cysteine (trnC) (Supplementary Table S3).
Protein-coding genes
The total length of PCGs was 11,410 bp in T. nicobarica, which represents 66.47% of the complete mitogenome. The nucleotide composition of the T. nicobarica PCGs was A+T biased (57.95%), as in all Tupaiidae species ranging from 57.95% (T. nicobarica) to 59.61% (T. tana) (Table 2). The AT skew and GC skew were 0.09 and -0.36 in the PCGs of T. nicobarica (Table 2). Most of the PCGs of T. nicobarica initiated with an ATG start codon; however, the ATC initiation codon was found in the NADH dehydrogenase subunit 2 (nad2), ATT in NADH dehydrogenase subunit 3 (nad3), ATA in NADH dehydrogenase subunit 5 (nad5). The TAG termination codon was used by six PCGs, TAA by four PCGs, AGA by Cytochrome oxidase subunit 1 (cox1), AGG by NADH dehydrogenase subunit 6 (nad6), and incomplete T(AA) by NADH dehydrogenase subunit 4 (nad4) respectively. The comparative study revealed that, most of the PCGs in other Tupaiidae species were initiated by ATG start codon and terminated by TAA stop codon (Supplementary Table S4). The analysis of mitogenome for detecting positive selection of PCGs assists to understand the influences of natural selection in evolution and protein function49,50. The comparison of synonymous (Ks) and nonsynonymous (Ka) substitution rates in PCGs, witnessed for Darwinian selection and adaptive molecular evolution51,52. It is reported that, for positive selection Ka/Ks > 1, for neutrality Ka/Ks = 1, and for negative selection Ka/Ks < 153. This approach has the benefit to reveal the natural selection acting on PCGs. Thus, to investigate the evolutionary rates between homologous gene pairs, Ka/Ks substitutions were calculated and compared with six Tupaiidae species. The average Ka/Ks values of 13 PCGs varied from 0.006 (cox1) to 0.153 (atp8) and resulted in the following order: cox1<cox3<cox2<atp6<nad3<cytb<nad1<nad4<nad6<nad4l<nad5<nad2<atp8 (Supplementary Table S5, Supplementary Fig. S1). Most of the PCGs show Ka/Ks values of <1, which indicated a strong negative selection among the studied Tupaiidae species, that reflects natural selection works against deleterious mutations with negative selective coefficients as highlighted general patterns in other vertebrates. The comparative RSCU analysis indicated a significant fall in the frequency of GCG codon in Alanine (Ala) was observed in T. nicobarica, T. montana, T. minor, T. splendidula, and T. tana, except in T. belangeri with CCG in Proline (Pro) (Supplementary Fig. S2).
Ribosomal RNA and transfer RNA genes
The total length of two rRNA genes of T. nicobarica was 2,519 bp, compared to a range from 2,508 bp (T. montana) to 2,520 bp (T. belangeri) among other Tupaiidae species in the present dataset. The AT content within rRNA genes was 58.56%, while the AT and GC skew were 0.22 and -0.09 respectively observed in T. nicobarica rRNAs (Table 2). A total of 22 tRNAs were found in the T. nicobarica mitogenome with a total length of 1,497 bp. In other Tupaiidae species, the length of tRNAs varied from 1,493 bp (T. minor) to 1,564 bp (T. belangeri). The AT content within tRNA genes was 60.86%, while the AT and GC skew were 0.11 and -0.12, respectively observed in T. nicobarica tRNAs (Table 2). Most of the tRNA genes were predicted to be folded into classical cloverleaf structures, except trnS1 (without DHU stem and loop) and trnK (without DHU loop) (Supplementary Fig. S3). The conventional pairings (A=T and G≡C) were observed in most of the tRNAs bases54; however, wobble base pairing was observed in the stem of 14 tRNAs (trnA, trnN, trnQ, trnE, trnC, trnG, trnL1, trnK, trnL2, trnP, trnS2, trnT, trnY, and trnW) (Supplementary Fig. S3). The wobble base pairing is a key feature of RNA structure and often substitutes the conventional base pairs due to thermodynamic stability. These characteristics play crucial functional roles in a wide range of phenomena55. Thus, the comparisons of tRNAs secondary structures are crucial for inferring the structural and functional features of the mitogenomes56.
Control regions
The CR of T. nicobarica was typically distributed with three functional domains: extended termination associated sequences (ETAS), central domain (CD), and the conserved sequence block (CSB), as observed in other mammalian mitochondrial CRs25,57. Although, the ETAS and CSB domains contain varying numbers of tandem repeats, the CD domain consists with highly conserved sequences. Hence, the pattern of CR was varied among different mammals, including Tupaiidae (Scandentia). The total length of T. nicobarica CR was 1,757 bp, compared to a range of 778 bp (T. splendidula) to 1,757 bp (T. nicobarica) in the present dataset. In the T. nicobarica CR, the AT and GC skew was 0.12 and -0.30 (Table 2). The CR is also involved in the initiation of replication and is positioned between trnP and trnF for most of the Tupaiidae including T. nicobarica. The ETAS domain was divided into two regions: ETAS1 (60 bp) and ETAS2 (67 bp), while the CSB domain was further divided into three regions: CSB1 (25 bp), CSB2 (17 bp), and CSB3 (18 bp). After CSB3, a six base pair (CGTACA) tandem repeats were found 60.3 times in T. nicobarica, while eight base pair (CACACATA) were found 23.8 times in T. belangeri (Fig. 2). Due to the short nucleotide length, no tandem repeats were found in other Tupaiidae species CRs. The structural features of CR play an important function in influencing transcription and replication in the mitochondrial genome58,59. The present study evaluated the genetic features of CR among the studied Tupaiidae species mitogenomes including T. nicobarica that will be helpful to speculate the evolutionary pattern of this group.
Phylogenetic relationship
The phylogenetic position of Scandentia is repetitively examined within the eutherian tree. The treeshrews are widely considered as living fossils due to their approximating ancestral lineages with primates60. Based on the anatomical evidence, Primates, Chiroptera, Dermoptera, and Scandentia were hypothetically within the superordinal clade Archonta without considering the paleontological or molecular evidence61,62. Later on, the phylogenetic position of Scandentia has been studied based on the complete mitochondrial DNA sequences of wider group of taxa and corroborated a closer relationship with Lagomorpha24,25,63. Further, multiple loci of mitochondrial genes has been assessed to check the phylogeny of treeshrews and diversification and the timescale of diversification in Southeast Asia15,27,64,65. The present ML and BA phylogeny clearly discriminate T. nicobarica from other congeners by complete mitochondrial genome and congruent with earlier evolutionary hypotheses (Fig. 3, Supplementary Fig. S4). Further, using four calibration points from earlier studies, the present mitogenome-based dating analysis indicates that, the Tupaiidae species (Scandentia) were diverged from Primates and Dermoptera during the Cretaceous period (90 to 104.9 MYA). The studied Tupaiidae species, including T. nicobarica were diverged during the Pliocene to Miocene epoch (2.8 to 20.8 MYR). The endemic Nicobar treeshrew, T. nicobarica was diverged from the common ancestor lineages of other Tupaiidae species during the Miocene epoch (11.4 to 18.8 MYR) (Fig. 3). Further, based on 16S rRNA genes (1667 bp), we evaluated the status of two known subspecies of T. nicobarica from the Great and Little Nicobar Islands. The T. nicobarica nicobarica and T. nicobarica surda showed cohesive clustering in the BA tree as compared with other species (Fig. 4). Both the subspecies depicted 11 variable sites and maintained less genetic distance (0.7%) with each other. The 16S rRNA based topology showed a sister relationship of T. nicobarica with T. javanica, distributed in Sumatra and Java.
Biogeographic connection and conservation implication
Due to the volcanic eruption in Miocene–Pliocene and subsequent irregular land bridges during the Pleistocene, several small to large island ecosystems were built up in South and Southeast Asia. The South and Southeast Asian mainland and islands have been formed by the convergence of three major crustal units of the earth, the Eurasian, Indo-Australian, and Pacific plates2. The tectonic drifts allowed multiple possibilities for animal dispersal and thus colonize many family members into the same or distant geographical distribution. Due to the adjacent biogeography realms, the biological connections between the Indian mainland and Southeast Asia were well documented66. However, the faunal diversity of the AN archipelago and their biotic networks is still anonymous in spectacular aspects. The bathymetric study evidenced that the well-developed seamounts have been detected on the Andaman seafloor, which extended up to Sumatra and Java Islands67,68. Considering the skeletal variation, the treeshrew species showed intraspecific variation depending upon their distribution in mainland and island ecosystems69. A recent molecular study also elucidates the biogeographic connection of smaller mammals in the AN archipelago with the Indo-Malayan and Sundaic realms3. The present mitogenome based phylogeny also manifested the close relationship of T. nicobarica with T. minor, T. tana, T. splendidula, and T. montana (distributed in Thailand, Peninsular and East Malaysia, Brunei Darussalam, Sumatra, and Indonesia) as compared with the widespread species T. belangeri known from South and Southeast Asia. Further, the single gene (16S rRNA) based phylogeny showed sister relationship of T. nicobarica with T. javanica (distributed in Sumatra and Java) as compared with other congeners.
Considering the conservation implication, the previous studies reported that, this arboreal mammal species confronted several threats due to the forest loss and fragmentation, and ongoing road construction from Galathia to Indira Point at the Great Nicobar Island70. Although the species is listed on the IUCN with decreasing population trend, it has not yet listed in the Indian wildlife (Protection) Act, 1972. Other than a single ecology and behavior study and a nest record, no ample assessment has been approached so far12,71.
Besides, the treeshrews species were considered as a significant model for studying hepatitis and influenza H1N1 viral infections19,20. A recent study characterized the genome sequence to demonstrate the genetic basis of signaling pathways in nervous and immune systems of the Chinese treeshrew and evidenced as a potential model for biomedical research21. We propose the whole genome sequencing of T. nicobarica is essential for improving our ability to predict the signaling pathways linked with many pathogenic microorganisms as well as able to develop potential mitigations programs in advance. As the population of this treeshrew is confined to the insular habitats in the Nicobar Islands, we propose an integrated approach with taxonomy, ecology, and further molecular studies to save this endemic species before it reaches to the brink of extinction.