Genome size and sequence data
Flow cytometry revealed a wide range of genome sizes in Tinospora sagittata (Table 1), from 1.01 Gb in T. sagittata var. craveniana to 6.06 Gb in T. sagittata var. yunnanensis, a nearly sixfold difference. The flow cytometry results were not supplemented by conventional chromosome counts to confirm ploidy levels. However, Alami et al. [26] reported a monoploid genome size of approximately 553.23 Mb for T. sagittata, with an assembled tetraploid genome of 2.33 Gb. Therefore, T. sagittata var. yunnanensis, whose genome size based on flow cytometry is six times larger than that of T. sagittata var. craveniana, was recognized as a hexaploid, and T. sagittata var. craveniana was recognized as a diploid.
Illumina sequencing of the two accessions generated paired-end clean reads (average read length 150 bp), ranging from 115.88 million reads for Tinospora sagittata var. craveniana to 135.06 million reads for T. sagittata var. yunnanensis, for totals of 17.35 Gb and 20.20 Gb, respectively (Table 1). PacBio HiFi sequencing generated 25.00 Gb (average read length 15,745 bp) for the hexaploid T. sagittata var. yunnanensis and 13.40 Gb (average read length 14,584 bp) for the diploid T. sagittata var. craveniana (Table 1). The average coverage was 17.52× for short reads and 13.53× for long reads in T. sagittata var. craveniana, and 3.33× for short reads and 4.13× for long reads in T. sagittata var. yunnanensis.
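The coverage values follow from dividing the total sequenced bases by the flow cytometry genome size. The following is a minimal Python sketch of that arithmetic for T. sagittata var. yunnanensis; the function name is hypothetical and the inputs are the rounded values quoted above.

```python
def sequencing_depth(total_bases_gb, genome_size_gb):
    """Average depth = total sequenced bases / estimated genome size."""
    return total_bases_gb / genome_size_gb

# Rounded values quoted in the text for T. sagittata var. yunnanensis
genome_gb = 6.06
print(round(sequencing_depth(20.20, genome_gb), 2))  # short reads: 3.33
print(round(sequencing_depth(25.00, genome_gb), 2))  # long reads: 4.13
```

The same division applied to the rounded values for T. sagittata var. craveniana gives depths slightly below those reported, so the reported figures were presumably computed from unrounded or slightly different size estimates.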
Characteristics of organelle genomes
The complete mitogenomes of Tinospora sagittata var. craveniana and T. sagittata var. yunnanensis were obtained by combining Illumina and PacBio reads and visualized using Bandage (Fig. S1a, b). The assembled circular mitogenomes had total lengths of 513,210–513,215 bp and a GC content of 48% (Fig. 1a, b). A total of 70 genes were annotated in each mitogenome, including 38 protein-coding genes (PCGs), 6 rRNAs, and 26 tRNAs (Fig. 1a, b; Table 2). Among the PCGs, only rps19 had two copies (Fig. 1a, b; Table 2). Furthermore, three rRNA genes (rrn5, rrn18, rrn26) and four tRNA genes (trnF-GAA, trnfM-CAU, trnS-GCU, trnW-CCA) also had two copies (Fig. 1a, b; Table 2). Eight PCGs and one tRNA contained at least one intron (Fig. 1a, b; Table 2).
The chloroplast genomes displayed the typical quadripartite structure, consisting of a pair of inverted repeats (IRA and IRB) separated by a large single-copy (LSC) region and a small single-copy (SSC) region (Fig. 1c). The plastomes of the two Tinospora accessions showed little variation in size (163,621–164,006 bp) and GC content (38%) (Fig. 1c; Table S1). We observed only marginal variation in the LSC (92,488–93,027 bp), SSC (20,203–20,909 bp), and IR (25,112–25,388 bp) regions (Table S1). The GC contents of the LSC, SSC, and IR regions were 35.50–35.70%, 32.00–32.40%, and 43.40–43.50%, respectively (Table 3). The plastomes of both accessions encoded an identical set of 130 genes, 16 of which were duplicated in the IR regions. The 130 genes comprised 85 PCGs, 37 tRNA genes, and eight rRNA genes (Fig. 1c). It is worth mentioning that the 5′-end exon of the rps12 gene was located in the LSC region, whereas the 3′-end exon was situated in the IR region. In addition, six tRNA genes (trnA-UGC, trnG-UCC, trnI-GAU, trnK-UUU, trnL-UAA, and trnV-UAC) and nine protein-coding genes (atpF, ndhA, ndhB, petB, petD, rpl2, rpl16, rpoC1, and rps16) contained a single intron, and three genes (rps12, clpP, and ycf3) contained two introns (Fig. 1c).
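Because the plastome is quadripartite, its total length should equal LSC + SSC + 2 × IR. The following Python sketch checks this using the range endpoints quoted above. The pairing of endpoints into two accessions is an assumption inferred from the arithmetic (these are the only combinations that reproduce the reported totals), not a reading of Table S1, and the function name is hypothetical.

```python
def plastome_length(lsc, ssc, ir):
    """Total length of a quadripartite plastome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir

# Assumed pairings of the reported range endpoints (bp)
print(plastome_length(92_488, 20_909, 25_112))  # 163621
print(plastome_length(93_027, 20_203, 25_388))  # 164006
```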
Collinearity and repeat sequences
To explore rearrangements and conserved sequence blocks within the mitogenomes, we used Mauve to analyze their collinear regions (Fig. S1c). Comparing the two mitogenomes (Tinospora sagittata var. craveniana and T. sagittata var. yunnanensis), we identified five locally collinear blocks (LCBs) (Fig. S1c). We further found a translocation of ca. 11 kb (including the nad1 and ccmB genes), as well as two inversions, one from trnW-CCA to mttB (ca. 73 kb) and one from trnH-GUG to rrn5 (ca. 82 kb) (Figs. 1a, b, S1c).
We identified numerous SSRs in the plastomes and mitogenomes (Fig. 2a, Table S2). Within the plastomes, mononucleotide repeats (163 of 287 in Tinospora sagittata var. craveniana; 169 of 272 in T. sagittata var. yunnanensis) and dinucleotide repeats (85 of 287 in T. sagittata var. craveniana; 90 of 272 in T. sagittata var. yunnanensis) were the most abundant, and the other repeat types were less abundant (Fig. 2a, Table S2). Most mitogenome SSRs of T. sagittata var. yunnanensis were dinucleotide repeats (199 of 442) and mononucleotide repeats (140 of 442), followed by tetranucleotide repeats (69) and trinucleotide repeats (28). In contrast, most mitogenome SSRs of T. sagittata var. craveniana were dinucleotide repeats (202 of 330), followed by tetranucleotide repeats (67), trinucleotide repeats (30), and mononucleotide repeats (24) (Fig. 2a, Table S2). Pentanucleotide and hexanucleotide repeats were the least abundant in both plastomes and mitogenomes (Fig. 2a, Table S2). REPuter identified 50 dispersed repeats in each organelle genome (Fig. 2b, Table S3). The dispersed repeats in the plastomes comprised four types: 26–29 forward repeats, 8–12 reverse repeats, 10–12 palindromic repeats, and 1–2 complement repeats (Fig. 2b, Table S3). In the mitogenomes, 20–22 forward repeats and 28–30 palindromic repeats were detected, but no reverse (R) or complement (C) repeats (Fig. 2b, Table S3). The A/T mononucleotide was the most abundant SSR type in the plastomes, whereas the AG/AT type was the most abundant in the mitogenomes (Fig. 2c, Table S4).
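To illustrate how SSRs of each motif length can be counted, the following is a minimal regex-based Python sketch. It is not the tool used in this study, and the minimum repeat thresholds in MIN_REPEATS are hypothetical values chosen for illustration only; the thresholds actually applied are not restated in this section.

```python
import re

# Hypothetical thresholds: motif length -> minimum number of tandem copies
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}

def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, copies) tuples for perfect tandem repeats."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # A motif of unit_len bases followed by at least min_rep - 1 more copies
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            unit = m.group(2)
            # Skip motifs that are themselves repeats of a shorter motif
            # (e.g. "AA" or "ATAT"), which belong to the shorter class.
            if any(unit == unit[:k] * (unit_len // k)
                   for k in range(1, unit_len) if unit_len % k == 0):
                continue
            hits.append((m.start(), unit, len(m.group(1)) // unit_len))
    return sorted(hits)

example = "GC" + "A" * 12 + "CG" + "AT" * 6 + "GG"
print(find_ssrs(example))  # [(2, 'A', 12), (16, 'AT', 6)]
```

Grouping the returned motifs by length gives per-class counts comparable in form to those in Fig. 2a, although the absolute numbers depend entirely on the thresholds chosen.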
Protein coding gene codon usage and comparative analysis of genomic variation
There were significant differences in codon usage bias among the genes of the Tinospora organelle genomes (Fig. 3a, Table S5). In general, the two organelle genomes shared the same most frequent codon for each amino acid (Fig. 3a, Table S5). A total of 28 codons (UCC, UUU, AAA, UCA, GUU, GUA, CCA, UGU, UUG, CCU, CGU, GGU, UUA, UAA, CGA, AUU, GAA, ACU, AAU, CUU, UCU, AGA, GAU, GGA, CAA, CAU, UAU, GCU) had RSCU > 1 in both plastomes and mitogenomes, of which 15 ended with U and 11 ended with A (Fig. 3a, Table S5). Two codons, AUG and UGG, encoding methionine (Met) and tryptophan (Trp), respectively, had an RSCU value of 1 in both plastomes and mitogenomes (Fig. 3a, Table S5). Notably, two codons (ACC and UGA) had RSCU > 1 in the mitogenomes but RSCU < 1 in the plastomes (Fig. 3a, Table S5).
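RSCU is the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The following Python sketch shows the calculation under the assumption of the standard genetic code, with stop codons treated as one synonymous family; the function name and the toy sequence are hypothetical. It also shows why single-codon amino acids such as Met (AUG) and Trp (UGG) always have RSCU = 1.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption)
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def rscu(cds_list):
    """Return {codon (RNA alphabet): RSCU} for families observed at least once."""
    counts = Counter()
    for cds in cds_list:
        seq = cds.upper().replace("U", "T")
        for i in range(0, len(seq) - len(seq) % 3, 3):
            codon = seq[i:i + 3]
            if codon in CODON_TABLE:  # ignore codons with ambiguous bases
                counts[codon] += 1
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)
    result = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue
        expected = total / len(codons)  # equal usage within the family
        for c in codons:
            result[c.replace("T", "U")] = counts[c] / expected
    return result

values = rscu(["ATGTTTTTTTTCTGG"])  # toy CDS: Met, Phe, Phe, Phe, Trp
print(values["AUG"], values["UGG"])  # 1.0 1.0
```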
Comparative analysis of nucleotide polymorphism showed that variability was higher in the plastomes than in the mitogenomes (Fig. 3b, c). The Pi values ranged from 0 to 0.40 in the plastomes and from 0 to 0.02 in the mitogenomes (Fig. 3b, c). In the mitogenomes, five hypervariable regions with Pi > 0.016 were identified: trnS–rpl10, rps3_exon2–rps3_exon1, trnP–nad5_exon5, ccmFN–rps2, and nad4_exon3–nad4_exon4 (Fig. 3b). In contrast to the mitogenomes, the plastomes generally showed lower Pi values in the IR regions, and four regions (trnV_exon2–ndhC, trnL_exon1–trnT, trnG–psbZ, trnK_exon2–psbA) were highly variable hotspots in the LSC and SSC (Fig. 3c).
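Nucleotide diversity (Pi) is the average proportion of sites that differ between pairs of aligned sequences, usually reported per window along the alignment. The following is a minimal Python sketch of a sliding-window calculation; it is not the software used in this study, and the window and step sizes in the example are hypothetical. With only two sequences, Pi in each window reduces to the proportion of differing sites.

```python
from itertools import combinations

VALID = set("ACGT")

def pairwise_pi(aligned):
    """Mean proportion of differing sites over all pairs of aligned sequences."""
    per_pair = []
    for a, b in combinations(aligned, 2):
        compared = differed = 0
        for x, y in zip(a.upper(), b.upper()):
            if x in VALID and y in VALID:  # skip gaps and ambiguity codes
                compared += 1
                differed += x != y
        if compared:
            per_pair.append(differed / compared)
    return sum(per_pair) / len(per_pair) if per_pair else 0.0

def sliding_pi(aligned, window, step):
    """Return (window start, Pi) for each full window along the alignment."""
    length = len(aligned[0])
    return [
        (start, pairwise_pi([s[start:start + window] for s in aligned]))
        for start in range(0, length - window + 1, step)
    ]

toy = ["ACGTACGT", "ACGTACGA"]  # toy alignment, not study data
print(sliding_pi(toy, window=4, step=4))  # [(0, 0.0), (4, 0.25)]
```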
Selective pressure analysis of organelle genome CDS
To assess the evolutionary pressure on the protein-coding sequences (CDS) of the two organelle genomes, we calculated their Ka/Ks values (Fig. 4). Nearly all Ka values in both the plastomes and the mitogenomes were below 0.3 (Fig. 4a). The Ks values of both organelle genomes and the Ka/Ks values of the plastomes ranged from 0 to 1.0 (Fig. 4b, c). There was no significant difference in Ka values between the two organelle genomes (Fig. 4a), but Ks values and Ka/Ks values differed significantly between them (Fig. 4b, c). In addition, there were no significant differences in Ka, Ks, or Ka/Ks values for the same organelle genome among different species (Fig. 4).
Phylogenetic analysis of the mitogenome
Our phylogenetic inference included 108 species representing 18 orders of angiosperms (Table S2; Fig. 5). After careful checking of the alignment to avoid the influence of long-branch attraction, the combined dataset was 32,927 nucleotides in length, and ModelFinder identified GTR + F + I + G4 as the best-fit model of evolution for both maximum-likelihood (ML) and Bayesian inference (BI) analyses (Table S3). The ML and BI analyses produced trees sharing the same general topology, including 18 main clades, each with high statistical support (Fig. 5). At the order level, the overall phylogenetic tree closely aligned with the APG IV system (Fig. 5). Amborellales was the first-diverging lineage, and the remaining orders formed two superclades (Fig. 5). One superclade consisted of Alismatales, Pandanales, Asparagales, Arecales, and Poales (Fig. 5). The other superclade consisted of Asterales, Solanales, Lamiales, Ericales, Fabales, Fagales, Rosales, Brassicales, Myrtales, Malpighiales, Proteales, and Ranunculales. In addition, the two species of Ranunculaceae formed a well-supported monophyletic clade that was sister to the newly sequenced species of Menispermaceae with full support (ML-BS = 100, BI-PP = 1.0; Fig. 5).