Chloroplast Genome Features of L. japonica cv. Damaohua. The chloroplast genome of L. japonica cv. Damaohua was 155,151 bp in length and had a typical quadripartite structure consisting of an LSC region (88,924 bp), an SSC region (18,649 bp) and two IRs (IRA and IRB, 23,789 bp each) (Fig. 1). The genome contained 127 functional genes, including 80 protein-coding genes, 8 ribosomal RNA (rRNA) genes and 39 transfer RNA (tRNA) genes. The overall nucleotide composition was 30.2% A, 31.2% T, 19.6% C and 19.0% G, giving a total GC content of 38.6% (Table 1). A total of 16 genes were found to be repeated in the IR region, including 10 protein-coding genes (atpF, ndhA, ndhB, rps12, rps16, rps18, rpl2, rpoC1, ycf2, and ycf3) and 6 tRNA genes (trnA-UGC, trnG-GCC, trnI-GAU, trnK-UUU, trnL-UAA, and trnV-UAC). Gene structure analysis revealed that 16 genes contained introns, of which 15 (9 protein-coding genes and 6 tRNA genes) had one intron, while only one (ycf3) had two introns. The intron of trnK-UUU was the longest, while the intron of rps12 was the shortest (Table 2 and Table S1).
We assessed the basic characteristics of the chloroplast genomes of Loniceraceae and compared them with those of the L. japonica cv. Damaohua chloroplast genome. The L. japonica cv. Damaohua genome was slightly larger than the L. japonica (155,078 bp), L. insularis (155,124 bp) and L. macranthoides (154,897 bp) genomes but smaller than the L. maackii (155,318 bp) and L. confusa (155,346 bp) genomes (Table 3). Among the Loniceraceae chloroplast genomes, those of L. macranthoides, L. maackii, L. japonica and L. insularis had the most genes (131), followed by L. confusa (129); L. japonica cv. Damaohua had the fewest (127). In addition, the GC contents of the chloroplast genomes of the six Lonicera species were similar, ranging from 38.3% to 38.6%.
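The region lengths and base composition reported above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch in Python; the function names `quadripartite_length` and `gc_content` are illustrative and not part of the original analysis, and the numbers are copied from the text.

```python
# Consistency check on the reported genome statistics (values taken from the text).
LSC, SSC, IR = 88_924, 18_649, 23_789  # lengths in bp; IR is the length of one copy


def quadripartite_length(lsc, ssc, ir):
    """Total length of a genome with one LSC, one SSC and two IR copies."""
    return lsc + ssc + 2 * ir


def gc_content(seq):
    """Percentage of G and C bases in a nucleotide sequence (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


total = quadripartite_length(LSC, SSC, IR)
print(total)  # 155151, matching the reported genome length

# Reported base composition in percent; G + C should reproduce the GC content.
composition = {"A": 30.2, "T": 31.2, "C": 19.6, "G": 19.0}
print(round(composition["G"] + composition["C"], 1))  # 38.6
```

When a sequence string is available, `gc_content` computes the same statistic directly from it.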
Characterization of SSRs and Repeat Sequences. A total of 54 SSRs were detected in the chloroplast genome of L. japonica cv. Damaohua, of which 36 were mononucleotide, 4 were dinucleotide, 2 were trinucleotide, 9 were tetranucleotide and 3 were hexanucleotide repeats; no pentanucleotide repeats were detected (Table 4). In addition, we compared the distribution pattern and number of SSRs in L. japonica cv. Damaohua with those of the other five chloroplast genomes in the Loniceraceae family (Table S2 and Table S3). Among the six Lonicera species, there were more mononucleotide repeats than all other types combined, and most SSRs were mononucleotide or tetranucleotide repeats. The numbers and types of chloroplast SSRs varied among species. L. japonica cv. Damaohua, L. confusa, L. maackii and L. insularis all lacked pentanucleotide repeats, while L. macranthoides lacked hexanucleotide repeats. L. japonica cv. Damaohua and L. confusa had the most SSRs (54 each), while L. japonica had the fewest (47). The chloroplast genomes of L. insularis, L. macranthoides and L. maackii contained 52, 51 and 48 SSRs, respectively. Most of the mononucleotide repeats were A/T-type SSRs.
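To illustrate how SSRs of the kind counted above can be located, the following Python sketch scans a sequence for perfect tandem repeats with a regular expression. It is not the tool used in this study; the minimum repeat counts in `MIN_REPEATS` (10, 5, 4, 3, 3 and 3 for mono- to hexanucleotide motifs) are an assumption, since the thresholds are not given in this section, and `find_ssrs` is a hypothetical helper name.

```python
import re
from collections import Counter

# Assumed minimum repeat counts per motif length (not stated in the text).
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit."""
    n = len(motif)
    return not any(
        motif == motif[:k] * (n // k) for k in range(1, n) if n % k == 0
    )


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) for each perfect SSR found."""
    seq = seq.upper()
    hits = []
    for unit in sorted(min_repeats):
        # One motif copy followed by at least (minimum - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit, min_repeats[unit] - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):  # skip e.g. "AA" or "ATAT"
                hits.append((m.start(), motif, len(m.group(0)) // unit))
    return hits


# Toy sequence with one mono-, one di- and one tetranucleotide SSR.
toy = "GC" + "A" * 12 + "GCGC" + "AT" * 6 + "CCG" + "AGGT" * 3 + "C"
ssrs = find_ssrs(toy)
print(ssrs)
print(Counter(len(motif) for _, motif, _ in ssrs))  # SSR count per motif length
```

Tallying the hits by motif length, as in the last line, gives the kind of per-type counts reported in Table 4.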
A total of 89 large repeats were identified in the chloroplast genome of L. japonica cv. Damaohua using REPuter, including 68 forward repeats and 21 palindromic repeats (Fig. 2). The largest repeat was a forward repeat of 83 bp. The repetitive sequences of the chloroplast genomes of the six Lonicera species were then compared. L. macranthoides had palindromic, forward and reverse repeats, while L. japonica cv. Damaohua, L. confusa, L. japonica, L. maackii and L. insularis had only palindromic and forward repeats. L. japonica cv. Damaohua and L. maackii had the most forward repeats (68), while L. insularis had the fewest (49). L. insularis had the most palindromic repeats (40), while L. japonica cv. Damaohua and L. maackii had the same number of palindromic repeats (21 each) (Table S4).
Codon Usage Analysis. A total of 51,717 codons were analysed in the chloroplast genome of L. japonica cv. Damaohua. Among these codons, the most used was UUU (2132), encoding Phe, and the least used was CGC (254), encoding Arg. There were 31 codons with relative synonymous codon usage (RSCU) values greater than 1 that all ended in A/U, indicating codon usage bias (Fig. 3). The RSCU values in the chloroplast genomes of the six species of Lonicera were slightly different. In all six species, UUU (encoding Phe) was the most frequent codon in the genome. The least used codon was GCG (encoding Ala) in L. japonica and L. insularis and CGC (encoding Arg) in L. japonica cv. Damaohua, L. macranthoides, L. confusa and L. maackii. Except for Met and Trp, all amino acids are encoded by multiple codons. Leu, Arg and Ser have 6 synonymous codons; Ala, Gly, Pro, Thr and Val have 4 synonymous codons; Ile and the stop codons have 3 synonymous codons; and Cys, Asp, Glu, Phe, His, Lys, Asn, Gln and Tyr all have 2 synonymous codons. In the codon usage bias analysis, L. insularis had 35 codons with RSCU values greater than 1, and L. japonica cv. Damaohua, L. confusa, L. japonica and L. maackii had 33 codons with RSCU values greater than 1. L. macranthoides had 32 codons with RSCU values greater than 1, and most of the synonymous codons ended in A and U.
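For reference, the RSCU of a codon is its observed count divided by the count expected if all synonymous codons for that amino acid were used equally, so values above 1 indicate preferential use. The following Python sketch computes RSCU from a coding sequence under the standard genetic code; the function name `rscu` and the toy input are illustrative assumptions, not the software or data used in this study.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" marks stop codons).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Group synonymous codons by the amino acid (or stop signal) they encode.
FAMILIES = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    FAMILIES[aa].append(codon)


def rscu(cds):
    """RSCU per codon: observed count / mean count within its synonymous family."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    values = {}
    for aa, codons in FAMILIES.items():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent from the input
        expected = total / len(codons)
        for c in codons:
            values[c] = counts[c] / expected
    return values


# Toy coding sequence: Phe is encoded three times by TTT and once by TTC.
values = rscu("ATG" + "TTT" * 3 + "TTC")
print(values["TTT"], values["TTC"], values["ATG"])  # 1.5 0.5 1.0
```

Counting the entries of `values` that exceed 1 gives the per-genome totals of preferred codons discussed above.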
Fifty potential RNA editing sites were found in the chloroplast genome of L. japonica cv. Damaohua. The ndhD gene contained the most RNA editing sites (10), followed by ndhB with 8 editing sites; rpoB with 7 editing sites; matK with 4 editing sites; and ndhA, petB, rpl2, and rps2 with 2 editing sites each. The following genes had 1 editing site each (the lowest number): atpA, atpF, atpI, ccsA, clpP, ndhF, ndhG, psaI, psbE, psbF, rpl20, rpoC2, and rps8. Among the 50 potential RNA editing sites, 10 were at the first codon position and 40 were at the second position. No potential RNA editing sites were found at the third position, and the base conversion types were all C-to-T. This pattern is similar to that reported in other land plants. The amino acid conversion from Ser to Leu was the most frequent, while the conversion from Leu to Phe was the least frequent.
The RNA editing sites of the chloroplast genomes of the six species of Lonicera differed. L. insularis had the most RNA editing sites (61), followed by L. maackii (59); L. japonica had the fewest (48). No potential RNA editing sites were detected in the accD gene in L. japonica cv. Damaohua, L. vesicaria or L. macranthoides, or in the clpP gene in L. maackii or L. confusa. No potential RNA editing sites were detected in the rpoA gene in L. japonica cv. Damaohua, L. macranthoides or L. confusa, and a potential RNA editing site in rpoC1 was detected only in L. confusa. There were 10 genes with no potential RNA editing sites detected, including atpB, petD, petG, petL, psaB, psbB, rpl23, ycf14, ycf16 and ycf3. Among the potential RNA editing sites of the six Loniceraceae chloroplast genomes, most were located at the second codon position, and none were found at the third position. The base conversion types were all C-to-T, and Ser-to-Leu was the most frequent amino acid conversion.
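The codon-position pattern described above can be illustrated with the standard genetic code. The Python sketch below applies a C-to-T change at a chosen codon position and reports the amino acid before and after; `edit_effect` is a hypothetical helper and the example codons are chosen for illustration, not taken from the predicted sites. It also shows that, under the standard code, a C-to-T change at the third codon position never alters the amino acid. That this explains the absence of predicted third-position sites is our assumption, not a statement from the analysis.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" marks stop codons).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def edit_effect(codon, position):
    """Amino acids before and after a C-to-T change at a 1-based codon position."""
    codon = codon.upper()
    if position not in (1, 2, 3):
        raise ValueError("position must be 1, 2 or 3")
    if codon[position - 1] != "C":
        raise ValueError("no C at the requested codon position")
    edited = codon[:position - 1] + "T" + codon[position:]
    return CODON_TABLE[codon], CODON_TABLE[edited]


print(edit_effect("TCA", 2))  # ('S', 'L'): Ser to Leu, second position
print(edit_effect("CTT", 1))  # ('L', 'F'): Leu to Phe, first position

# Every codon ending in C encodes the same amino acid as its T-ending partner.
third_position_silent = all(
    CODON_TABLE[xy + "C"] == CODON_TABLE[xy + "T"]
    for xy in ("".join(p) for p in product(BASES, repeat=2))
)
print(third_position_silent)  # True
```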
Comparison of Complete Chloroplast Genomes among Lonicera Species. To characterize the genomic differences, we used the program mVISTA to align the sequences of the six Lonicera species, with the annotation of L. insularis as the reference. The comparison showed that the chloroplast genomes of the six Lonicera species were highly similar, with very few differences (Fig. 4). The IR regions exhibited fewer differences than the LSC and SSC regions, and the coding regions were less divergent than the noncoding regions. The coding genes located in highly conserved regions included matK, rpoC2, rpoB, ycf1 and ycf2. In addition, some highly divergent regions were identified, such as the rpoB-petN, rbcL-accD and psaA-ycf3 intergenic regions.
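The idea behind such a comparison can be sketched as a sliding-window percent identity between a reference and another aligned sequence, in the spirit of the identity plots that mVISTA produces. The Python code below is an illustration only and does not reproduce the mVISTA algorithm; `window_identity`, the window and step sizes and the toy sequences are all assumptions.

```python
def window_identity(ref, other, window=100, step=50):
    """Percent identity per window for two aligned, equal-length sequences.

    Gap characters ("-") never count as matches. Returns (start, identity) pairs.
    """
    if len(ref) != len(other):
        raise ValueError("sequences must be aligned to the same length")
    if not ref:
        raise ValueError("empty alignment")
    ref, other = ref.upper(), other.upper()
    results = []
    last_start = max(len(ref) - window, 0)
    for start in range(0, last_start + 1, step):
        a = ref[start:start + window]
        b = other[start:start + window]
        matches = sum(x == y and x != "-" for x, y in zip(a, b))
        results.append((start, 100.0 * matches / len(a)))
    return results


# Toy alignment with a single mismatch in the second window.
profile = window_identity("ACGTACGTAC", "ACGTACGAAC", window=5, step=5)
print(profile)  # [(0, 100.0), (5, 80.0)]
```

Windows with low identity correspond to the divergent regions highlighted in such plots, while windows near 100% correspond to conserved regions.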
Expansion and Contraction of IR Region. The IR-LSC and IR-SSC boundaries of the chloroplast genome of L. japonica cv. Damaohua were compared with those of five reported Lonicera species (Fig. 5). The chloroplast genomes of the six Lonicera species were relatively conserved, and the boundaries of L. japonica cv. Damaohua and L. japonica were similar. In all six Lonicera species, the LSC/IRa junction was located in the coding region of rpl23. The IRb/SSC junction (JSB) was located between the ycf2 and ndhF genes in 4 species (L. insularis, L. maackii, L. confusa and L. macranthoides), while it was located in the coding region of ndhF in L. japonica cv. Damaohua and L. japonica. The ycf1 gene spanned the IRa/SSC boundary in the six Lonicera species, but the length of ycf1 at the IRa/SSC junction differed among species (L. insularis: 231 bp; L. maackii: 261 bp; L. japonica cv. Damaohua: 220 bp; L. confusa: 195 bp; L. macranthoides: 196 bp; L. japonica: 220 bp).
Adaptive Evolution Analysis. TBtools was used to calculate the synonymous (Ks) and nonsynonymous (Ka) substitution rates and the Ka/Ks ratios in the chloroplast genomes of the six Lonicera species to detect whether the 74 shared protein-coding genes were under selection pressure (Fig. 6). The majority of genes had Ka/Ks ratios < 1, indicating that the chloroplast genes of Loniceraceae species have been subjected to purifying selection during long-term evolution (Table S7). A total of 10 positively selected genes (Ka/Ks > 1) were detected in this study. In pairwise comparisons of L. japonica cv. Damaohua with the other species, infA was positively selected vs. L. insularis; matK vs. L. macranthoides; petB vs. L. confusa; petD vs. L. insularis and L. maackii; rbcL vs. L. macranthoides; rpl16 vs. L. confusa; rpl2 vs. L. confusa; rps3 vs. L. insularis and L. maackii; ycf1 vs. L. japonica; and ycf2 vs. L. japonica, L. macranthoides and L. maackii. These results suggest that these genes may have undergone positive selection during evolution.
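The interpretation of Ka/Ks used above follows a simple rule: a ratio below 1 suggests purifying selection, a ratio above 1 suggests positive selection, and a ratio equal to 1 suggests neutral evolution. A minimal Python sketch of this rule is given below; the function name and the numerical values are hypothetical and are not results from Table S7.

```python
def classify_selection(ka, ks):
    """Label a gene pair by its Ka/Ks ratio; Ks = 0 leaves the ratio undefined."""
    if ka < 0 or ks < 0:
        raise ValueError("substitution rates must be non-negative")
    if ks == 0:
        return "undefined (Ks = 0)"
    ratio = ka / ks
    if ratio > 1:
        return "positive selection"
    if ratio < 1:
        return "purifying selection"
    return "neutral"


# Hypothetical (Ka, Ks) pairs for illustration only.
examples = {
    "geneA": (0.002, 0.010),
    "geneB": (0.012, 0.004),
    "geneC": (0.010, 0.000),
}
for gene, (ka, ks) in examples.items():
    print(gene, classify_selection(ka, ks))
```

Identical or nearly identical gene pairs often have Ks = 0, which is why the sketch reports the ratio as undefined in that case rather than dividing by zero.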
Phylogenetic Analysis. To clarify the phylogenetic positions and evolutionary relationships in Caprifoliaceae, the whole chloroplast genome sequences of 14 species of Caprifoliaceae were selected, and Chrysanthemum boreale was used as the outgroup to construct a maximum likelihood (ML) phylogenetic tree (Fig. 7). The chloroplast genome sequences of the 15 plant species were divided into 2 groups. The outgroup C. boreale was a separate group. Triosteum pinnatifidum and 13 Lonicera species were grouped together. In Caprifoliaceae, Triosteum and Lonicera were divided into 2 subgroups. In Lonicera, the chloroplast genome sequence of L. japonica cv. Damaohua had the closest relationship with that of L. japonica, followed by those of L. confusa and L. macranthoides. The three phylogenetic trees of these 14 chloroplast genome sequences were consistent with traditional taxonomy, indicating that the chloroplast genome can be used to effectively analyse the phylogenetic positions and relationships of species.