Genome size varies greatly among angiosperms, from 130 Mb (Arabidopsis thaliana) to 127 Gb (Fritillaria assyriaca) (Bennett and Smith 1976; Bennett et al. 1982). This variation is related to evolutionary processes such as polyploidy and chromosome rearrangement (Soltis et al. 2003).
In this study, the estimated genome size of L. ruthenicum was 3,297.65 Mb, which is close to the genome sizes of pepper (~3.48 Gb) (Cheng et al. 2014; Seungill et al. 2014) and Nicotiana (~3.1 Gb) (Bombarely et al. 2012; Nicolas et al. 2013; Sierro et al. 2014) within the Solanaceae family. However, it is roughly three to four times the genome sizes of tomato (~0.95 Gb) (Anthony et al. 2014; The Tomato Genome Consortium 2012), potato (~0.84 Gb) (Potato Genome Sequencing Consortium 2011), and eggplant (~0.83 Gb) (Hideki et al. 2014).
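The size comparison above can be checked with simple arithmetic. The following is a minimal sketch that uses only the approximate genome sizes quoted in this paragraph; the function and variable names are illustrative and are not part of the analysis pipeline used in this study.

```python
# Approximate genome sizes in Gb, as quoted in the paragraph above.
L_RUTHENICUM_GB = 3297.65 / 1000  # 3,297.65 Mb estimated in this study

solanaceae_gb = {
    "pepper": 3.48,
    "Nicotiana": 3.1,
    "tomato": 0.95,
    "potato": 0.84,
    "eggplant": 0.83,
}


def size_ratios(reference_gb, others_gb):
    """Return the reference genome size divided by each other genome size."""
    return {name: reference_gb / size for name, size in others_gb.items()}


ratios = size_ratios(L_RUTHENICUM_GB, solanaceae_gb)
for name, ratio in ratios.items():
    print(f"L. ruthenicum / {name}: {ratio:.2f}x")
```

With these values, the ratios for pepper and Nicotiana fall close to 1, while those for tomato, potato, and eggplant fall between roughly 3.5 and 4.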
The chromosome number of L. ruthenicum is 2n = 24 (Chen et al. 2008), the same as in most Solanaceae species, such as cultivated tomato (2n = 2x = 24, The Tomato Genome Consortium 2012), pepper (2n = 2x = 24, Cheng et al. 2014; Seungill et al. 2014), and eggplant (Solanum melongena L.) (2n = 2x = 24, Hideki et al. 2014). Potato (2n = 4x = 48, Potato Genome Sequencing Consortium 2011) and tobacco (2n = 4x = 48, Sierro et al. 2014) are tetraploid with the same basic chromosome number, whereas Nicotiana benthamiana (x = 19, Bombarely et al. 2012) and Petunia (x = 7, Conia et al. 2010) differ. As noted above, most Solanaceae species are diploid and share the same basic chromosome number (x = 12), yet, according to Wu and Tanksley (2011), genome sizes vary greatly within the family. This may indicate that large-scale genome duplication and chromosome diploidization events did not occur in Solanaceae plants during their long evolutionary history. We therefore speculate that L. ruthenicum shares the same chromosome evolution events as most Solanaceae species.
High heterozygosity has been associated with fitness and ecological success (Vrijenhoek 1994) and is related to the morphological and adaptive differentiation of species. The heterozygosity rate of L. ruthenicum was 1.13%, suggesting considerable variation within the L. ruthenicum genome. We speculate that this high heterozygosity results from a long process of evolution and adaptation. Given this level of heterozygosity, the genome is not well suited to assembly based on second-generation sequencing data, and third-generation sequencing technology with longer read lengths is recommended.
The proportion of repetitive sequences in the genome generally increases from bacteria to eukaryotes. The repeat content reported for some model organisms is as follows: bacteria, less than 1%; brewer's yeast, 3.4%; Arabidopsis, 13–14%; Caenorhabditis elegans, 16.5%; Drosophila melanogaster, 33.7%; mouse, 38%; human, 50%; and corn, 77% (Ai 2008).
The repeat content of L. ruthenicum was estimated at 73.13%, which is higher than that of potato (62.2%) (Potato Genome Sequencing Consortium 2011) but lower than that of pepper (81%) (Varshney et al. 2012). According to Uozu (1997), the amount of repetitive sequence contributes to nuclear DNA content. Therefore, differing proportions of repetitive elements likely contribute to the variation in genome size within the Solanaceae family.
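To make the link between repeat content and genome size concrete, the sketch below converts the quoted repeat percentages into absolute amounts of sequence. It assumes that the repeat fractions given here can be applied directly to the approximate genome sizes quoted earlier in this discussion; in practice those figures come from different studies and assemblies, so the results are only indicative. All names in the code are illustrative.

```python
# (genome size in Mb, repeat fraction), both as quoted in this discussion.
# Assumption: each repeat fraction applies to the whole quoted genome size.
genomes = {
    "L. ruthenicum": (3297.65, 0.7313),
    "potato": (840.0, 0.622),
    "pepper": (3480.0, 0.81),
}


def partition_genome(size_mb, repeat_fraction):
    """Split a genome size into (repetitive Mb, non-repetitive Mb)."""
    repeat_mb = size_mb * repeat_fraction
    return repeat_mb, size_mb - repeat_mb


partitions = {name: partition_genome(*values) for name, values in genomes.items()}
for name, (repeat_mb, other_mb) in partitions.items():
    print(f"{name}: {repeat_mb:.1f} Mb repetitive, {other_mb:.1f} Mb non-repetitive")
```

Under these assumptions, the repetitive fraction of L. ruthenicum alone (about 2.4 Gb) is larger than the entire potato genome (about 0.84 Gb), which is in line with repeats accounting for much of the size difference.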
Repetitive sequences play an important role in evolution: they expand and enrich genetic information (Eichler and Sankoff 2003) and protect coding sequences (Cangiano and Volpe 1993). They also physically shape chromosome structure, influence transcriptional regulation, and play an important role in genome differentiation during speciation (Tang 2011). We therefore believe that the high proportion of repetitive sequences is the result of the long evolutionary history of L. ruthenicum and contributes to its relatively large genome size.
In L. barbarum, trinucleotide repeat units accounted for the largest proportion of SSRs, whereas in L. ruthenicum dinucleotide repeat units accounted for the largest proportion. The dominant repeat motifs also differed: GTT/CAA, ACA, and ATC were dominant in the L. barbarum genome, while AT/AT, AC/GT, and AG/CT were dominant in L. ruthenicum (Dang et al. 2016). Mutation of dinucleotide repeats can cause genetic instability and thus generate genetic diversity (Hammock 2005; Oki et al. 1999). We therefore speculate that the genome of L. ruthenicum is less stable than that of L. barbarum. The SSR characteristics of the genome are consistent with those of the L. ruthenicum transcriptome, in which dinucleotide repeats account for the largest proportion and AG/CT, AT/AT, and AC/GT are the dominant repeat motifs (Hao et al. 2019).
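Motif labels such as AG/CT pair a repeat unit with its reverse complement, because the same SSR reads as AG on one strand and CT on the other. The sketch below illustrates one common way to tally perfect SSRs under such labels. It is not the method used in this study or in the cited works: the minimum of six repeats, the function names, and the toy sequence are all assumptions chosen for illustration.

```python
import re
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(motif):
    """Return the reverse complement of a DNA motif."""
    return motif.translate(_COMPLEMENT)[::-1]


def canonical_motif(motif):
    """Smallest rotation of the motif or of its reverse complement."""
    candidates = []
    for seq in (motif, reverse_complement(motif)):
        candidates.extend(seq[i:] + seq[:i] for i in range(len(seq)))
    return min(candidates)


def is_primitive(unit):
    """False if the unit is itself a repeat of a shorter unit (e.g. AA, ATAT)."""
    n = len(unit)
    return not any(n % d == 0 and unit == unit[:d] * (n // d) for d in range(1, n))


def count_ssr_motifs(sequence, unit_len=2, min_repeats=6):
    """Count perfect SSRs of one unit length, grouped by canonical motif label."""
    # One unit followed by at least (min_repeats - 1) further copies of it.
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
    counts = Counter()
    for match in pattern.finditer(sequence.upper()):
        unit = match.group(1)
        if not is_primitive(unit):
            continue  # skip homopolymer runs and other shorter-period repeats
        canon = canonical_motif(unit)
        counts[f"{canon}/{reverse_complement(canon)}"] += 1
    return counts


# Toy sequence (hypothetical): AG x6, TC x7, AT x6, and AC x3 (too short to count).
toy = "TT" + "AG" * 6 + "CC" + "TC" * 7 + "GG" + "AT" * 6 + "CG" + "AC" * 3
print(count_ssr_motifs(toy))
```

In the toy sequence, the AG and TC runs are counted under the same AG/CT label because they are the same repeat read from opposite strands, which is why motif frequencies are usually reported in this paired form.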