Since Linnaeus proposed binomial nomenclature and laid the foundation for modern taxonomy, organisms have been classified primarily by phenotypic characteristics, such as the reproductive organs of plants. This approach has long facilitated the recognition and differentiation of organisms and remains in use today (M. Liao et al., 2022). With the development of sequencing technologies, constructing phylogenetic trees from gene sequences has become common, providing a more powerful tool for species classification and evolutionary research than phenotypic characteristics alone (Guo et al., 2023). Transcriptome data generated by second-generation sequencing are of high quality and support analyses of gene sequence information, including the construction of evolutionary trees and SNP calling (Jehl et al., 2021; Smith et al., 2011). With an existing reference genome and transcriptome data, SNP calling can yield a large number of SNPs and INDELs that characterize genomic features (F. Liu et al., 2019). These abundant molecular markers provide direct evidence for comparing the genomes of different species (Cokus et al., 2015).
In the early years of the modern evolutionary synthesis (1924–1950), interspecific hybridization was generally considered rare and of little evolutionary significance, a view shaped by researchers who focused on animal systems with strong interspecific reproductive barriers (Yakimowski & Rieseberg, 2014). However, botanists have described natural hybrids ever since Linnaeus, and floristic surveys indicate that ~10% of plant species hybridize (Yakimowski & Rieseberg, 2014). Hybridization can give rise to new species, or hybrids may remain restricted to the F1 generation. The origin of homoploid hybrid species is considered relatively rare compared with that of polyploid hybrid species (Olave et al., 2022). Homoploid hybridization may likewise produce new species, or the hybrids may remain at the F1 generation because of reproductive isolation and poorly adapted offspring. In recent years, the origin of diploid hybrids has garnered increasing attention. B. Liu et al. (2014) investigated the possible origin of diploid hybrids using variation in nuclear and chloroplast DNA sequences, combined with approximate Bayesian computation (ABC) and ecological niche modeling. The need to identify hybrid offspring arises mainly in species conservation, germplasm management, and crop breeding (Cao, 2016). In the past, hybrids were identified mainly by phenotypic traits, being recognized when they exhibited characteristics of both parents. However, this method has limitations, such as the small number of available phenotypic characters and the inability to distinguish F1 hybrids from backcross individuals. The later development of isozyme markers provided a new approach to hybrid identification: by determining whether two different allozymes are produced at a specific gene locus, it is possible to determine whether that locus is heterozygous (Harris, 1997).
When identifying F1 hybrids, if loci with fixed differences between the parental species exist, F1 hybrids will be heterozygous at all such loci (Chakraborty & Rannala, 2023).
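The fixed-difference criterion can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the pipeline used in this study: genotypes are assumed to be already called and stored as allele pairs per locus, and the names `fixed_difference_loci`, `heterozygous_count`, and the example locus identifiers are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Genotype = Tuple[str, str]      # two alleles at a diploid locus, e.g. ("A", "G")
Sample = Dict[str, Genotype]    # locus id -> genotype; uncalled loci are absent


def fixed_allele(samples: List[Sample], locus: str) -> Optional[str]:
    """Return the single allele carried by every sample at `locus`, else None."""
    alleles = set()
    for sample in samples:
        if locus not in sample:
            return None         # require a call in every parental sample
        alleles.update(sample[locus])
    return alleles.pop() if len(alleles) == 1 else None


def fixed_difference_loci(parent_a: List[Sample],
                          parent_b: List[Sample]) -> Dict[str, Tuple[str, str]]:
    """Loci where each parental group is fixed for a different allele."""
    loci = set().union(*(s.keys() for s in parent_a + parent_b))
    fixed = {}
    for locus in sorted(loci):
        a = fixed_allele(parent_a, locus)
        b = fixed_allele(parent_b, locus)
        if a is not None and b is not None and a != b:
            fixed[locus] = (a, b)
    return fixed


def heterozygous_count(candidate: Sample,
                       fixed: Dict[str, Tuple[str, str]]) -> Tuple[int, int]:
    """Return (heterozygous loci, called loci) among the fixed-difference loci."""
    called = [locus for locus in fixed if locus in candidate]
    het = sum(1 for locus in called
              if set(candidate[locus]) == set(fixed[locus]))
    return het, len(called)


# Toy data (hypothetical): snp3 is polymorphic within parent A, so it is excluded.
parent_a = [{"snp1": ("A", "A"), "snp2": ("C", "C"), "snp3": ("G", "G")},
            {"snp1": ("A", "A"), "snp2": ("C", "C"), "snp3": ("G", "T")}]
parent_b = [{"snp1": ("G", "G"), "snp2": ("T", "T"), "snp3": ("T", "T")}]
fixed = fixed_difference_loci(parent_a, parent_b)

f1_like = {"snp1": ("A", "G"), "snp2": ("T", "C"), "snp3": ("G", "T")}
backcross_like = {"snp1": ("A", "A"), "snp2": ("C", "T")}
print(heterozygous_count(f1_like, fixed))
print(heterozygous_count(backcross_like, fixed))
```

An F1 individual is expected to be heterozygous at every called fixed-difference locus, whereas under Mendelian segregation a first-generation backcross is expected to be heterozygous at only about half of them. With real transcriptome-derived genotypes, some truly heterozygous sites may be called as homozygous, so the observed proportion can fall below the expectation.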
So far, transcriptome data have been used primarily for differential gene expression analysis, and in a few cases for SNP calling and other analyses. To our knowledge, however, transcriptome data have not previously been used to identify F1 hybrids. Here, we use transcriptome data to identify F1 hybrids and report the challenges that can arise when calling SNPs at heterozygous loci in transcriptome data.
Sea buckthorn is a deciduous shrub or tree belonging to the genus Hippophae of the family Elaeagnaceae. It bears nutritious fruits and is also a pioneer species for soil improvement, wind and sand control, and soil and water conservation, which makes it a plant of significant ecological and economic value (Z. Wang et al., 2022). The Tibetan Plateau and surrounding regions, such as the Himalayas and the Hengduan Mountains, are considered the original habitat of the genus. It is generally believed that, after its origin, sea buckthorn migrated and evolved in two directions: one towards the Loess Plateau and North China, and the other through Central Asia to Europe. During this migration, different species and subspecies emerged in response to different landforms and climates (Hu, 2021). Under the classification system of Lian et al., the genus comprises 6 species and 17 subspecies (Table 1). Of these, 6 species and 13 subspecies occur on the Tibetan Plateau and in surrounding regions of China, including Xinjiang, Gansu, Sichuan, and Yunnan. The remaining 4 subspecies are distributed in Europe: H. rhamnoides subsp. rhamnoides, H. rhamnoides subsp. fluviatilis, H. rhamnoides subsp. carpatica, and H. rhamnoides subsp. caucasia, the last of which is found in the Eurasian border region.
H. goniocarpa, found in Rixu Village, Qinghai Province, China, has been shown to be a hybrid offspring of H. rhamnoides subsp. sinensis as the maternal parent and H. neurocarpa as the paternal parent, and is a suspected case of homoploid hybrid origin (A. Wang et al., 2008). In addition, a hybrid descendant that appears to be a cross between H. neurocarpa and H. tibetana has been found in the Tibet region of China.
Here, we identified F1 hybrids using transcriptome data and report, for the first time, the limitations of heterozygous sites in SNP calling from transcriptome data. We then used sea buckthorn transcriptome data to construct a relatively reliable phylogenomic tree describing the evolutionary relationships of the six known sea buckthorn species. We further compared the genomes of seven sea buckthorn taxa (H. rhamnoides subsp. sinensis, H. rhamnoides subsp. mongolica, H. rhamnoides subsp. yunnanensis, H. tibetana, H. salicifolia, H. gyantsensis, and H. neurocarpa) using SNPs and INDELs.
Table 1
Systematic classification of the sea buckthorn genus (Hippophae)
Sect. 1. Hippophae | Sect. 2. Gyantsenses Lian |
H. rhamnoides Linn. | H. goniocarpa Lian, X. L. Chen et K. Sun |
ssp. sinensis Rousi | ssp. litangensis Lian et X. L. Chen |
ssp. wolongensis Y. S. Lian, K. Sun & X. L. Chen | ssp. goniocarpa |
ssp. yunnanensis Rousi | H. gyantsensis (Rousi) Lian |
ssp. turkestanica Rousi | ssp. linearifolia |
ssp. mongolica Rousi | ssp. gyantsensis |
ssp. caucasia Rousi | H. neurocarpa S. W. Liu et T. N. He |
ssp. carpatica Rousi | ssp. neurocarpa |
ssp. rhamnoides | ssp. stellatopilosa Lian et X. L. Chen |
ssp. fluviatilis Van Soest | H. tibetana Schlecht |
H. salicifolia D. Don | ssp. yadongensis |
| ssp. tibetana |