Miscanthus, a high-biomass-yielding perennial C4 grass, has emerged as a promising candidate second-generation energy crop [1–4]. Miscanthus is a heterogeneous gramineous plant that hybridizes interspecifically, generating a complex genetic background. Despite increasing interest, molecular research into Miscanthus has been scant because of its large genome (approximately 2.65 Gb) [5], complex genetic background, and limited available sequence data. Because inter- and intraspecific hybridization is common in Miscanthus, this genus is characterized by rich genetic diversity, and hybrid offspring often have high biomass [3]. However, this high degree of genetic diversity also increases the complexity of interspecific relationships in Miscanthus and, consequently, the difficulty of analyzing genetic evolution in this genus. It is therefore difficult to mine functional genes in Miscanthus, which seriously limits the utility of Miscanthus species for energy production and conversion [6]. Molecular markers would be useful for further investigations of Miscanthus; such markers have been widely used in studies of genetics, molecular population genetics, species formation, evolutionary and phylogenetic relationships, and molecular taxonomy [7].
First-generation molecular markers include restriction fragment length polymorphisms (RFLPs) [8, 9], random amplified polymorphic DNA (RAPD) [10, 11], and amplified fragment length polymorphisms (AFLPs) [12], while second-generation molecular markers include simple sequence repeats (SSRs) [13] and inter-simple sequence repeats (ISSRs) [14]. However, these markers have several limitations: they are low-throughput, inaccurate, time-consuming, labor-intensive, and costly [3]. These drawbacks have motivated the development of third-generation molecular markers, namely single nucleotide polymorphisms (SNPs). SNPs are polymorphic and are generally widely distributed throughout the genome [15]. SNP markers are amenable to large-scale automated monitoring and have been instrumental in various techniques associated with crop breeding, such as the construction of genetic maps, the DNA fingerprinting of germplasm resources, the detection of molecular biodiversity, and the analysis of linkage disequilibrium [16]. The continuous development of molecular marker technology has accelerated functional gene identification and characterization in other crops and has led to the development of varieties with improved functional traits [6]. Thus, these techniques might be useful for molecular genetic research in Miscanthus.
Although genotyping-by-sequencing (GBS) and restriction site-associated DNA sequencing (RAD-seq) have been used extensively in Miscanthus, the application of these techniques still poses difficulties, especially in non-model taxa like Miscanthus [17]. One obstacle to the widespread use of GBS is the difficulty of the associated bioinformatics analysis, which is typically hampered by a large number of erroneous SNP inferences that are not easy to diagnose or correct [18].
To overcome these challenges, we aimed to develop and identify SNP markers for Miscanthus using specific-locus amplified fragment sequencing (SLAF-seq). That is, we aimed to reduce genomic complexity using specific enzymatic digestion, develop markers via the high-throughput sequencing of representative libraries, and determine phylogenetic relationships using the resulting genotypes.
SLAF-seq uses bioinformatics methods to systematically analyze known genome sequences, genome sequences of related species, bacterial artificial chromosome (BAC) sequences, or fosmid sequences [19–23]. SLAF-seq differs from GBS and RAD-seq in several ways. First, it yields many more tags, identifying approximately one tag every 10 kb; second, SLAF tags are uniformly distributed, so important chromosome segments are not missed; and third, SLAF-seq avoids repetitive sequences, thus improving sequencing cost-effectiveness. In summary, SLAF-seq uses deep sequencing to ensure genotyping accuracy, a reduced-representation strategy to reduce sequencing costs, a pre-designed representation scheme to optimize marker efficiency, and a double-barcode system for large populations [24].
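The reduced-representation idea described above can be illustrated with a minimal in silico digestion sketch. This is not the pipeline used in this study. The enzyme (HaeIII, which cuts GG^CC), the toy sequence, the fragment-size window, and all function names are assumptions chosen only for illustration. The last function simply works through the arithmetic implied by a density of roughly one tag per 10 kb in a genome of approximately 2.65 Gb.

```python
import re


def digest(sequence, site="GGCC", cut_offset=2):
    """Cut `sequence` at every occurrence of `site` and return the fragments.

    `cut_offset` is the cut position inside the recognition site
    (2 gives a blunt GG^CC cut). The enzyme is an illustrative assumption.
    """
    # A lookahead pattern reports every occurrence, including overlapping ones.
    pattern = "(?=" + re.escape(site) + ")"
    cuts = [m.start() + cut_offset for m in re.finditer(pattern, sequence)]
    bounds = [0] + cuts + [len(sequence)]
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:])]


def select_tags(fragments, min_len, max_len):
    """Keep only fragments whose length falls in the chosen size window."""
    return [f for f in fragments if min_len <= len(f) <= max_len]


def expected_tag_count(genome_size_bp, spacing_bp):
    """Rough number of tags if one tag is found every `spacing_bp` bases."""
    return genome_size_bp // spacing_bp


# Toy example: a 15-bp sequence with two GGCC sites.
toy = "AAGGCCTTTTGGCCA"
fragments = digest(toy)
tags = select_tags(fragments, min_len=5, max_len=10)  # hypothetical window

print(fragments)  # ['AAGG', 'CCTTTTGG', 'CCA']
print(tags)       # ['CCTTTTGG']
print(expected_tag_count(2_650_000_000, 10_000))  # 265000
```

Under these assumptions, a tag density of one per 10 kb would correspond to roughly 265,000 tags across a 2.65 Gb genome. In practice, the enzyme combination and size window would be chosen from a pilot analysis of a reference or related genome, not fixed in advance as they are here.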
SLAF-seq has been widely used for the development of specific molecular markers and genetic maps [25]. For example, Sun et al. used 50,530 SLAFs with 13,291 SNPs to genotype an F1 population of the common carp [24]. Because it identifies SNP markers efficiently, SLAF-seq has been used in a wide variety of crops [26–30]. SLAF-seq was also used to develop the first high-density genetic maps for several economically important species, including sesame [31], cucumber [23], the brown alga Undaria pinnatifida (Phaeophyceae) [32], wax gourd [33], watermelon (Citrullus lanatus L.) [34], and Salvia miltiorrhiza [25]. In addition, an increasing number of studies of the Gramineae have been performed using SLAF-seq [21, 22, 35, 36]. For example, SLAF-seq was used to develop the first 7E-chromosome-specific molecular markers for Thinopyrum elongatum [35], while 5,142 polymorphic SLAFs were analyzed to identify a new maize inflorescence meristem mutant [36]. Zhang et al. used 69,325 high-quality SLAFs, of which 26,248 were polymorphic, to develop sufficient markers for a segregating Agropyron F1 population [28]. Furthermore, a high-density genetic linkage map for orchard grass was developed using 2,467 SLAF markers and 43 SSR markers [37], and the semi-dwarf gene in barley was fine-mapped using molecular markers developed with SLAF-seq [21]. The successful application of SLAF-seq in other species provides a useful reference for the development of SNPs in this study.
However, SLAF-seq has yet to be used to develop SNP markers in Miscanthus. Therefore, we aimed to use SLAF-seq to determine the genetic relationships among Miscanthus species and to develop SNP markers. The results of this study provide useful data for molecular marker-assisted Miscanthus breeding programs.