Potato is an important staple food, ranking fourth in production after wheat, rice, and maize (Chung et al. 2006). However, potatoes yield more calories per unit area than these three cereals (Abbas et al. 2013). Currently, potatoes are classified into 107 wild and 4 cultivated species. The popular “common potato” belongs to the cultivated species S. tuberosum (Achakkagari et al. 2020).
Potatoes have long been important in Pakistan. They were grown in the subcontinent before the independence of Pakistan, and at independence the cultivated area that fell within Pakistan was 3000 hectares. Over time, potato became the fastest-growing industry in the country, and the cultivated area increased to 195,654 hectares (Ahmed 2016). Potatoes in Pakistan are grown under different agro-climatic conditions as three crops: summer, spring, and autumn. Most of the crop is produced in Punjab. However, potato production in Pakistan is lower than in other countries because of many biotic and abiotic factors (Abbas et al. 2013). Plants and tubers of different cultivars are diverse in their morphology and commercial importance (Reddy et al. 2018). These differences make it necessary to identify potato cultivars that have better characteristics and greater market demand. Reliable identification and assessment of cultivars are important for genetic improvement programs, breeding, and germplasm conservation. Genetic diversity can be explored by evaluating morphological parameters and molecular markers. Indeed, both have been used to identify cultivars with better characteristics, resistance to various biotic and abiotic factors, and high market demand. Genetic evaluation of cultivars with molecular markers is considered a tool for variety identification (Machida-Hirano 2015). Evolutionary relationships and diversity between cultivated and wild potatoes have been analyzed using these tools (Hardigan et al. 2017). The molecular markers employed for the identification of potato cultivars include RAPD, AFLP, SSR, ISSR, and SNPs (Das et al. 2010; Esfahani et al. 2009; Rahman et al. 2022; Priyadarshini et al. 2020; Gazendam et al. 2022).
However, only RAPD and SSR have been employed in Pakistan for this purpose (Abbas et al. 2008; Rahman et al. 2022). Both markers have limitations: RAPD results are poorly reproducible, and SSR is limited in throughput and cost and is complicated by the scoring of multiple alleles or stutter bands. Hence, in this study, fragments of three genes were sequenced to assess the genetic diversity of cultivars grown in Pakistan, alongside an analysis of morphological characteristics.
For the morphological analysis, seventeen morphological characters were recorded for six cultivars. All the cultivars showed variation in their morphological characteristics. A dendrogram generated from these characteristics revealed two groups: one contained the local cultivars Ruby and Cosmo, whereas the other contained the remaining cultivars. The placement of Sante, Hermes, and Desiree in the same group is consistent with their shared ancestry.
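The grouping step behind such a dendrogram can be sketched as follows. This is only an illustration under stated assumptions: the clustering method and distance measure used for the dendrogram are not specified here, so average linkage (UPGMA) on a simple mismatch distance is assumed, and the cultivar labels and character scores are hypothetical placeholders, not the recorded data.

```python
from itertools import combinations

# Hypothetical character-state scores (NOT data from this study).
# Each placeholder cultivar is scored for the same ordered list of characters.
scores = {
    "cv_A": [1, 2, 1, 3, 1],
    "cv_B": [1, 2, 1, 3, 2],
    "cv_C": [2, 1, 3, 1, 1],
    "cv_D": [2, 1, 3, 2, 1],
}


def mismatch_distance(a, b):
    """Fraction of characters whose states differ between two cultivars."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def upgma(scores):
    """Average-linkage clustering (assumed method).

    Returns the nested-tuple tree and the merge history as
    (members_left, members_right, height) tuples.
    """
    clusters = [(name, [name]) for name in scores]  # (subtree, member names)
    merges = []
    while len(clusters) > 1:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            left, right = clusters[i][1], clusters[j][1]
            # Cluster distance = mean of all between-cluster pairwise distances.
            d = sum(
                mismatch_distance(scores[a], scores[b])
                for a in left
                for b in right
            ) / (len(left) * len(right))
            if best is None or d < best[0]:
                best = (d, i, j)
        d, i, j = best
        merges.append((sorted(clusters[i][1]), sorted(clusters[j][1]), d))
        merged = ((clusters[i][0], clusters[j][0]), clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0], merges


tree, merges = upgma(scores)
for left, right, height in merges:
    print(f"join {left} + {right} at distance {height:.2f}")
```

With real data, the seventeen recorded characters would replace the placeholder scores, and the two top-level branches of the tree would correspond to the two groups described above.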
The molecular analysis involved sequencing fragments of three genes: MDH, CHY2, and COX2. A fragment of MDH was sequenced and analyzed. The cultivars exhibited little variation in their MDH sequences. However, three novel SNPs were found in the intronic region of the gene in addition to the previously reported one. All cultivars were polymorphic for the reported SNP. In addition to these SNPs, an insertion of 13 nucleotides was found in Desiree, Sante, and Hermes, whereas it was absent in the local cultivars Ruby and PRI-Red. Interestingly, the sequencing data for cultivar Cosmo were uninterpretable where the insertion was present, suggesting that it might have more than two alleles. Similarly, the CHY2 data were uninterpretable because of multiple peaks, and only approximately 120 nt of the reverse-strand read were readable. This mismatch between the experimental results and the predictions made while designing the experiment may have arisen from the heterozygous tetraploid genome of cultivated potato, which has huge genetic diversity. First, sequence variation in the primer binding sites may have reduced the efficiency of primer binding. The recently released genomes of autotetraploid potato cultivars also show great genetic diversity, suggesting that, among all the crop species analyzed, potato has the most complex genome, with a high degree of heterogeneity that hinders the generation of high-quality assemblies (Hoopes et al. 2022). SNPs, indels, and copy number variations (CNVs) are dispersed throughout the potato genome. These sequence variations are denser in non-coding regions (Uitdewilligen et al. 2013), and CNVs are the main reason for the huge genetic diversity in potatoes. In fact, vegetatively propagated potatoes have higher mutation rates than many sexually reproducing species (Hardigan et al. 2016).
Additionally, tetraploidy may increase heterozygosity, because a gene can be represented by four alleles at a locus, in contrast to a diploid, where a gene is represented by two alleles per locus. A study carried out to identify potato cultivars grown in South Africa also reported the presence of two or more different alleles for SNPs (Gazendam et al. 2022). Similarly, a comparison with wild relatives and landraces showed the huge allelic diversity of cultivated tetraploid potatoes (Hardigan et al. 2017).
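The effect of ploidy on the number of possible genotypes at a single SNP can be illustrated with a short enumeration. This is a minimal sketch: the two allele symbols and the function name are arbitrary choices for illustration and do not refer to any SNP reported here.

```python
from itertools import combinations_with_replacement


def genotype_classes(alleles, ploidy):
    """List every unordered genotype that `ploidy` gene copies can form."""
    return ["".join(g) for g in combinations_with_replacement(alleles, ploidy)]


# A biallelic SNP with illustrative alleles A and G.
diploid = genotype_classes("AG", 2)
tetraploid = genotype_classes("AG", 4)

print(len(diploid), diploid)        # 3 classes, one of them heterozygous
print(len(tetraploid), tetraploid)  # 5 classes, three of them heterozygous
```

A diploid therefore has a single heterozygous class at a biallelic SNP, whereas a tetraploid has three (one, two, or three copies of the alternative allele), and a tetraploid locus can carry up to four distinct alleles at once.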
Furthermore, when BLASTn searches were performed for CHY2, the sequence aligned at two locations in the whole-genome sequences of different potato cultivars, indicating a duplication, which is common in potato. In fact, thousands of duplications and insertions have been identified. Nearly half of the genes specific to the potato lineage are affected by duplication or deletion (Hardigan et al. 2017). Additionally, a specific gene may be duplicated in an individual cultivar, making that cultivar polymorphic for the gene (Uitdewilligen et al. 2013). All these factors contribute to the presence of more than two peaks in a sequence read, making the data difficult to interpret. This is supported by our results: sequencing of the forward strand, for which the primer was designed to bind the non-coding region, generated multiple peaks in all cultivars, suggesting the presence of variation. Sequencing of the reverse strand, whose primer was designed to bind the exon, generated a readable sequence of 150–250 nt, after which the data were again uninterpretable, probably because of variation in the intronic region.
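Why an insertion carried by only some alleles makes a directly sequenced read unreadable downstream can be sketched with a toy model. The sequences, the insertion site, and the function name below are hypothetical and chosen only for illustration; only the 13-nt insertion length follows the text.

```python
def superimpose(alleles):
    """Set of bases seen at each position when reads from all alleles overlap."""
    length = min(len(a) for a in alleles)
    return [{a[i] for a in alleles} for i in range(length)]


# Hypothetical sequences (NOT from this study).
reference = "ACGTACGTTGCAAGCT"
insertion = "TTCAGGACCTAGA"  # 13 nt, arbitrary bases
site = 6                     # arbitrary insertion position

with_insertion = reference[:site] + insertion + reference[site:]

# Direct sequencing of a mixed PCR product reads both alleles at once.
trace = superimpose([reference, with_insertion])
mixed = [i for i, bases in enumerate(trace) if len(bases) > 1]

print("clean positions before the insertion:", site)
print("first mixed position:", mixed[0])
```

Upstream of the insertion the two alleles are in register and give single peaks. From the insertion site onward they are shifted relative to each other, so most positions show overlapping peaks. Additional alleles or a duplicated gene copy would add further bases at each position.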
The phylogenetic tree generated for the MDH gene revealed two groups. Group 1 contained Ruby and PRI-Red together with P8 (India), MSH/14–122 (India), and DM1-3 516 R44, while group 2 contained Cosmo, Desiree, Hermes, Sante, and Solyntus (Dutch), placed in further subgroups. This is in accordance with the dendrogram based on morphological data, which also placed Ruby and PRI-Red in a separate group and the rest of the cultivars in a second group.
The sequence of COX2 was also analyzed and showed near-complete identity among all Pakistani cultivars. When it was aligned with the complete mitogenome sequence of Desiree, the gene was found to have a duplicated copy, which was identical to the wild-type gene. This reflects the complexity and uniqueness of angiosperm mitogenomes, which exhibit a high degree of structural rearrangement, size variation, duplication, gene loss, and incorporation of foreign sequences through horizontal gene transfer (Bock 2010; Kubo and Newton 2008). These rearrangements, involving recombination, indels, DNA transfer between organelles, and sequence transpositions, together with paternal leakage, lead to heteroplasmy, i.e., the presence of substoichiometric DNA molecules in addition to the main genome (Woloszynska 2010). The mitogenome of potato is arranged in several linear and circular molecules of different sizes instead of a single master circle. A study that sequenced the mitogenome revealed five different circular molecules (Cho et al. 2017), three of which shared sequences with each other (Cho et al. 2022). Another mitogenome, reported for the cultivars Desiree and Cicero, was arranged in two circular molecules and one linear molecule (Varré et al. 2019). Similarly, three molecules of different sizes were reported to represent the mitogenome in different taxa of potato, including S. tuberosum subsp. tuberosum (Achakkagari et al. 2021). Recently, the mitogenomes of six potato cultivars were also shown to be arranged in three independent molecules (Hoopes et al. 2022). Another mitogenome study, of cultivar Alwara and four hybrid clones of potato, revealed that, of the three molecules, only one was maintained as a single molecule, while the others were extensively rearranged by duplications, translocations, and inversions (Sanetomo et al. 2022).
Several genes have been reported to be duplicated in cultivated potato mitogenomes, among which COX2 also has an additional copy (Varré et al. 2019; Achakkagari et al. 2021; Sanetomo et al. 2022). Apart from the duplication, a truncated copy of COX2 has been observed (Sanetomo et al. 2022). When BLASTn searches were performed against the reported genome of S. tuberosum, the results showed complete identity between the COX2 sequences studied here and those present in the reported mitogenome. However, fragments of the gene aligned at various other locations in the mitogenome. Interestingly, the sequence also aligned with the nuclear genomic DNA. This shows the complexity and huge diversity of the potato genome, which make it difficult to study effectively.