Genomic Diversity, Phylogenetic Relationships and Population Structure
We collected 82 individuals from six Chinese native pig breeds, including Ding'an pigs (DA), Tunchang pigs (TUC), Wuzhishsan pigs (WZS), Min pigs (MZ), Hetao pigs (HT), and Tibetan pigs (TP), which spread over three classical geographical regions, i.e., tropical, frigid, and high-altitude environments (Fig. 1A and Table S1). All 82 individuals were sequenced at depths larger than 10× (Table S2). After applying stringent quality control criteria, we identified a total of 25,602,818 SNPs. By comparing the SNP set with the pig dbSNP database (Fig. 1B), we found that more than 13.5% of the variants (3,466,300 SNPs) were novel, which substantially expanded the catalog of porcine genetic variants. Further functional annotation revealed that most SNPs (64.12%) were located in the intronic region, followed by intergenic regions (22.82%) (Fig. S1). Besides, 1.21% of SNPs were identified in coding regions, of which 102,024 nonsynonymous variants (100,954 missense, 874 stop gain, and 196 stop loss).
LD generally decayed as the distance between loci increased, and the strength of LD varied widely between populations. The physical distance between SNPs measured as half of its maximal value occurred at 34.1 kb (r2 = 0.34) for DA and at 1.8–5.7 kb (r2 = 0.23–0.31) for the rest five pig breeds. At longer marker distances, the LD value was highest for DA but lowest for Tibetan pigs (Fig. 1C and Table S3).
To infer the genetic and evolutionary relationships among pig breeds adapted to different environments, we first constructed a phylogenetic tree of 82 individuals using the Neighbor-Joining (NJ) algorithm (Fig. 1D). The phylogenetic tree revealed distinct groupings of individuals from different regions. Specifically, the genetic relationships among the six breeds were strongly associated with their habitats, with three pig breeds from Hainan (WZS, DA, and TUC) exhibiting closer genetic relationships. Within the Tibetan pig breed, interestingly, the internal genetic relationships also showed significant geographical partitioning (Fig. 1D). The results of PCA were consistent with those of the phylogenetic tree, with the first principal component (PC1 = 6.54%) and the second principal component (PC2 = 5.10%) able to separate the six breeds by geographical regions (Fig. S2).
Population structure analysis revealed that the optimal number of clusters was three, at which point the cross-validation error was lowest, and the results were considered most reliable (Fig. 1E). When K = 2, the three pig breeds from Hainan, Tibetan pigs, and Hetao pigs shared more ancestral components; when K = 3, we observed that pig breeds clustered by geographical region, consistent with previous phylogenetic tree and PCA results; when K = 4, Hetao pigs were separated, consistent with the fact that their distribution area does not overlap with other pig breeds. The admixture results further confirmed that the genetic relationships among pig breeds were closely related to their geographical distribution.
Selection signatures on autosomes and functional annotation
To better leverage the diversity of our dataset, we partitioned the three local population samples into four groups based on the putative population structure: the high-temperature group (HP, consisting of Hainan populations), the low-temperature group (NTP, consisting of NorthChina populations and Tibetan populations), the high-altitude group (TP, consisting of Tibetan populations), and the low altitude group (HNP, consisting of Hainan populations and NorthChina populations). To elucidate the selective pattern of pigs in tropical environments, we conducted a comparative analysis between HP and NTP to detect selection signals based on the genomic windows containing more than 100 SNPs (Fig. S3). By applying the top 5% cutoffs for both FST and XP-EHH, we identified 55.50 Mb and 20.47 Mb of selective sweep regions in HP and NTP, respectively (Table S4). A similar approach was employed to compare TP and HNP to investigate the adaptive mechanisms in high-altitude environments, revealing 43.60 Mb of selective sweep regions in TP (Table S5). To focus on the unique regions associated with frigid adaptation, we excluded the overlapped regions between the NTP selection and TP selection from the NTP selection and finally got 17.72 Mb regions (Table S6). The top 25 sweep regions with the highest FST and XP-EHH scores within the candidate genomic areas were considered as highly significant regions (Fig. 2).
We first performed an in-depth exploration of the selective pattern in tropical adaptation. Most genes located in the highly significant regions are functionally plausible for tropical adaptation, according to their annotation in previous studies. These genes included VPS13A, GNA14, and NR6A1, which were involved in blood coagulation and circulation [39]; STIMATE and NR5A1, which participate in the temperature stress response [40–42]; AGMO affecting human inflammation and energy homeostasis [43]; LMTK2 associated with cell apoptosis [44]; and CFAP299, which may affect the hair phenotype of yaks [45]. We explored the potential biological function of detected signals with publicly available QTLs (Table S7) and gene ontology (GO) (Table S8). QTL enrichment analyses showed that health, meat, and carcass traits were mostly significantly enriched (Fig. S4A). We noticed significant enrichment for "Cholesterol level" and "Mean platelet volume" (Fig. S4B). The positively selected genes (PSGs) in tropical adaptation were mainly associated with blood circulation, protein degradation, and inflammation, including "blood vessel diameter maintenance" (HRH2, HTR7, DBH, ADM, OLR1, ATP2B1, and HRH1), "lipid translocation" (MFSD2A, ANO3, and ANO4), "NIK/NF-kappaB signaling" (LRRC19, MAP3K7, NFAT5, and NLRP3), and "proteasomal ubiquitin-independent protein catabolic process" (PSMA1, PSMB7, and PSMB11). Improving blood flow to the surrounding skin can mitigate the effects of heat stress in tropical environments [46]. Previous research has demonstrated that the NF-κB pathway can stimulate HSP activation in immune cells [47], contributing to reducing heat stress, and the proteasomal ubiquitin-independent protein catabolic process can degrade misconfigured proteins caused by heat stress, reducing damage [48].
We next explored the mechanisms of hypoxia tolerance in Tibetan pigs. In the highly significant regions of high-altitude adaptation-specific selection, two genes affecting the cardiovascular system were identified: SOX18, which was associated with the regulation of blood vessel development [49], and TNNI3K, which affects heart function [50, 51]. Additionally, in line with previous research [52], the candidate gene EPAS1 in the hypoxia-inducible factor pathway [53] showed substantial selection, but the gene EGLN1 in the same pathway did not. By annotating specific selected regions, we discovered significant enrichment for blood index-associated traits, such as "Hemoglobin" and "Plateletcrit" (Fig. 3A and Table S9) Gene Ontology analysis revealed an overrepresentation of genes involved in biological processes that contribute to maintaining typical vital signs in high-altitude environments. (Fig. 3B and Table S10). PSGs detected in Tibetan pigs have particularly enriched in hypoxia adaptation-related processes, including "cardiac cell development" (MTOR, HEY2, SMAD4, SRF, TBX3), "coronary vascular development" (HEY2, SRF, PTK7), "platelet-derived growth factor receptor signaling pathway" (CBLB, PTGIR, PHF14, APOD), "response to hypoxia" (SCAP, MTOR, SMAD4, SRF), and "respiratory tube development" (SPRY1, CCDC39, PHF14, SRF, CCBE1, PTK7). Additionally, we observed enrichment for the nucleotide metabolism process (DPYS, MTOR, NME1, NME2, UOX, RHOQ, DERA), including "pyrimidine-containing compound metabolic process", "deoxyribose phosphate metabolic process", and "GTP metabolic process", which provided the basis for DNA repair and help to maintain genome stability by repairing UV-induced errors during DNA replication [54].
For frigid adaptation, we detected four genes in the highly significant regions, including HIF3A associated with adiposity [55]; JMJD1C affecting de novo lipogenesis [56]; RLN3 associated with food intake [57, 58]; PAQR9 activating thermogenesis of brown adipocyte [59]. The function of these genes is critical for frigid adaptation. The signal of frigid adaptation was mainly associated with meat and carcass traits, especially for "Fat area percentage in carcass", which may contribute to heat retention (Fig. S5A and Table S11). Biological process enrichment analysis showed the process involved in thermogenesis "regulation of fibroblast growth factor receptor signaling pathway" (HHIP, SPRY1). It is well documented that elevated levels of FGF21 (Fibroblast Growth Factor 21) promote beige adipose tissue and enhance energy expenditure [60, 61]. We also found that domestic pigs also exhibit cold-induced vasodilation (CIVD), the enrichment of "vasodilation". CIVD is a dramatic increase in peripheral blood flow observed during cold exposure. It supposedly protects against cold injuries [62]. Meanwhile, we noticed the enrichment for pathways related to cell cycle (CENPW, INO80, DLGAP5, APC, CCDC61, NCAPG, SPRY1, PTPN11, REEP3), in line with the fact that the mammalian cell cycle is temperature sensitive [63] (Fig. S5B and Table S12).