Test animals
In this study, 286 Qinchuan cattle and 270 Belgian Blue cattle were analyzed. Their genotype data were obtained from the NCBI GEO dataset (PRJNA390539) and from figshare (https://figshare.com/; DOI: 10.6084/m9.figshare.17025086), respectively.
SNP chip quality control and imputation
We extracted the forward-strand genotypes from the GEO dataset, converted them to a format recognized by PLINK, merged the files, and performed quality control of the SNP data using PLINK (version 1.90). The following filters were applied to samples and loci: (1) loci with a SNP call rate below 90% were excluded (--geno 0.1); (2) samples with a call rate below 90% were excluded (--mind 0.1); and (3) loci with a minor allele frequency below 5% were excluded (--maf 0.05). Beagle 5.1 (https://faculty.washington.edu/browning/beagle) was then used to phase the genotypes and impute missing genotypes.
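The three thresholds can be illustrated on a small genotype matrix. The Python sketch below is only an illustration under stated assumptions: genotypes are coded as 0/1/2 copies of one allele with None for a missing call, the function name qc_filter is hypothetical, and the order of the steps (samples first, then loci) is chosen for the example. The filtering in this study was done with PLINK, not with this code.

```python
def qc_filter(genotypes, geno=0.1, mind=0.1, maf=0.05):
    """Return (kept_sample_indices, kept_locus_indices).

    genotypes: list of per-sample lists, coded 0/1/2 copies of one allele,
               with None for a missing call (an assumed encoding).
    """
    n_loci = len(genotypes[0])

    # Step 1: drop samples whose missing-call rate exceeds `mind`.
    samples = [i for i, row in enumerate(genotypes)
               if sum(g is None for g in row) / n_loci <= mind]

    kept_loci = []
    for j in range(n_loci):
        calls = [genotypes[i][j] for i in samples]
        observed = [g for g in calls if g is not None]
        # Step 2: drop loci with no calls or a missing-call rate above `geno`.
        if not observed or (len(calls) - len(observed)) / len(calls) > geno:
            continue
        # Step 3: drop loci whose minor allele frequency is below `maf`.
        freq = sum(observed) / (2 * len(observed))
        if min(freq, 1 - freq) < maf:
            continue
        kept_loci.append(j)
    return samples, kept_loci


toy = [
    [0, 1, 2, 0],
    [1, 1, 2, 0],
    [2, 0, 2, 0],
    [None, None, None, 0],  # sample with 75% missing calls
]
samples, loci = qc_filter(toy)
```

In this toy example the last sample fails the sample call-rate filter, and the last two loci are monomorphic among the remaining samples, so they fail the minor allele frequency filter.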
LD calculation method
Analyzing the decay of linkage disequilibrium (LD) can reveal the recombination history of a population, and measuring LD between loci helps characterize LD levels in the genomes of the three beef cattle populations. In this study, PopLDdecay (V3.41)[10] was used to calculate LD between SNP loci across the genome. LD was measured as r2, which ranges from 0 to 1, with larger values indicating stronger linkage; r2 was calculated as described previously[11]. Finally, Perl scripts were used to visualize the results and plot the distribution of mean r2 (Mean_r2) in the different populations.
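For orientation, the standard definition of r2 between two biallelic loci is the squared covariance of the alleles divided by the product of their variances. The following Python sketch assumes phased haplotypes coded as 0/1 and uses a hypothetical function name; it illustrates the textbook statistic and is not a description of PopLDdecay's internal implementation.

```python
def r_squared(hap_a, hap_b):
    """r^2 between two biallelic loci from phased haplotypes (0/1 alleles)."""
    n = len(hap_a)
    p_a = sum(hap_a) / n                      # frequency of allele 1 at locus A
    p_b = sum(hap_b) / n                      # frequency of allele 1 at locus B
    p_ab = sum(1 for a, b in zip(hap_a, hap_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                      # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return float("nan")                   # undefined for a monomorphic locus
    return d * d / denom


complete_ld = r_squared([0, 0, 1, 1], [0, 0, 1, 1])
no_ld = r_squared([0, 0, 1, 1], [0, 1, 0, 1])
```

Two loci that always carry the same allele give the maximum value, while loci whose alleles are combined at random give a value of zero.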
Estimation of effective population size
Effective population size (Ne) is a key population genetic parameter that determines the rate of genetic drift and inbreeding and influences the efficiency of systematic evolutionary forces (e.g., mutation, selection, and migration). We used SMC++[12] to estimate Ne. SMC++ infers population size history and the timing of population splits, and it employs a spline regularization scheme that greatly reduces estimation error. The vcf2smc script distributed with SMC++ was used to convert each VCF file into SMC++ input format. All estimates were obtained under initial conditions with a mutation rate of 1.26e-8[13].
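The two SMC++ steps (conversion with vcf2smc, then estimation) might be scripted roughly as follows. This Python sketch only builds the command lines and does not run them. The argument layout follows the usage shown in the SMC++ README as we understand it and should be checked against `smc++ vcf2smc --help` and `smc++ estimate --help` for the installed version. The file names, contig name, population label, and sample IDs are hypothetical placeholders.

```python
def smcpp_commands(vcf, contig, pop, samples, out_dir, mu=1.26e-8):
    """Build (but do not run) the two SMC++ command lines as argument lists."""
    smc_file = f"{out_dir}/{pop}.{contig}.smc.gz"
    # Step 1: convert one contig of the VCF into SMC++ input format.
    convert = ["smc++", "vcf2smc", vcf, smc_file, contig,
               f"{pop}:{','.join(samples)}"]
    # Step 2: estimate population size history with the given mutation rate.
    estimate = ["smc++", "estimate", "-o", f"{out_dir}/{pop}", str(mu), smc_file]
    return convert, estimate


# Hypothetical inputs, for illustration only.
convert_cmd, estimate_cmd = smcpp_commands(
    "cattle.vcf.gz", "1", "QC", ["QC001", "QC002"], "smc_out")
```

Each returned list could be passed to subprocess.run once the paths and sample names are replaced with real ones.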
Diversity of population structure
Principal component analysis (PCA) based on the variance-standardized relationship matrix was performed on the quality-controlled data using PLINK (version 1.90). A pairwise distance matrix for neighbor-joining (NJ) tree construction was calculated using VCF2Dis (https://github.com/BGI-shenzhen/VCF2Dis), and evolutionary trees were generated using FastME (http://www.atgc-montpellier.fr/fastme/).
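To make the term "variance-standardized relationship matrix" concrete, the sketch below computes such a matrix in plain Python: each locus is centered by twice its allele frequency and scaled by its expected standard deviation, and the matrix is the average cross-product over loci. The function name is hypothetical, and the input is assumed to be complete 0/1/2 genotypes (for example, after imputation). The principal components would be the leading eigenvectors of this matrix; that step is left to a linear algebra library and is not shown.

```python
def relationship_matrix(genotypes):
    """Variance-standardized relationship matrix from 0/1/2 genotypes.

    genotypes: list of per-sample lists with no missing values (assumed);
               monomorphic loci are skipped because their variance is zero.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    z_cols = []
    for j in range(m):
        col = [genotypes[i][j] for i in range(n)]
        p = sum(col) / (2 * n)            # allele frequency at locus j
        var = 2 * p * (1 - p)             # expected genotype variance
        if var == 0:
            continue
        sd = var ** 0.5
        z_cols.append([(g - 2 * p) / sd for g in col])
    if not z_cols:
        raise ValueError("no polymorphic loci")
    used = len(z_cols)
    # Entry (i, k) is the mean product of standardized genotypes over loci.
    return [[sum(z[i] * z[k] for z in z_cols) / used for k in range(n)]
            for i in range(n)]


grm = relationship_matrix([[0, 1, 2], [1, 1, 0], [2, 0, 1], [1, 2, 1]])
```

The resulting matrix is symmetric, and because every locus is centered, each of its rows sums to zero.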
Fixation Index
The fixation index (Fst) measures the degree of genetic differentiation between populations and is mainly used to study genetic variation between populations, population structure, and population genetic diversity. It assesses genetic differences between populations by comparing the mean square of error (MSG) for loci within populations with the mean square (MSP) for loci between populations. The formula for Fst is given below:
$$F_{ST} = \frac{MSP - MSG}{MSP + (N - 1)\,MSG}$$
where MSG denotes the mean square of error for loci within populations, MSP denotes the mean square for loci between populations, and N is the corrected mean sample size across populations. In this study, vcftools (version 1.90) was used to calculate Fst for the retained SNP loci, using the estimator described by Weir and Cockerham (1984)[14].
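As a worked illustration of how MSG, MSP, and N combine, the Python sketch below evaluates a single-locus Fst as (MSP - MSG) / (MSP + (N - 1) MSG) from allele frequencies and sample sizes. It assumes the commonly used simplified forms of the mean squares, which ignore the heterozygosity term of the full Weir and Cockerham estimator, so it will not reproduce vcftools output exactly. The function name is hypothetical, and the corrected sample size N is called n_c in the code.

```python
def fst_locus(freqs, sizes):
    """Single-locus Fst = (MSP - MSG) / (MSP + (n_c - 1) * MSG).

    freqs: frequency of one allele in each population.
    sizes: sample size of each population (same order as freqs).
    """
    s = len(freqs)
    total = sum(sizes)
    p_bar = sum(n * p for n, p in zip(sizes, freqs)) / total
    # Mean square within populations (MSG).
    msg = (sum(n * p * (1 - p) for n, p in zip(sizes, freqs))
           / sum(n - 1 for n in sizes))
    # Mean square between populations (MSP).
    msp = sum(n * (p - p_bar) ** 2 for n, p in zip(sizes, freqs)) / (s - 1)
    # Corrected mean sample size across populations (N in the text).
    n_c = (total - sum(n * n for n in sizes) / total) / (s - 1)
    denom = msp + (n_c - 1) * msg
    return float("nan") if denom == 0 else (msp - msg) / denom


# Sample sizes from this study; the allele frequencies are made up.
fixed_difference = fst_locus([1.0, 0.0], [286, 270])
no_difference = fst_locus([0.5, 0.5], [286, 270])
```

A locus fixed for different alleles in the two populations gives the maximum value, while identical allele frequencies give a value slightly below zero, which is expected for this type of estimator.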
Nucleotide diversity (PI)
Nucleotide diversity (PI, π) is the average number of nucleotide differences per site between two DNA sequences drawn at random from a given population. It is an important measure of nucleotide polymorphism within a population. The formula for calculating nucleotide diversity is given below:
$$\pi = \sum_{j=1}^{S} h_j$$
where S denotes the number of segregating sites and h_j denotes the heterozygosity of the jth segregating site. PI values were obtained by calculating the nucleotide diversity of each population using vcftools. To assess the degree of selection at each SNP locus from a genomic perspective, we calculated the ratio of PI at the same locus between Qinchuan cattle and Belgian Red and Belgian Red and White cattle. A smaller ratio indicates that nucleotide diversity in the Qinchuan population is higher than in the Belgian populations, suggesting that the locus has been subject to weaker selection in the Qinchuan genome; conversely, a larger ratio suggests stronger selection.
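The sum over segregating sites can be illustrated with a short Python sketch. It assumes biallelic sites and uses the sample-size-corrected heterozygosity h_j = n/(n-1) (1 - p^2 - q^2), where n is the number of sampled chromosomes; the correction factor and the function name are assumptions of this example, and the values reported in this study were computed with vcftools.

```python
def nucleotide_diversity(allele_counts, n_chrom):
    """Sum of per-site heterozygosities h_j over segregating sites.

    allele_counts: count of one allele at each biallelic site.
    n_chrom: number of sampled chromosomes (2 x number of diploid animals).
    """
    pi = 0.0
    for count in allele_counts:
        if count == 0 or count == n_chrom:
            continue  # not segregating, contributes nothing
        p = count / n_chrom
        # Corrected heterozygosity: n/(n-1) * (1 - p^2 - q^2).
        pi += n_chrom / (n_chrom - 1) * (1 - p * p - (1 - p) ** 2)
    return pi


# Made-up allele counts for two populations of 4 chromosomes each.
pi_a = nucleotide_diversity([2, 1, 0], 4)
pi_b = nucleotide_diversity([2, 2, 2], 4)
ratio = pi_a / pi_b
```

With 4 chromosomes and an allele count of 2, four of the six possible chromosome pairs differ, so the site contributes 2/3, which matches the pairwise-difference definition. The ratio of the two population values is then a simple division.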
Enrichment analysis of candidate genes
The top 5% of the screened loci were treated as selected loci and annotated against the bovine reference genome UMD_3.1.1. With reference to the NCBI Gene database (http://www.ncbi.nlm.nih.gov/gene), the R package clusterProfiler[15] was used to perform GO enrichment analysis (biological process, cellular component, and molecular function) and KEGG pathway enrichment analysis of the candidate genes located on autosomes.
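Selecting the top 5% of loci amounts to ranking loci by a selection statistic and keeping the highest-scoring fraction. The Python sketch below shows one way to do this. The function name, the locus identifiers, and the choice of a single per-locus score (for example Fst) are assumptions of the example, and ties at the cutoff are broken arbitrarily by sort order.

```python
def top_fraction(scores, fraction=0.05):
    """Return the locus IDs with the highest scores (top `fraction`).

    scores: dict mapping a locus ID to its selection statistic.
    """
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    n_keep = max(1, int(len(ranked) * fraction))  # keep at least one locus
    return [locus for locus, _ in ranked[:n_keep]]


# Hypothetical scores for 100 loci.
toy_scores = {f"snp{i}": i / 100 for i in range(100)}
selected = top_fraction(toy_scores)
```

The selected loci would then be mapped to genes before the enrichment analysis.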