Sample collection and genomic characterization
We obtained isolates of P. ramorum from nurseries in BC, Canada between 1995 and 2017 (Table S2). Genome sequences were obtained either using Illumina Hiseq or Ion Torrent S5. Sequence reads were mapped onto the P. ramorum NA1 reference genome PR-102_v3.1 (https://doi.org/10.1101/2021.06.23.449625; BioProject PRJNA738483). We also retrieved previously sequenced genomes from Canada22, the US45 and Europe22,46 at NCBI (https://www.ncbi.nlm.nih.gov/bioproject/) for a total of 95 P. ramorum genomes. Variant Call Format (VCF) files were obtained for all genomes as described in the supplementary materials. Mitochondrial haplotypes were obtained by mapping the reads on the P. ramorum reference genome (Pr102, NCBI accession: NC_009384.147) and variants were called.
Population genomics analyses
Population structuring and genetic connectivity of the unknown isolates with the four lineages of P. ramorum was evaluated using Principal Component Analysis (PCA) with SNPrelate48. Linkage disequilibrium‐pruning was applied on the VCF dataset of 450,656 SNP loci to subsample 3,780 markers with reduced linkage (R2 < 0.20). Ancestry estimation was conducted using Admixture among the P. ramorum lineages with the full SNP set with K=4, the number of lineages found in North American and Europe in P. ramorum. To assess reticulated relationships between P. ramorum individuals a neighbor-net phylogenetic network was reconstructed using SplitsTree v4.12.849 with pairwise Nei’s genetic distances with the R package StAMPP50.
Genome-wide hybridization detection and gene flow simulations
We used the function hybridization in the Adegenet R package to simulate 10 F1 hybrids between NA2 and EU1 and 10 backcrosses of the hybrid to the EU1 or NA2 parent. We generated a pairwise distance matrix with the dist function and obtained an NJ tree to visualize the position of the observed samples and simulation of hybrids and introgressants. We used the program HyDe51 to perform a genome-wide test of interspecific hybridization between P. ramorum lineages. HyDe considers a four-taxa network consisting of an outgroup and a triplet of ingroup taxa. The distribution of SNP site patterns in the triplet is used to infer a hybrid ingroup lineage that with a probability γ is sister to one lineage and with probability 1 – y is sister to the other lineage. Under the null hypothesis that admixture is absent, the expected value of y is zero. HyDe testing was performed with the ‘run_hyde_mp.py’ script of the HyDe package on all possible triplet combinations of putative parents-hybrid with NA1, NA2, EU1 and the 16-237-021 and 16-284-032 isolates (i.e. 12 triplets tested) while EU2 samples were used as outgroup. To obtain average γ-values with standard deviations, the full dataset of 450,656 SNP loci was subsampled 500 times for 10,000 random loci. The Z-statistic was used to test for significance of the y-values, and p-values were adjusted for multiple testing using Bonferroni adjustment (α = 0.05 / 12 = 0.0041).
Haplotype phylogenetic network reconstruction
Phased haplotypes were obtained by using short genome regions with physical phasing when two or more variants co-occur on the same sequencing read using WhatsHap52. We tested the 15,265 genes annotated in the P. ramorum NA1 reference genome PR-102_v3.1 to identify genes containing at least 10 phased SNP loci in the two hybrid samples and three samples from each P. ramorum lineage. A FASTA format sequence alignment file containing two haplotype sequences for each P. ramorum samples (including the two hybrids) was then reconstructed as follows: the single nucleotide haplotypes predicted from a phased genotype were coded with their respective allele; the single nucleotide haplotypes predicted from an unphased genotype were coded as missing data. Finally, sequence alignments were collapsed to unique haplotypes within each P. ramorum lineage. Maximum likelihood (ML) gene trees were inferred using RAxML53 under the GTR model with a GAMMA parameter. Bootstrap support at nodes was determined by 1,000 iterations.
Functional impact and phenotypic characterization
The potential impact of the SNPs was categorized with regard to types and function with SnpEff (Version 4.3) with default settings. We used the reference genome (PR-102_v3.1) and the gff file to build the annotation database. Growth and morphological characterization was performed on carrot agar (CA)18 for the two hybrids and selected isolates of the EU1, NA2, EU2 and NA1 lineages.