Study species and sampling locations
We genotyped individual willow warblers from blood and feather DNA samples collected on breeding grounds in Scandinavia, during autumn migration at three locations in southern Europe, and on wintering grounds in Africa (Table S1). To determine the genotype frequencies of allopatric trochilus and acredula, we used samples from previous publications [19] that consisted of territorial males from southern Scandinavia (Denmark and southern Sweden, latitude between 55° and 60°, n = 66) and northern Scandinavia (Finland, Norway and northern Sweden, latitude above 65°, n = 60). Both data sets are from areas well outside the contact zones. The autumn migration samples were collected between August and October in Portugal (n = 33), Italy (n = 38) and Bulgaria (n = 32). The samples from the wintering range (November–April) were collected in Ivory Coast (n = 12), Cameroon (n = 27), Kenya (n = 3), Tanzania (n = 4), Zambia (n = 15) and South Africa (n = 8).
DNA extraction and quantification
Birds were captured in mist nets, and ~20–50 µl of blood was taken from the brachial vein of each individual and stored in SET buffer (0.15 M NaCl, 0.001 M EDTA and 0.05 M Tris). Genomic DNA was extracted using an ammonium acetate protocol [27] and diluted to a concentration of ~1 ng/µl.
The base of each feather sample (~5 mm of either a tail feather or an innermost primary flight feather) was cut into two strips to expose the inner structure, added to a tube with 100 µl lysis buffer and 1.5 proteinase K, and incubated at 56 ºC for 3 hours with shaking every hour. Following the digestion, 10 µl of 3 M sodium acetate (NaAc) and 220 µl of cold 95% ethanol were used to precipitate the DNA. The DNA was pelleted by high-speed centrifugation and the pellet was washed in 100 µl of cold 70% ethanol. The extracted DNA was dissolved in 1× TE and diluted to a concentration of ~2 ng/µl.
SNP selection
A previous study based on whole-genome resequencing and genotyping by a SNP array [25] identified two large blocks, on chromosome 1 (13 Mb) and chromosome 5 (4 Mb), that carried highly divergent and non-recombining haplotypes between southern (trochilus) and northern (acredula) populations. From the set of variants called from the resequencing data of nine southern and nine northern willow warblers [25], we first selected four highly differentiated SNPs (two per chromosome) for developing qPCR assays to differentiate trochilus and acredula. We named the two on chromosome 1 “23” and “65”, and the two on chromosome 5 “285” and “412”, corresponding to the scaffolds on which they were located (Fig. S1). All the selected SNPs were located within 200 bp regions that contained very few SNPs in the resequencing data, in order to avoid primers and probes overlapping polymorphic sites.
Primers and probes
We used Thermo Fisher’s online Custom TaqMan® Assay Design Tool to design the probes and primers for all four SNPs. For each SNP, a 400 bp sequence with the target SNP in the middle was submitted to the server, with all other SNPs in the sequence masked as “N”. The tool then automatically generated the forward and reverse primers as well as the probes to detect the different alleles of the target SNP. The fluorophores attached to the probes were FAM and VIC. The designed primers and probes are given in Table S2.
SNP genotyping
The genotyping of the target SNPs was done using either of two qPCR instruments: an Mx3005P (Stratagene, La Jolla, CA, USA) or a Bio-Rad CFX96™ Real-time PCR system (Bio-Rad Laboratories, CA, USA). On both instruments we used the universal fast two-step protocol (95 ºC for 15 min, followed by 40 cycles of 95 ºC for 10 s and 60 ºC for 30 s with a plate read).
Samples were run in 10 µl volumes with 5 µl TaqMan® Genotyping Master Mix, 0.5 µl 20X genotype assay (containing primers and probes) and 4.5 µl diluted DNA sample (~2 ng/µl for feather DNA samples and ~1 ng/µl for blood DNA samples). We first tested the genotype assay in duplicate on DNA from both blood and feathers for a limited number of samples (n = 8) to examine the consistency and success rate of this strategy. We also examined blood and feather DNA samples from the same individuals (n = 21) to confirm the repeatability of the results. The qPCRs for all the samples in the present study were run on 96-well plates that included negative controls and two positive controls. The positive controls consisted of samples with genotypes fixed for the two different alleles for each of the SNP assays.
We used the allelic discrimination functions provided by the software of the two qPCR instruments (Bio-Rad CFX Maestro™ 1.1 software and Mx3005P qPCR software) to generate scatterplots of the fluorescence signals. Examples of the results are illustrated in Figure S2a (Bio-Rad CFX) and Figure S2b (Mx3005P).
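For plate set-up, the per-reaction volumes given above can be scaled to a shared master mix. The following is a minimal sketch only: the function name and the 10% pipetting overage are assumptions made for illustration and are not part of the protocol described here. The 4.5 µl of diluted DNA is added to each well separately and is therefore not included.

```python
def master_mix_volumes(n_reactions, overage=0.10):
    """Total reagent volumes (in µl) for a shared master mix.

    Per-reaction volumes follow the text: 5 µl master mix and 0.5 µl 20X assay.
    The overage fraction is a hypothetical allowance for pipetting loss.
    """
    if n_reactions <= 0:
        raise ValueError("n_reactions must be positive")
    per_reaction = {
        "TaqMan Genotyping Master Mix": 5.0,
        "20X genotype assay": 0.5,
    }
    scale = n_reactions * (1 + overage)
    return {reagent: round(volume * scale, 1)
            for reagent, volume in per_reaction.items()}


# Volumes for one full 96-well plate with the assumed 10% overage.
print(master_mix_volumes(96))
```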
Consistency and fixation levels of the analyzed SNPs
We used the above-mentioned pure trochilus and acredula breeding samples from Scandinavia to evaluate how well each single SNP predicts the haplotype assignment for the regions on chromosomes 1 and 5, as well as its fixation level, i.e. the allele frequencies in the two populations. From the SNP array [25], each individual had previously been classified as northern homozygous, southern homozygous or heterozygous for the regions on chromosomes 1 and 5 based on a multidimensional scaling (MDS) analysis of 108 and 31 SNPs, respectively. We found that the single SNPs genotyped by qPCR were generally good at predicting the haplotypes assigned by the multiple-SNP array [25] (Table S3). The consistency ratios for chromosome 1 were 95.9% for SNP 23 and 99.0% for SNP 65. For chromosome 5, both SNP 285 and SNP 412 had a consistency ratio of 97.3%. This demonstrates that each of the selected SNPs has high power to correctly assign the haplotype blocks as trochilus or acredula.
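As an illustration of how such a consistency ratio can be computed, the minimal sketch below compares single-SNP qPCR calls with array-based haplotype classes. The function name, the one-letter coding (N, H and S for northern homozygous, heterozygous and southern homozygous) and the example calls are hypothetical and are not data from this study. Skipping individuals with a missing call is likewise an assumption.

```python
def consistency_ratio(qpcr_calls, array_classes):
    """Fraction of individuals whose qPCR genotype matches the array class.

    Both inputs are equal-length sequences coded 'N', 'H' or 'S'.
    Individuals with a missing call (None) in either list are skipped.
    """
    if len(qpcr_calls) != len(array_classes):
        raise ValueError("inputs must have the same length")
    pairs = [(q, a) for q, a in zip(qpcr_calls, array_classes)
             if q is not None and a is not None]
    if not pairs:
        raise ValueError("no individuals with calls from both methods")
    matches = sum(1 for q, a in pairs if q == a)
    return matches / len(pairs)


# Hypothetical example: five individuals, one mismatch and one missing call.
qpcr = ["S", "S", "H", "N", None]
array = ["S", "S", "S", "N", "N"]
print(f"{consistency_ratio(qpcr, array):.1%}")  # prints 75.0%
```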
Table 1
Allele frequencies at four selected SNPs in southern and northern Scandinavian populations of willow warblers
SNP (alleles) | Allele frequencies, southern population (n = 66) | Allele frequencies, northern population (n = 60) |
23 (T / G) | 0.955 / 0.045 | 0.050 / 0.950 |
65 (C / A) | 0.985 / 0.015 | 0.050 / 0.950 |
285 (A / G) | 0.970 / 0.030 | 0.084 / 0.916 |
412 (G / A) | 0.963 / 0.037 | 0.091 / 0.908 |
SNPs 23 and 65 are located on chromosome 1; SNPs 285 and 412 are located on chromosome 5.
All four investigated SNPs showed a high level of fixation of the presumed subspecies-specific alleles (Table 1, Fig. S3). Based on these results (consistency ratios with the SNP array and fixation levels), we selected SNP 65 on chromosome 1 and SNP 285 on chromosome 5 for population assignment, and the other two SNPs for verifying hybrid genotypes.
Assignment of genotypes
Our selected SNPs were all bi-allelic, so for each SNP locus there are three possible genotypes, denoted by N for the northern homozygous genotype, H for the heterozygous genotype, and S for the southern homozygous genotype. As we were using two SNPs (65 on chromosome 1 and 285 on chromosome 5), there are nine possible combined genotypes: NN, NH, HN, NS, HH, SN, SH, HS and SS, where the first letter refers to SNP 65 and the second to SNP 285. We used the allele frequencies in the southern and northern Scandinavian willow warbler populations (Table 1) to estimate the expected occurrence rate of each genotype within these populations, under the assumption of Hardy-Weinberg equilibrium (Table 2).
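The calculation can be sketched as follows, assuming that the two loci are independent so that the combined rate is the product of the per-locus Hardy-Weinberg frequencies. The function and variable names are illustrative. The allele frequencies are the rounded values from Table 1, so the output may differ in the last digits from rates computed with unrounded frequencies.

```python
from itertools import product

# Frequency of the southern-type allele at each SNP, from Table 1.
# Tuple order: (SNP 65 on chromosome 1, SNP 285 on chromosome 5).
SOUTHERN_ALLELE_FREQ = {
    "southern": (0.985, 0.970),
    "northern": (0.050, 0.084),
}


def hwe_genotype_freqs(p_south):
    """Hardy-Weinberg genotype frequencies at one bi-allelic SNP."""
    q = 1.0 - p_south
    return {"S": p_south ** 2, "H": 2 * p_south * q, "N": q ** 2}


def combined_genotype_freqs(p65, p285):
    """Expected rate of each two-SNP genotype (assumes independent loci)."""
    g65, g285 = hwe_genotype_freqs(p65), hwe_genotype_freqs(p285)
    return {a + b: g65[a] * g285[b] for a, b in product("SHN", repeat=2)}


for population, (p65, p285) in SOUTHERN_ALLELE_FREQ.items():
    freqs = combined_genotype_freqs(p65, p285)
    print(population, {g: f"{100 * f:.3f}%" for g, f in freqs.items()})
```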
Table 2
Expected occurrence rate of each genotype in the southern and northern populations, calculated from the subspecies-specific allele frequencies (Table 1) and assuming Hardy-Weinberg equilibrium and similar population sizes in the southern and northern populations.
Genotype | Expected occurrence rate, southern population | Expected occurrence rate, northern population | Log likelihood ratio (a) |
SS | 91.288% | 0.002% | 4.659 |
SH | 5.647% | 0.038% | 2.172 |
HS | 2.780% | 0.065% | 1.631 |
SN | 0.087% | 0.210% | 0.382 |
HH | 0.172% | 1.446% | 0.924 |
NS | 0.021% | 0.622% | 1.471 |
HN | 0.003% | 7.988% | 3.425 |
NH | 0.001% | 13.738% | 4.137 |
NN | 0.000% | 75.890% | 4.926 |
(a) Calculated as the difference between the log10-transformed occurrence rates (highest – lowest).
The genotypes SS and SH are much more likely to be found in a southern than in a northern population (approximately 45,000 and 148 times more likely, respectively; Table 2). Likewise, each of the genotypes NN, NH and HN is > 2600 times more likely to be found in an acredula than in a trochilus population. Importantly, the occurrence probabilities of double heterozygotes (HH) and mismatching homozygotes (SN, NS) were low (< 1.5%) in both the southern and northern populations (Table 2). Based on these calculations, we conclude that the genotypes HH, SN and NS are extremely unlikely to originate from populations of pure trochilus or acredula.
For the following analyses we therefore assume that the genotypes SS, HS and SH mostly originate from populations of allopatric trochilus, and the genotypes NN, NH and HN from populations of allopatric acredula. We further assume that individuals with the genotype HH are F1 hybrid offspring of trochilus × acredula parents and originate from either of the two migratory divides, as do individuals with the genotypes SN and NS, which are likely to be F2 offspring of HH × HH parents. To confirm the genotypes of the supposed hybrid individuals, we genotyped these samples for the additional SNPs (23 and 412). Willow warblers that had mismatching genotypes for the two SNPs on the same chromosome were classified as “undefined”.
We used a binomial probability test to assess whether the F1 heterozygotes (HH) were over-represented at the migratory sampling sites.
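The fold differences quoted here follow from the log ratios in Table 2. A minimal sketch of that arithmetic, using the rounded percentages for SS and SH as listed in the table (the function name is illustrative):

```python
import math


def log10_ratio(rate_a, rate_b):
    """Absolute log10 ratio of two occurrence rates (highest over lowest)."""
    if rate_a <= 0 or rate_b <= 0:
        raise ValueError("rates must be positive")
    return abs(math.log10(rate_a) - math.log10(rate_b))


# Rounded occurrence rates (%) for SS and SH, as listed in Table 2.
ss = log10_ratio(91.288, 0.002)
sh = log10_ratio(5.647, 0.038)
print(f"SS: {ss:.3f} (about {10 ** ss:,.0f}-fold)")
print(f"SH: {sh:.3f} (about {10 ** sh:,.0f}-fold)")
```

Because the ratio is taken on rounded percentages, genotypes whose rate rounds to zero in one population cannot be evaluated this way without the unrounded rate.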
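These assignment rules can be summarised in a short sketch. The function name and the category labels are illustrative, not the authors' implementation. As a simplification, the sketch applies the mismatch check whenever the verification SNPs (23 and 412) are supplied, whereas in this study they were genotyped only for the supposed hybrids.

```python
def assign_category(snp65, snp285, snp23=None, snp412=None):
    """Map per-SNP genotypes ('N', 'H' or 'S') to an assignment category.

    snp65 and snp23 are on chromosome 1; snp285 and snp412 on chromosome 5.
    The verification SNPs are optional. If one is given and disagrees with
    the primary SNP on the same chromosome, the bird is "undefined".
    """
    if snp23 is not None and snp23 != snp65:
        return "undefined"
    if snp412 is not None and snp412 != snp285:
        return "undefined"
    combined = snp65 + snp285
    if combined in {"SS", "SH", "HS"}:
        return "trochilus"
    if combined in {"NN", "NH", "HN"}:
        return "acredula"
    if combined == "HH":
        return "putative F1 hybrid"
    if combined in {"SN", "NS"}:
        return "putative F2 hybrid"
    raise ValueError(f"unexpected genotype codes: {combined!r}")


print(assign_category("S", "H"))            # trochilus
print(assign_category("H", "H", "H", "H"))  # putative F1 hybrid
print(assign_category("H", "H", "S", "H"))  # undefined
```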
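One way to set up such a test is sketched below as a one-sided exact binomial tail probability. The text does not specify whether the test was one-sided or which null proportion was used, so both are assumptions here, and the numbers in the example are hypothetical and are not results from this study.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): one-sided test for over-representation."""
    if not 0 <= k <= n:
        raise ValueError("k must lie between 0 and n")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i)
               for i in range(k, n + 1))


# Hypothetical illustration only: 4 HH birds among 30 sampled at one site,
# tested against an assumed null HH proportion of 2%.
p_value = binomial_upper_tail(k=4, n=30, p=0.02)
print(f"one-sided P = {p_value:.4f}")
```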