Plant material and DNA extraction
DNA was extracted from dried leaves or cambium of 98 C. schweinfurthii individuals from four populations (localities) across three countries (Cameroon, Burkina Faso and Côte d'Ivoire; see Fig. 1) using an adapted version of the protocol by Mariac et al. [12]. The extracts contain high levels of phenolic, polysaccharidic, and glyco-lipidic compounds that complicate DNA isolation. To ensure DNA purity, the homogenate of each sample was therefore pretreated with a sorbitol-based extraction buffer (63.7 g of sorbitol, 100 mL of 1 M Tris-HCl pH 8, 10 mL of 0.5 M EDTA pH 8, 5 g of 0.5% sodium bisulfite added on the same day, and Milli-Q water to a final volume of 1000 mL). The whole mixture was centrifuged before the extraction protocol [12] was applied. Three washing series were performed, each using 1.5 mL of buffer solution per sample.
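The final concentrations implied by this recipe can be worked out with a short calculation. The sketch below is illustrative only: the molar mass of sorbitol (182.17 g/mol) is a textbook value that is not given in the text, and the function names are hypothetical.

```python
# Final concentrations of the main components of the 1 L sorbitol buffer.
# Assumption: molar mass of D-sorbitol = 182.17 g/mol (not stated in the text).
SORBITOL_MOLAR_MASS = 182.17  # g/mol
FINAL_VOLUME_ML = 1000.0      # final volume stated in the recipe


def molarity_from_mass(mass_g, molar_mass, final_volume_ml):
    """Molarity (mol/L) of a solid dissolved to the final volume."""
    return mass_g / molar_mass / (final_volume_ml / 1000.0)


def diluted_molarity(stock_molar, stock_volume_ml, final_volume_ml):
    """Molarity (mol/L) after diluting a stock solution to the final volume."""
    return stock_molar * stock_volume_ml / final_volume_ml


sorbitol_m = molarity_from_mass(63.7, SORBITOL_MOLAR_MASS, FINAL_VOLUME_ML)
tris_m = diluted_molarity(1.0, 100.0, FINAL_VOLUME_ML)
edta_m = diluted_molarity(0.5, 10.0, FINAL_VOLUME_ML)

print(f"sorbitol: {sorbitol_m:.2f} M")
print(f"Tris-HCl: {tris_m * 1000:.0f} mM")
print(f"EDTA:     {edta_m * 1000:.0f} mM")
```

Under that assumption the buffer works out to roughly 0.35 M sorbitol, 100 mM Tris-HCl and 5 mM EDTA.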
Microsatellite development
A genomic library was prepared for two of the 98 individuals, both from Cameroon (ARM0325 and ARM0326; see Data availability and Supplementary Information 1), following the protocol by Mariac et al. [13], and sequenced in paired-end mode using Illumina MiSeq v2 reagents with 2 × 250 bp reads at Novogene (Munich, Germany). We obtained 1,463,279 and 1,458,372 raw reads for individuals ARM325 and ARM326, respectively. Reads were joined using FLASH v1.2.11 [14], resulting in 1,035,067 and 1,010,583 joined reads, respectively.
The software QDD v3.1.2 [15] was used with default settings to identify microsatellite repeat units (nuclear SSR motifs) and design primers. We identified 8,432 and 9,647 reads containing SSR motifs for ARM325 and ARM326, respectively. The raw sequencing data and the SSR-containing reads have been deposited in the GenBank SRA under accession PRJNA1125310 (available at: https://www.ncbi.nlm.nih.gov/sra/PRJNA1125310). Forty-eight loci per individual were selected based on the following criteria: (1) perfect (i.e., non-composite) di- or trinucleotide microsatellite repeat units; (2) at least 10 motif repetitions; (3) primers located at least 20 bp away from the microsatellite region; (4) PCR product size between 90 and 250 bp. For each primer pair, a tail for one of four fluorochromes was added to the 5′ end of the forward or reverse primer (Q1-6-FAM, Q2-NED, Q3-VIC, Q4-PET [16]; see Table 1).
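The four selection criteria can be expressed as a simple filter. The following is a minimal sketch, not the procedure actually run on the QDD output: the `Candidate` record, its field names, and the two example loci are hypothetical and do not reflect QDD's file format.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical summary of one candidate locus with its primer pair."""
    name: str
    motif: str             # repeat unit, e.g. "AG"
    repeats: int           # number of motif repetitions
    is_composite: bool     # True for compound or interrupted repeats
    min_flank_bp: int      # shortest distance between a primer and the repeat
    product_size_bp: int   # expected PCR product size


def passes_criteria(c: Candidate) -> bool:
    """Apply criteria (1) to (4) as listed in the text."""
    return (
        not c.is_composite               # (1) perfect repeat
        and len(c.motif) in (2, 3)       # (1) di- or trinucleotide
        and c.repeats >= 10              # (2) at least 10 repetitions
        and c.min_flank_bp >= 20         # (3) primers >= 20 bp from the repeat
        and 90 <= c.product_size_bp <= 250  # (4) product size window
    )


# Invented examples, for illustration only.
candidates = [
    Candidate("locus_A", "AG", 12, False, 25, 140),
    Candidate("locus_B", "AAT", 8, False, 30, 180),  # too few repeats
]
selected = [c.name for c in candidates if passes_criteria(c)]
print(selected)
```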
The Thermo Fisher Multiple Primer Analyzer web tool (https://www.thermofisher.com/fr/fr/home/brands/thermo-scientific/molecular-biology/molecular-biology-learning-center/molecular-biology-resource-library/thermo-scientific-web-tools/multiple-primer-analyzer.html) was used to check for self-dimers and cross-primer dimers in each of the 48 primer pairs selected per individual. Fifty-two loci (19 from ARM325 and 33 from ARM326) were kept for the subsequent analyses (tests of amplification and polymorphism). For each of these 48 markers, an amplification test was carried out on 4 individuals in simplex reactions (name of individuals). A second test, on 16 individuals, was carried out on the xx markers that had amplified in order to assess their polymorphism. Finally, thirteen markers were found to be polymorphic and were selected for multiplex PCR. Primer sequences for these SSRs (see Table 1) were deposited in GenBank.
PCR conditions and microsatellite amplification tests
The PCR reactions were carried out in a total volume of 10 µL, consisting of 0.15 µL of each forward primer [10 µM], 0.1 µL of each reverse primer [10 µM], 0.15 µL of each fluorescently labeled primer (Q1 to Q4 [16]) [10 µM], 1 µL of DNA (10 ng/µL), and 4 µL of PCR Master Mix (Type-it Microsatellite, Qiagen), with the volume completed to 10 µL with ultra-pure water (Invitrogen). The 13 selected markers can be amplified in two multiplex reactions of 5 and 8 SSRs. Each multiplex PCR reaction consisted of 0.15 µL of each forward primer [10 µM], 0.1 µL of each reverse primer [10 µM], 0.15 µL of each linker (Q1-Q4) [10 µM], 1 µL of DNA (10 ng/µL), and 7.5 µL of PCR Master Mix (Type-it Microsatellite, Qiagen), with the volume completed to 15 µL with ultra-pure water (Invitrogen). The PCRs were performed on a Veriti™ 96-Well Thermocycler (Thermo Fisher Scientific) under the following conditions: initial denaturation at 95 °C (3 min); 30 cycles of denaturation at 95 °C (30 s), annealing at 57 °C (1 min 30 s), and extension at 72 °C (30 s); 10 additional cycles with the annealing temperature adjusted to 53 °C; and a final extension at 60 °C (30 min). All samples were genotyped on an ABI 3500 XL sequencer (Applied Biosystems, Foster City, California, USA) at the CIRAD genotyping platform in Montpellier, using 2 µL of PCR product, 10 µL of Hi-Di formamide (Applied Biosystems, Thermo Fisher Scientific), and 0.12 µL of GeneScan 500 LIZ Size Standard (Applied Biosystems, Thermo Fisher Scientific). Electropherograms were visualized and scored with Geneious version 7.1.3 [17]. Difficult-to-read loci and monomorphic loci were excluded. In total, 13 primer pairs were retained and combined into two multiplex reactions using Multiplex Manager 1.0 software [18] (the list of loci grouped in each multiplex is provided in Table 1).
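The volume of water needed to complete each reaction follows from the component volumes above. The sketch below is a hypothetical helper, assuming that each of the four linkers (Q1 to Q4) is added once per multiplex reaction and that a single linker is used in a 10 µL single-marker reaction; the text does not state either point explicitly.

```python
def water_volume_ul(n_markers, n_linkers, master_mix_ul, total_ul,
                    fwd_ul=0.15, rev_ul=0.10, linker_ul=0.15, dna_ul=1.0):
    """Water (µL) needed to complete one reaction to its total volume."""
    used = (n_markers * (fwd_ul + rev_ul)   # forward + reverse primers
            + n_linkers * linker_ul         # fluorescently labeled linkers
            + dna_ul
            + master_mix_ul)
    water = total_ul - used
    if water < 0:
        raise ValueError("components exceed the total reaction volume")
    return round(water, 2)


# Assumed single-marker reaction: 1 marker, 1 linker, 4 µL master mix, 10 µL total.
simplex = water_volume_ul(1, 1, 4.0, 10.0)
# Multiplex reactions: 5 or 8 markers, 4 linkers, 7.5 µL master mix, 15 µL total.
multiplex_1 = water_volume_ul(5, 4, 7.5, 15.0)
multiplex_2 = water_volume_ul(8, 4, 7.5, 15.0)
print(simplex, multiplex_1, multiplex_2)
```

Under these assumptions the water volumes are 4.6 µL, 4.65 µL and 3.9 µL, respectively.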
Microsatellite diversity analysis
A genetic diversity analysis was conducted on 19 to 31 individuals per population (see Fig. 1 and Table 2) from four populations of C. schweinfurthii in three countries: Cameroon (Meyo-Ville), Cameroon (Koupa-Matapit), Burkina Faso (Orodara), and Côte d'Ivoire (Bouaké). Genotypes were scored locus by locus using the automated procedure implemented in Geneious Prime® 2019.2.3 [17] and then corrected manually. Genetic diversity was assessed in each population by estimating the number of alleles per locus (A), allelic richness (AR), observed heterozygosity (HO), expected heterozygosity (HE), the fixation index or inbreeding coefficient (FIS), and deviations from Hardy-Weinberg equilibrium (HWE) using the program SPAGeDi 1.5 [19]. Null allele frequencies (r) were calculated with INEST 2.2 [20].
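For readers unfamiliar with these statistics, the sketch below computes A, HO, HE and FIS for one locus in one population from diploid genotypes. It uses the common textbook estimators (Nei's unbiased gene diversity for HE, and FIS = 1 − HO/HE); whether SPAGeDi applies exactly these formulas is an assumption, and the genotypes are invented.

```python
from collections import Counter


def locus_diversity(genotypes):
    """Summary statistics for one locus in one population.

    genotypes: list of (allele1, allele2) tuples; None marks missing data.
    """
    scored = [g for g in genotypes if g is not None]
    n = len(scored)                      # individuals with a genotype
    counts = Counter(a for g in scored for a in g)
    ho = sum(a != b for a, b in scored) / n
    freqs = [c / (2 * n) for c in counts.values()]
    # Nei's unbiased expected heterozygosity (gene diversity)
    he = (2 * n / (2 * n - 1)) * (1 - sum(p * p for p in freqs))
    fis = 1 - ho / he if he > 0 else float("nan")
    return {"A": len(counts), "HO": ho, "HE": he, "FIS": fis}


# Invented allele sizes (bp), for illustration only.
toy = [(120, 120), (120, 124), (124, 126), (120, 126), None]
stats = locus_diversity(toy)
print(stats)
```

The relationship FIS = 1 − HO/HE is consistent with the values reported in Table 2: for CS02 in Meyo-Ville, 1 − 0.839/0.856 ≈ 0.020.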
Table 1
Characterization of 13 microsatellite loci developed for Canarium schweinfurthii

| Mix | Microsatellite marker | Primer sequences (5′–3′) a | Fluorescent label b | Repeat motif c | PCR product size (bp) | Allele size range (bp) | GenBank accession no. |
|---|---|---|---|---|---|---|---|
| Multiplex 1 | CS02 | F: TGTAAAACGACGGCCAGTTTCTCATCTCCACAAATTCCACA | Q1-6-FAM | (AT)10 | 123 | 130–185 | PP933780 |
| | | R: GGCCATCCAAGACCAGAACA | | | | | |
| | CS11 | F: TAGGAGTGCAGCAAGCATCAGGAGGCCACATCGTCTAT | Q2-NED | (AG)12 | 140 | 125–170 | PP933781 |
| | | R: ATTTGTCCTGCTCCTGTGCT | | | | | |
| | CS36 | F: CTAGTTATTGCTCAGCGGTTGCTGGTTACGCAGTTGCTA | Q4-PET | (AG)17 | 222 | 220–270 | PP933782 |
| | | R: TTCCAGAGGATTTCTTGATGTCT | | | | | |
| | CS37 | F: CACTGCTTAGAGCGATGCAGAGTTCCTTACATAGTGTCACCA | Q3-VIC | (AT)11 | 222 | 220–270 | PP933783 |
| | | R: CTGCTCAGCTTACAATGACCA | | | | | |
| | CS50 | F: TAACCATCTGCCGGTGTTCC | Q1-6-FAM | (AC)10 | 231 | 236–272 | PP933784 |
| | | R: TGTAAAACGACGGCCAGTTTCTGGCAAGGCAAGCAGAT | | | | | |
| Multiplex 2 | CS05 | F: TGTAAAACGACGGCCAGTAACCAATACCGTCATCACCC | Q1-6-FAM | (AG)10 | 129 | 120–160 | PP933785 |
| | | R: CGAGAGGTGGTGTTGACATT | | | | | |
| | CS19 | F: CTAGTTATTGCTCAGCGGTAGGTCAGCAGTCCTTTGACA | Q4-PET | (AG)23 | 151 | 130–185 | PP933786 |
| | | R: GGGTGGCATCTGGAACTCAT | | | | | |
| | CS20 | F: CACTGCTTAGAGCGATGCTGAGAAACTGAGCAGTCGAGA | Q3-VIC | (AG)12 | 152 | 155–200 | PP933787 |
| | | R: TGGCATTTGAAACAGAAATCCTTCA | | | | | |
| | CS34 | F: TAGGAGTGCAGCAAGCATCGTTGACACTCTCACTCGCT | Q2-NED | (AG)12 | 204 | 205–240 | PP933788 |
| | | R: GCAAGATTTAGCTGACCCAACA | | | | | |
| | CS40 | F: CTAGTTATTGCTCAGCGGTAGAGTGGCTGATAAAGATCTATGCA | Q4-PET | (AT)21 | 131 | 210–295 | PP933789 |
| | | R: GGGCTGAATGTTGTCAAGGC | | | | | |
| | CS42 | F: CACTGCTTAGAGCGATGCAGGGTATTGAAGATAAGTAGATTAGCA | Q3-VIC | (AT)15 | 240 | 225–285 | PP933790 |
| | | R: TGTTACTAGCATTCTCTTGCAGTG | | | | | |
| | CS49 | F: AGAATCCCAGATGTAACACCTGT | Q1-6-FAM | (AG)11 | 241 | 245–290 | PP933791 |
| | | R: TGTAAAACGACGGCCAGTTGTTAGTCCGTGCGTGTCAC | | | | | |
| | CS52 | F: CATCCAGACAGTATGAATGGCA | Q2-NED | (AG)17 | 143 | 135–187 | PP933792 |
| | | R: TAGGAGTGCAGCAAGCATCTGACCTAATTGTTCAAAGCAGCT | | | | | |

a The universal linker (see footnote b) forms the 5′ end of either the forward or the reverse primer of each pair
b Q1 = TGTAAAACGACGGCCAGT; Q2 = TAGGAGTGCAGCAAGCAT; Q3 = CACTGCTTAGAGCGATGC; Q4 = CTAGTTATTGCTCAGCGGT [16]
c Number of repeats found in the clone that corresponds to the accession number
d Optimal annealing temperature was 57 °C for all loci
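Because the linker can sit on either primer of a pair, it can be useful to separate the linker from the locus-specific part of a sequence programmatically. The sketch below is a hypothetical helper (the function name is not from the source) that matches the four linker sequences of footnote b against the 5′ end of a primer, shown here on the CS50 pair from Table 1.

```python
# Universal linker sequences from footnote b of Table 1.
LINKERS = {
    "Q1": "TGTAAAACGACGGCCAGT",
    "Q2": "TAGGAGTGCAGCAAGCAT",
    "Q3": "CACTGCTTAGAGCGATGC",
    "Q4": "CTAGTTATTGCTCAGCGGT",
}


def split_linker(primer):
    """Return (linker name or None, locus-specific part of the primer)."""
    for name, tail in LINKERS.items():
        if primer.startswith(tail):
            return name, primer[len(tail):]
    return None, primer


# CS50 primer pair as listed in Table 1.
cs50_forward = "TAACCATCTGCCGGTGTTCC"
cs50_reverse = "TGTAAAACGACGGCCAGTTTCTGGCAAGGCAAGCAGAT"

print(split_linker(cs50_forward))
print(split_linker(cs50_reverse))
```

For CS50 the linker (Q1) is carried by the reverse primer, while the forward primer has no linker.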
Table 2 Genetic properties of the 13 newly developed polymorphic nuclear microsatellite markers in four populations of Canarium schweinfurthii

| Multiplex | Locus | Cameroon (Meyo-Ville, N=31) | | | | | | Cameroon (Koupa-Matapit, N=25) | | | | | | Burkina Faso (Orodara, N=23) | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | A | AR | HO | HE | FIS a | r | A | AR | HO | HE | FIS a | r | A | AR | HO | HE | FIS a | r |
| Multiplex-1 | CS02 | 12 | 10.18 | 0.839 | 0.856 | 0.020 | 0±0.00 | 14 | 12.74 | 0.64 | 0.868 | 0.266*** | 0.11±0.05 | 15 | 14.16 | 0.783 | 0.906 | 0.139* | 0.06±0.05 |
| | CS11 | 6 | 5.45 | 0.484 | 0.654 | 0.264* | 0.09±0.06 | 5 | 4.69 | 0.56 | 0.612 | 0.087 | 0.007±0.03 | 5 | 4.80 | 0.348 | 0.490 | 0.295 | 0.07±0.05 |
| | CS36 | 9 | 8.63 | 0.700 | 0.834 | 0.163 | 0.05±0.06 | 8 | 7.37 | 0.625 | 0.808 | 0.230* | 0.12±0.08 | 7 | 6.79 | 0.478 | 0.634 | 0.250* | 0.08±0.05 |
| | CS37 | 14 | 12.14 | 0.767 | 0.859 | 0.109 | 0.03±0.05 | 10 | 9.27 | 0.609 | 0.821 | 0.263** | 0.13±0.09 | 8 | 7.7.62 | 0.826 | 0.830 | 0.004 | 0±0.00 |
| | CS50 | 6 | 5.65 | 0.742 | 0.675 | -0.100 | 0.04±0.04 | 3 | 2.99 | 0.28 | 0.286 | 0.023 | 0±0.00 | 4 | 4.00 | 0.565 | 0.712 | 0.210 | 0.07±0.07 |
| Multiplex-2 | CS05 | 5 | 4.61 | 0.419 | 0.698 | 0.403** | 0.16±0.05 | 4 | 4.00 | 0.28 | 0.711 | 0.633*** | 0.25±0.06 | 5 | 4.85 | 0.455 | 0.591 | 0.235 | 0.11±0.08 |
| | CS19 | 13 | 11.27 | 0.871 | 0.865 | 0.007 | 0±0.00 | 12 | 10.68 | 0.84 | 0.806 | -0.042 | 0±0.00 | 11 | 10.30 | 0.818 | 0.800 | -0.023 | 0±0.00 |
| | CS20 | 13 | 11.80 | 0.548 | 0.905 | 0.398*** | 0.18±0.05 | 11 | 10.26 | 0.36 | 0.884 | 0.598*** | 0.27±0.05 | 8 | 7.71 | 0.667 | 0.803 | 0.173 | 0.11±0.06 |
| | CS34 | 10 | 7.99 | 0.645 | 0.701 | 0.081 | 0±0.00 | 11 | 9.92 | 0.68 | 0.805 | 0.158 | 0.05±0.05 | 7 | 6.85 | 0.682 | 0.787 | 0.136 | 0.07±0.02 |
| | CS40 | 13 | 10.83 | 0.633 | 0.826 | 0.236* | 0.12±0.07 | 14 | 12.16 | 0.68 | 0.861 | 0.214* | 0.09±0.05 | 9 | 8.70 | 0.762 | 0.757 | -0.006 | 0.05±0.02 |
| | CS42 | 16 | 13.07 | 0.667 | 0.888 | 0.252** | 0.13±0.07 | 17 | 15.51 | 0.48 | 0.934 | 0.491*** | 0.23±0.05 | 14 | 13.32 | 1.000 | 0.873 | -0.149* | 0±0.00 |
| | CS49 | 22 | 17.25 | 0.900 | 0.923 | 0.025 | 0.006±0.02 | 16 | 14.04 | 0.68 | 0.878 | 0.229** | 0.08±0.06 | 10 | 9.80 | 0.857 | 0.884 | 0.031 | 0±0.00 |
| | CS52 | 9 | 7.67 | 0.677 | 0.784 | 0.138 | 0.07±0.05 | 3 | 3.00 | 0.56 | 0.638 | 0.124 | 0.04±0.07 | 6 | 5.90 | 0.619 | 0.681 | 0.092 | 0.006±0.04 |
| Multilocus mean | | 11.38 | 9.74 | 0.684 | 0.805 | 0.153*** | 0.07±0.06 | 9.85 | 8.97 | 0.56 | 0.763 | 0.270*** | 0.11±0.09 | 8.38 | 8.06 | 0.681 | 0.500 | 0.093*** | 0.05±0.04 |

Table 2 cont'd

| Multiplex | Locus | Côte d’Ivoire (Bouaké, N=19) | | | | | |
|---|---|---|---|---|---|---|---|
| | | A | AR | HO | HE | FIS a | r |
| Multiplex-1 | CS02 | 15 | 15 | 0.842 | 0.94 | 0.107 | 0.05±0.05 |
| | CS11 | 6 | 6 | 0.263 | 0.376 | 0.305* | 0.08±0.09 |
| | CS36 | 7 | 7 | 0.579 | 0.869 | 0.34*** | 0.14±0.07 |
| | CS37 | 9 | 9 | 0.737 | 0.828 | 0.113 | 0.04±0.06 |
| | CS50 | 3 | 3 | 0.526 | 0.519 | -0.014 | 0±0 |
| Multiplex-2 | CS05 | 4 | 4 | 0.632 | 0.741 | 0.151 | 0.06±0.07 |
| | CS19 | 12 | 12 | 0.842 | 0.91 | 0.075 | 0.01±0.03 |
| | CS20 | 10 | 10 | 0.526 | 0.89 | 0.415*** | 0.18±0.06 |
| | CS34 | 7 | 7 | 0.684 | 0.673 | -0.017 | 0±0 |
| | CS40 | 7 | 7 | 0.632 | 0.68 | 0.073 | 0.05±0.06 |
| | CS42 | 12 | 12 | 0.789 | 0.898 | 0.123 | 0.05±0.06 |
| | CS49 | 9 | 9 | 0.632 | 0.838 | 0.251* | 0.12±0.06 |
| | CS52 | 9 | 9 | 0.737 | 0.624 | -0.186 | 0±0 |
| Multilocus mean | | 8.46 | 8.46 | 0.648 | 0.753 | 0.143*** | 0.06±0.06 |

Note: N, number of individuals sampled; A, number of alleles; AR, allelic richness (K = 38 gene copies); HO, observed heterozygosity; HE, expected heterozygosity; FIS, inbreeding coefficient or fixation index; r, frequency of null alleles. a Significance of deviation from Hardy–Weinberg equilibrium (HWE): *P < 0.05, **P < 0.01, ***P < 0.001. Total: total observed among all four populations.
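Allelic richness standardized to a fixed number of gene copies is usually obtained by rarefaction. The sketch below implements the standard rarefaction formula (expected number of distinct alleles in a random subsample of k gene copies); that SPAGeDi computes AR in exactly this way is an assumption, and the allele counts are invented.

```python
from math import comb


def allelic_richness(allele_counts, k):
    """Expected number of distinct alleles in a subsample of k gene copies.

    allele_counts: number of copies observed for each allele at one locus.
    """
    n = sum(allele_counts)
    if k > n:
        raise ValueError("subsample size exceeds the number of gene copies")
    # For each allele: probability that at least one copy is drawn.
    return sum(1 - comb(n - c, k) / comb(n, k) for c in allele_counts)


# Invented counts: 3 alleles among 8 gene copies (4 diploid individuals).
print(allelic_richness([4, 2, 2], 8))  # subsample = full sample
print(allelic_richness([3, 1], 2))
```

When k equals the full number of gene copies, the rarefied value equals the observed number of alleles. This is consistent with the Bouaké population (N = 19, i.e. 38 gene copies), where AR equals A at every locus.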