COI and COII haplotypes in the BMSB populations
A total of 441 COI sequences (657 bp each) were obtained from 463 BMSB individuals collected from the 12 countries (Additional file 1). We identified 51 COI haplotypes, consisting of 36 newly identified and 15 previously reported haplotypes (Additional file 2). Sequences sharing 100% identity over the same 657 bp region with previously reported sequences were given the existing haplotype names, while the remaining sequences obtained in this study were given new names. All the new haplotypes were confirmed by BLAST, which showed that each was unique over the 657 bp region. Further analysis showed that N22 was identical to two shorter reference sequences, KY710432 (651 bp) and KY710450 (648 bp), over the regions they share. However, because it is not clear whether the bases missing from the two reference sequences match the sequence we obtained, N22 was considered a new haplotype. This case also indicates that it is not accurate to assign the same haplotype name to sequences of different lengths.
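The naming rule described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline (which relied on BLAST); the function name `classify_sequence`, the status labels, and the toy sequences are hypothetical, and the sketch assumes ungapped sequences in the same orientation, so that a shorter reference matching over its whole length appears as a substring of the query.

```python
def classify_sequence(query, references, region_length=657):
    """Classify a query against named reference sequences.

    Returns ("known", name) when a full-length reference is identical,
    ("new_partial_match", [names]) when only shorter references are
    contained in the query, and ("new", []) otherwise.
    """
    if len(query) != region_length:
        raise ValueError("query must span the full comparison region")
    partial = []
    for name, ref in references.items():
        # Only a full-length, identical reference justifies reusing its name.
        if len(ref) == region_length and ref == query:
            return "known", name
        # A shorter reference may match over the overlap, but its missing
        # bases are unknown, so the query is still treated as new.
        if 0 < len(ref) < region_length and ref in query:
            partial.append(name)
    if partial:
        return "new_partial_match", sorted(partial)
    return "new", []


# Toy example with a 12 bp region instead of 657 bp (hypothetical sequences).
toy_query = "ACGTACGTACGT"
toy_refs = {
    "full_ref": "ACGTACGTACGA",   # full length, one mismatch
    "short_ref_a": "ACGTACGTAC",  # shorter, identical over the overlap
    "short_ref_b": "CGTACGTA",    # shorter, identical over the overlap
}
result = classify_sequence(toy_query, toy_refs, region_length=12)
```

In this toy case the query is reported as new with two partial matches, which mirrors how N22 was handled relative to the two shorter reference sequences.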
Further comparison with the COI sequences deposited in the Barcode of Life Data System (BOLD) showed that all the sequences obtained in this study belong to BIN AAM9563, with over 98% identity. In contrast, the sequences shared over 94% and 82% identity with the sequences of two other BINs, ADT6053 and AAK5312, respectively.
A total of 450 COII sequences (518 bp) were obtained from the 463 BMSB individuals (Additional file 1), and 29 haplotypes were identified, including 20 novel and 9 previously reported ones (Additional file 2). A BLAST search showed that none of the new haplotypes was identical to any previously reported sequence over the 518 bp overlap region.
The geographical distribution of the identified COI and COII haplotypes is shown in Figure 1 and Figure 2. Of the identified haplotypes, H1 (61.9% of the total individuals) and h1 (61.7% of the total individuals) were predominant for COI and COII, respectively, and were detected in all the countries studied except Japan (Table 1 and Additional file 2). Haplotypes H3 (7% of the total individuals) and h3 (16% of the total individuals) were the second most predominant haplotypes and were detected in China, Austria, Chile, Hungary, Italy, Serbia and Slovenia. In addition, the COI haplotypes H8 and H48 were detected only in Austria. The newly identified COI haplotypes were mainly observed in the native countries (China and Japan), with the exception of N47 (Slovenia) (Additional file 2). All the novel COII haplotypes were detected in the two native countries, China and Japan (Additional file 2).
Overall, high haplotype diversity was observed in China. The main haplotypes from China were H1, H33, H22 and H3 for COI and h3 and h1 for COII (Table 1). The predominant haplotypes from Japan were H45, N22, H23 and N40 for COI and h11 for COII (Table 1). Outside of the native areas, low haplotype diversity was observed, and the main haplotypes detected were H1 and H3 for COI and h1 and h3 for COII. Only one haplotype of each gene (H1 and h1) was detected in Georgia, Romania, Turkey and the USA (Table 1).
COI-COII combined haplotypes of the BMSB populations
In total, 428 individuals yielded both COI and COII sequences (Additional file 1) and were therefore used for the COI-COII combined haplotype analysis. The analysis produced 59 combined haplotypes, of which only 5 were previously reported and 54 were novel (Additional file 2). All the newly identified haplotypes were detected in China and Japan, except for a single haplotype in Slovenia (N47h3). The predominant haplotype H1h1 (62.6%) was observed in all the countries except Japan (Additional file 2). The geographical distribution of the identified COI-COII combined haplotypes is shown in Figure 3. In the native countries of BMSB, high haplotype diversity was observed, with 24 haplotypes in China and 32 in Japan and no haplotypes shared between the two countries (Additional file 2 and Figure 3). Of the 32 haplotypes identified in Japan, 31 were detected only in Japan, and one haplotype, H41h15, was shared with an individual from Hungary (Additional file 2). Similarly, 22 of the 24 haplotypes detected in China were unique to China, and two haplotypes (H1h1 and H3h3) were also shared, predominantly with the BMSB samples from the BMSB-invaded countries (Additional file 2). In the invaded countries, H1h1 was the predominant haplotype, identified in more than 90% of the studied samples from most of the BMSB-invaded countries, including Chile, Georgia, Hungary, Italy, Romania, Turkey and the USA (Additional file 2).
Population genetic analysis based on the combined haplotypes of COI and COII
Japan and China had the highest haplotype diversity (Hd), with Hd values of 0.942 and 0.858 and nucleotide diversity (π) values of 0.00238 and 0.00327, respectively (Table 2). Outside of the native regions of BMSB, the highest haplotype diversity was observed in Austria (Hd = 0.686, π = 0.00206), Serbia (Hd = 0.556, π = 0.00095) and Slovenia (Hd = 0.514, π = 0.00115). In contrast, little to no haplotype diversity was observed in the BMSB samples collected from Chile, Georgia, Hungary, Italy, Romania, Turkey and the USA. For further analysis, the invaded countries were therefore divided into two genetic groups based on the Hd values obtained from the haplotype analysis: group A (Chile, Georgia, Hungary, Italy, Romania, Turkey and the USA) and group B (Austria, Serbia and Slovenia). Notably, of the five sampling sites studied in Hungary, two showed no haplotype diversity, while the other three showed variable diversity, with Hd values from 0.038 to 0.5 and π values from 0.00085 to 0.0017036; the overall values for Hungary were Hd = 0.107 and π = 0.00028. This indicates that the invasion of BMSB in Hungary may have come from genetically distinct populations.
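For readers unfamiliar with the two diversity indices, the following Python sketch shows how Hd and π are commonly defined (Nei's estimators): Hd = n/(n-1) × (1 - Σ p_i²), where p_i is the frequency of haplotype i among n individuals, and π is the mean number of pairwise differences per site. This section does not state which software the values in Table 2 were computed with, so the function names and the toy data below are illustrative assumptions, not the authors' procedure.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(haplotypes):
    """Nei's haplotype diversity for a list of haplotype labels."""
    n = len(haplotypes)
    if n < 2:
        return 0.0
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


def nucleotide_diversity(sequences):
    """Mean pairwise differences per site for equal-length sequences."""
    n = len(sequences)
    if n < 2:
        return 0.0
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2))
        for s1, s2 in combinations(sequences, 2)
    )
    n_pairs = n * (n - 1) / 2
    return total_diffs / n_pairs / length


# Hypothetical sample: three individuals with H1h1 and one with H3h3.
hd = haplotype_diversity(["H1h1", "H1h1", "H1h1", "H3h3"])
# Hypothetical 4 bp alignment with one variable site.
pi = nucleotide_diversity(["AAAA", "AAAT", "AAAA"])
```

A sample fixed for a single haplotype gives Hd = 0, as reported for the countries where only H1h1 was found, whereas a sample in which nearly every individual carries a different haplotype approaches Hd = 1.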
In the neutrality test, Fu's Fs values were strongly negative in the two native countries of BMSB, China and Japan, at -7.852 (p < 0.02) and -29.707 (p < 0.02), respectively (Table 4). For the BMSB-invaded countries, Fu's Fs was -1.174 (p < 0.02) for group A (Chile, Georgia, Hungary, Italy, Romania, Turkey and the USA), suggesting that group A was under population expansion. In comparison, group B (Austria, Serbia and Slovenia) had slightly higher haplotype diversity, with an average of 0.63, but a Fu's Fs value of 1.453 (p > 10).
The principal coordinates analysis (PCoA) using the FST values showed that there were at least three population clusters, namely China, Japan and group A (Chile, Georgia, Hungary, Italy, Romania, Turkey and the USA) (Figure 4, Additional file 3). The recently invaded population in Slovenia showed genetic similarities to the populations from Hebei and Beijing in China. The BMSB populations from Austria and Serbia were also closely related to the Chinese populations from Shanxi and Anhui. The population from the Chinese province of Hainan also showed a close relationship with a population from the Japanese prefecture of Akita.
The AMOVA (analysis of molecular variance) showed that variation among the 12 populations contributed 71.26% of the total variation, while variation within populations contributed 28.74%. The overall FST was 0.71 (p < 0.05), indicating that the genetic differentiation among populations was extremely high.
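The reported FST is consistent with the variance components. As a worked check, assuming a single-level AMOVA with only an among-population component $\sigma^2_a$ and a within-population component $\sigma^2_w$ (an assumption based on the two percentages summing to 100%), the fixation index is the among-population share of the total variance:

```latex
F_{ST} = \frac{\sigma^2_a}{\sigma^2_a + \sigma^2_w}
       = \frac{71.26}{71.26 + 28.74}
       = 0.7126 \approx 0.71
```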
The haplotype network of the BMSB individuals further revealed the widespread occurrence of H1h1 and H3h3 outside Japan, whereas all the other haplotypes were mainly detected in the native countries (Figure 5). The analyses showed three ancestral lineages in this study, namely h1, h3 and h11. Most of the other haplotypes were derived from these three lineages, differing from them by several base pairs. Moreover, some haplotypes (N3n3, N5n3, N4n4, N5n5) detected only in the Hainan population (China) were highly isolated and closer to the Japanese populations than to the Chinese populations. To further explore the distribution of the combined haplotypes, the combined COI and COII datasets from the present and previous studies [19, 28] were analysed together, resulting in a total of 81 haplotypes. The haplotype network analysis (Figure 6) indicated genetic relationships similar to those previously reported, except that a few BMSB specimens from Italy were closely related to Japanese populations.