Bacterial composition of Anopheles mosquitoes from West Africa
We extracted DNA from 665 individual Anopheles mosquitoes collected in Guinea (N = 584) and Mali (N = 81). To characterize the microflora of each mosquito, we amplified and sequenced the V2 variable region of the bacterial 16S ribosomal RNA genes (see Methods for details). We obtained a total of 8,467,703 sequences derived from 760 samples (665 mosquitoes and 95 extraction controls). On average, each mosquito samples yielded 11,730 sequences compared to 5,984 sequences on average per extraction control. We assigned these sequences to 21,527 amplicon sequence variants (ASVs, analogue of operational taxonomic units [36]), representing 37 phyla including Proteobacteria (6,692 ASVs accounting for 64% of all reads), Firmicutes (26%), Actinobacteria (6%) and Bacteroidetes (2%) (Fig. 1).
To investigate differences in bacterial composition among mosquitoes, we calculated β-diversity estimates using weighted UniFrac and Bray-Curtis dissimilarity matrices. Principal coordinate analyses (PCoA) conducted using Bray-Curtis dissimilarity or weighted UniFrac distances showed that the microbial composition separates mosquitoes into distinct clusters (Fig. 2A and Additional file 2: Figure S1A, respectively). These clusters appeared to group mosquitoes collected in the same sites (Fig. 2), and this observation held true when we restricted our analyses to mosquitoes only collected from sites in Guinea (Fig. 2B and Additional file 2: Figure S1B). The details of all the ASVs identified and their taxonomy are provided in Additional file 3: Table S2.
Assessment of mosquito species and kdr mutation
We simultaneously genotyped the same mosquitoes at loci informative of their species and insecticide-resistance status by high-throughput sequencing (see Methods).
Out of 665 mosquitoes, 551 (82.9%) were successfully genotyped for the S200 × 6.1[34] and cox1[39] loci. We primarily used the S200 × 6.1 locus to identify the species of each mosquito as i) this locus was more robustly amplified and sequenced than the cox1 locus (with an average read count of 2,917 and 1,181 per mosquito, respectively) and ii) provided clearer taxonomic resolution (with, for example, 233–234 (mean of 233.67) nucleotides differentiating the sequences from Anopheles gambiae s.s. from those of Anopheles coluzzii, compared to 0–4 (mean of 1.71) nucleotide differences using the cox1 locus) (Additional file 4: Table S3). However, the S200 × 6.1. locus systematically failed to yield sequences for some mosquitoes that were identified at Anopheles nili using the cox1 sequences. Overall, we identified that the mosquitoes analyzed belonged to five Anopheles species. 404 mosquitoes (74.5%) were identified as Anopheles gambiae s.s., while the remaining mosquitoes consisted of Anopheles coluzzii (61 mosquitoes or 11.3%), Anopheles melas (with 57 mosquitoes or 10.5%), Anopheles arabiensis (8 mosquitoes or 1.5%), and Anopheles nili (7 mosquitoes or 1.3%) (Fig. 3). We also identified 5 mosquitoes that were heterozygous for the S200 × 6.1 locus and likely represented F1 hybrids of An. gambiae s.s. and An. coluzzii species. The species distribution varied extensively between locations, with An. gambiae s.s. accounting for more than 90.00% of all mosquitoes collected in five out of six locations in Guinea, while An. melas was the most abundant species (79.2%, 57/72) in Boffa, a coastal region in western Guinea, and An. coluzzii predominated in Bandiagara, Mali (86.3%, 44/51) (Fig. 3).
Pyrethroid resistance is often due to a point mutation in the voltage gated sodium channel gene, described as knockdown resistance (kdr)[33]. 550 (82.7%) of the mosquitoes were successfully genotyped at this locus (with an average coverage of 2,436 reads per mosquito). In Guinea, with the exception of mosquitoes collected in Boffa, most mosquitoes (> 92.6%) were homozygous for the kdr-w (L1014F) alleles that is associated with resistance to pyrethroids [33] (Fig. 4). In Boffa, where An. melas is the predominant species, most mosquitoes were homozygous for the wild-type allele (L1014L). In Mali, the distribution was more heterogeneous, with roughly equal proportions of mosquitoes homozygous for the wild-type, resistant allele or heterozygous. Across mosquitoes, the genotype at the kdr-w locus correlated almost perfectly with the mosquito species, with An. gambiae carrying primarily L1014F alleles while An. arabiensis and An. melas were essentially wild-type. Only An. coluzzii showed high proportion of both alleles (Additional file 5: Figure S2). The details of all the genotypes and sequences amplified from each mosquito are provided in Additional file 6: Table S4.
Determination of the blood meal composition
To characterize the composition of the last blood meal of each these mosquitoes, we used the same DNA extract to amplify and sequence a short fragment of the mammalian mitochondrial 16S rRNA gene. 133 mosquitoes yielded > 1,000 reads and were considered blood fed in later analyses. 126 mosquitoes carried human DNA, 14 mosquitoes cow DNA, and 2 sheep DNA (Fig. 5 and Additional file 7: Table S5). Nine mosquitoes fed on more than one mammalian host species (Fig. 5). The blood meal composition differed between sites with, for example, 12 mosquitoes (20.1%) from Kankan that fed, at least partially, on cow while mosquitoes from all other sites, in Mali and Guinea, fed almost exclusively on human.
Identification of eukaryotic parasites and viruses from individual mosquitoes
Finally, we determined whether each mosquito carried a eukaryotic parasite and/or arbovirus using a sequencing-based assay recently developed in our laboratory [35]. After PCR amplification, and sequencing of DNA extracted from individual mosquitoes, we identified DNA sequences from eukaryotes and viruses from 127 mosquitoes, with on average, 1,876 reads supporting each identification in each mosquito. Nine out of 95 extraction controls (water samples that have been processed simultaneously) were also positive for one or more parasites, but with an average of 54 reads per parasite (Additional file 8: Table S6). The low read counts in those water controls could be explained by low-level cross-contamination during extraction, or sequence mis-assignment due to sequencing errors in the Illumina index sequences [35].
Table 2
Eukaryotic parasites and viruses identified from screening mosquito samples
Taxon targeted | Species identified (# positive) | % Identity |
Apicomplexa A | Theileria sp. (24) | 100 |
Apicomplexa B | Plasmodium falciparum (8) | 100 |
Apicomplexa C | Theileria sp. (3) | 100 |
Microsporidia | Parathelohania anopheles (38) | 92.47 |
Hazardia milleri (1) | 97.38 |
Culicospora magna (6) | 99.7 |
Microsporidium sp. 3 NR-2013 (34) | 97.01 |
Nematoda A | Acanthocheilonema viteae (12) | 100 |
Loa loa/Dipetolenma sp. (7) | 99.64 |
Setaria labiatopapillosa (11) | 100 |
Auanema rhodensis (1) | 98.21 |
Nematoda B | Setaria yehi/ Setaria digitate (4) | 99.72 |
Acanthocheilonema viteae (1) | 99.72 |
Dipetolenma sp./Filariodea sp. (3) | 98.94 |
Flavivirus | Anopheles flavivirus variant 2 (2) | 99.06 |
Anopheles flavivirus variant 2/ variant 1 (1) | 88.26 |
Culex flavivirus (1) | 99.06 |
Table shows the parasite and viral taxon targeted by each primer, the species identified and the number of mosquitoes positive for that species, and the percent match of the sequences amplified to that of the NCBI database.
Eight mosquitoes (1.2%), from 3 sites (6 of them identified as An. gambiae), yielded DNA sequences identical to Plasmodium falciparum, the primary cause of human malaria in Africa. We detect DNA belonging to Theileria species in a relatively high number of mosquitoes (27/665 or 4.1%). Theileria species are protozoan parasites that can be infective to domestic (i.e. cattle) and wild (buffalo) animals and transmitted by ticks [40]. Interestingly, from the fourteen samples for which Bos indicus (cow) was identified with greater than 500 reads, we detect Theileria species in twelve (86%), suggesting the tick-transmitted parasite may have been ingested by these mosquitoes during their last blood-meal. Seven mosquitoes yielded a DNA sequence identical to Loa loa and to several other filarial worms (and were further characterized as deriving from Mansonella perstans after sequencing of longer amplification products, M. Cannon. personal communication). Several DNA sequences were closely related to known parasites of mosquitoes themselves, belonging to microsporidia [41] (e.g., Parathelohania sp.), mosquito-transmitted nematodes (e.g., Setaria sp. [42, 43]), as well as two recently discovered Anopheles flaviviruses (e.g., Anopheles flavivirus variant 1 and variant 2)[44] (Table 2). The details of all parasite sequences amplified and their taxonomic information for each mosquito are provided in Additional file 8: Table S6.
Evaluation of the factors influencing microbial composition of wild mosquitoes
The characterization of the mosquito species, insecticide-resistance genotype, blood meal status and infection from the same mosquitoes that have been examined for their microbial diversity enables a rigorous assessment of the relative contribution of these factors to the microbial composition. Note that to avoid possible biases introduced by sample storage or DNA extraction, we restricted this analysis to mosquitoes collected in Guinea that were all processed identically and simultaneously. We implemented
an analysis of variance [45] to simultaneously test the contribution of each factor, while accounting for the others (multivariate analysis). The geographical location of the samples explained most of the variation in microbial composition (Adonis test, R2 = 0.200, ρ = 0.001), whereas the mosquito species (Adonis test, R2 = 0.004, ρ = 0.208), kdr-w genotype (Adonis test, R2 = 0.006 ρ = 0.438), and feeding status (Adonis test, R2 = 0.003, ρ = 0.173) were not statistically associated with the bacterial composition. The mosquito infection status was significantly associated with the microbial composition but had a very marginal effect (Adonis test, R2 = 0.015, ρ = 0.001) (Table 3). To further investigate the relative roles of geography and species that are confounded in this dataset, we examined PCoAs of the bacteria composition, restricting the analysis to i) all An. gambiae collected in seven sites (Additional file 9: Figure S3A) and ii) mosquitoes from all Anopheles species collected in Boffa (Additional file 9: Figure S3B). Together, these analyses validated the results of the statistical testing and confirmed that geographical location of the mosquitoes had a much greater influence on the bacterial composition than the species of the mosquitoes.
Table 3
Relative contribution of mosquito factors on microbial variation
Factor | Df | R2 | P-Value |
Location | 7 | 0.2 | 0.001 |
Mosquito species | 4 | 0.004 | 0.208 |
kdr-w genotype | 2 | 0.006 | 0.438 |
Blood-meal | 1 | 0.003 | 0.173 |
Infection status | 1 | 0.015 | 0.001 |
Residuals | 440 | 0.772 | |
Table shows, for each factor, the Df, R2 (percent of variation explained), and P-value (significance value) calculated by Adonis. Abbreviations: Df, Degrees of freedom