1. Sampling:
A two-day sampling campaign was conducted on June 10th and 11th in Benguerir, Rhamna, Morocco (N32°12’11.67”, W7°56’8.451”). The site features patchy vegetation that is primarily dominated by jujube shrubs, as shown in Figure 1A. Each patch represents a cluster with a diameter approximately ranging from 1 to 5 meters. Nine shrubs of Ziziphus lotus L. infected by Cuscuta epithymum were sampled, as illustrated in Figure 1B. The sample collection involved obtaining young stems (approximately 20 cm long) from various parts of each cluster, including non-infected stems, stems infected by Cuscuta, and Cuscuta stems alone (yellow to orange stems, Figure 1C). Additionally, a control sample was obtained from a non-infected plant situated away from the infection site (at the periphery of the cluster). All samples were placed in 20x30 cm Ziploc plastic bags, which were then placed on an ice pack and transported to the laboratory (African Genome Center, Benguerir, Morocco). In the laboratory, the non-infected Z. lotus stems were processed as follows: From each sample, two to three 5 cm long top stems were cut; disinfected with 70% ethanol, followed by a bleach solution; and rinsed three times using autoclaved water. They were then placed on autoclaved paper towels for drying. For infected samples (Figure 1D), only the fragments of Z. lotus coiled by Cuscuta stems were cut and subjected to the same disinfection, cleaning, and washing procedures as described above. Finally, Cuscuta stems were also disinfected, cleaned, and washed. The dried samples were stored in 15 ml containers at -20°C until further use.
2. DNA extraction and polymerase chain reaction (PCR) quantification
Total DNA extraction was conducted on 50 mg of lyophilized stem fragments from each sample type, including non-infected jujuba, infected jujuba, and Cuscuta alone. Samples were ground using a TissueLyser II and 2 mm Tungsten beads (QIAGEN, Global Diagnostic Distribution, Témara, Morocco) in 2 ml tubes for 15 minutes at a frequency of 24 Hz. The DNeasy Plant Pro kit (QIAGEN, Global Diagnostic Distribution, Témara, Morocco) was used to extract total DNA following the manufacturer's instructions. The quality and quantity of the extracted DNA were evaluated via gel electrophoresis and DNA quantification with a BioSpectrophotometer (Eppendorf, Hamburg, Germany).
3. PCR amplification
The V5–V6 hypervariable region of the 16S rRNA gene from each sample was amplified using the primer pair with custom sequence (CS) adapters at the 5’ end: CS1-719F/ ACACTGACGACATGGTTCTACA-AACMGGATTAGATACCCKG and CS2-1115R TACGGTAGCAGAGACTTGGTCT-AGGGTTGCGCTCGTTG[18]. PCR amplification was performed using the Platinum Direct PCR Universal Master Mix (ThermoFisher, Rabat, Morocco) in a final volume of 25 μL. Each PCR contained 1X of the PCR Universal Master Mix, 0.2 µM of each primer, and approximately 10 ng of genomic DNA. The PCRs were run in a thermocycler Mastercycler X50s (Eppendorf, Hamburg, Germany) following this program: initial denaturation at 94°C for 3 minutes, followed by 35 cycles consisting of denaturation at 94°C for 30 seconds, annealing at 55°C for 30 seconds, elongation at 72°C for 1 minute, and a final extension step at 72°C for 5 minutes before being held at 4°C. Each reaction, including negative controls with sterile Milli-Q water and positive controls, was carried out in duplicate.
4. Library Preparation and Sequencing
The bacterial 16S rRNA gene amplicon library preparation was generated using Agencourt AMPure XP beads (Beckman Coulter, USA) to clean the PCR products. Two ethanol washes were performed, followed by air drying. Purified PCR products were then resuspended in 10 mM Tris (pH 8.5). A second PCR was performed to attach Illumina sequencing adapters and index tags. PCRs for indexing contained 5 µL of purified PCR product, 2.5 µL of Fluidigm Access Array Barcode 384, and 1X KAPA HiFi HotStart ReadyMix (Roche Sequencing Solutions). The PCR volume was 50 µL per reaction. The PCR was run under the following conditions: an initial denaturation at 95°C for 3 minutes, followed by 8 cycles of denaturation at 95°C for 30 seconds, annealing at 55°C for 30 seconds, extension at 72°C for 30 seconds, and a final extension at 72°C for 5 minutes.
The indexed amplicons were subsequently purified using Agencourt ampure XP beads and quantified using a Qubit assay and the DNA HS kit (ThermoFisher, Témara Morocco). Library quantification, normalization, and pooling were performed following Illumina’s instructions. The bacterial 16S rRNA gene libraries were sequenced on an Illumina MiSeq sequencing instrument (Illumina, Paris, France) using a MiSeq reagent V3 kit (300 cycles of paired-end sequencing).
5. Bioinformatics analysis
The resulting fastq files were processed using R version 4.3.3 (the R Project for Statistical Computing). The quality profile of the reads was inspected using the DADA2 pipeline implemented in R [19], and the raw sequence reads with poor average quality scores (< 30) were discarded.
The bacterial reads were filtered and trimmed using DADA2 to eliminate primer and adaptor sequences. In addition, error rates for each consensus quality score were evaluated.
The denoised forward and reverse reads longer than 10 bp were merged into a multiple sequence alignment using the DECIPHER package, and amplicon sequence variances (ASVs) were obtained.
Taxonomic annotation was performed using the most updated and extensive SILVA database for bacteria for the resulting amplicon sequence variants (ASVs) [20, 21].
6. Statistical analyses
Bacterial alpha diversity was calculated through the Shannon and Simpson indices at the ASV level using the phyloseq package [22]. Beta diversity was assessed by computing the Bray–Curtis distance across different microbial taxa, and it was tested through PERMANOVA using the adonis command of the vegan package [23]. The command betadispers from the vegan package was also used to test the difference in dispersion between different conditions.
The hub taxa of the communities were determined using the SpiecEasi [24] and igraph package of R [25] by calculating the betweenness of each taxon. The betweenness represents the number of times an ASV is present on an edge connecting other ASVs. Thus, it indicates a probability that the organism corresponding to this ASV mediates interactions in the community.
Random forest models were created using the randomForest package of R [26] and 100 repeated trees. The best models created with this package were then evaluated in terms of the strength of their predictive power with respect to discriminating between conditions based on communities; the best predictor taxa were listed. Significant associations of taxa with respect to the conditions were calculated using the indicspecies package [27].