Source soils
Three soils of different origin or distinct Cd/Zn pollution levels were used: highly polluted (HP) soil, slightly polluted (SP) soil, and non-polluted (NP) paddy soils. The HP and SP soils, collected from the rice paddy fields near Hangzhou, Zhejiang Province, China (30°04′38.3″N; 119°24′39.7″E), have the same origin. This site has been primarily contaminated with Cd and Zn derived from local metal smelters that have been active for more than thirty years. The long history of smelter activity resulted in depositional gradients of HMs across the site, and the HP and SP soils that were collected depended on distance apart from the region of various smelters. The non-polluted paddy (NP) soil was collected from a paddy field in Jian City, Jiangxi Province, China (27°21′24″N; 114°22′46″E). The HP, SP and NP soils were subjected to similar water and fertilization regimes. Total Cd concentration in SP soil was below the detection limit and in HP and SP soils was 12.54 and 0.98 mg kg−1, which was higher than the environmental quality standard (0.4 mg kg−1; GB15618-2018). Physicochemical properties of the three soils were determined by standard procedures and are shown in Supplementary Table S5.
Bacterial strains used in hydroponic experiment
Five single bacterial strains, i.e. Sphingomonas sp. SaMR12, Burkholderia sp. SaZR4, Burkholderia sp. SaMR10, Burkholderia sp. SaZRH13, Burkholderia sp. SaCRH14, used in this study were selected from our Culture Collection of Sedum alfredii-associated bacterial isolates [54]. Sphingomonas spp. SaHR01 was isolated from the highly-polluted (HP) rhizosphere soil in the present study, according to our previously described protocols [55]. These six bacterial isolates share a similar phylogenetic relatedness with the core rhizosphere taxa common to the five Cd-accumulating plants (i.e. Sphingomonas, Burkholderia), which exhibit high resistance to Cd and Zn stress, and carry at least one potential plant-growth promoting trait, such as ACC deaminase activity, IAA production, siderophore production, and phosphate solubilization [54]. Thus, these bacterial isolates might be the ideal object to test the role of core microbiota in host fitness and metal accumulation.
Experimental design, plant growth and sample collection
We have characterized the overall and core root bacterial microbiomes of the five Cd-accumulating plants, Indian mustard (Brassica juncea L.) [56], Sedum alfredii [55], Solanum nigrum L. [57], and the Ganges and Prayon ecotype of Noccaea caerulescens [58], grown in three different soils using two sets of experiments; these plants exhibit distinct ability to accumulate Cd in their root and shoot tissues (Fig. S6). First, for the “overall bacterial community” experiment, seeds of the Cd-accumulating plants were surface sterilized, germinated and pre-conditioned in sterile potting media for three weeks. After preconditioning four uniform seedlings of each Cd-accumulating plant were transplanted to the rhizobox containing 5 kg of air-dried soil, which is designed to obtain separate and intact rhizosphere soil as previously described [59]. Control rhizoboxes without transplantation were maintained in parallel (i.e. unplanted soil). For each plant grown in a given soil, five biological replicates were established. Rhizoboxes were randomly distributed in trays and placed in a greenhouse under natural light, with an average day/night temperature of 26/20℃ and relative humidity of 70/85%. Plants and soils were watered periodically every two days to maintain the soil moisture at 55-65% of water-holding capacity. After four months of growth and soil microcosm incubation, plant shoot samples, root samples, rhizosphere and unplanted soils were collected from five replicated blocks as previously described [59]. Rhizosphere soil was collected as the wash containing the root-adhering soil layer after removing the excess soil attaching the roots according to our previous study [60].
Second, for the “active rhizosphere bacterial community” experiment, healthy S. alfredii seedlings with the uniform size were transferred to the rhizobox containing 3 kg of HP and SP soils, respectively. We chose S. alfredii because it has the strongest ability to extract Cd from the soils among the five Cd-accumulating plants and thus can represent the assembly pattern of the active rhizobiomes in other Cd-accumulators. The continuous 13C labelling initiated 45 days later when S. alfredii were in an active photosynthesis status, and the rhizosphere microbiota reaches a stable state. S. alfredii were labelled with 13CO2 (98 atom % 13C, Cambridge Isotope Laboratories, Inc., Andover, USA) between 9 am and 5 pm (8 h) for 10 consecutive days [61]. During the labelling period, the total concentration of CO2 inside the growth chamber was maintained at 350-400 µl l−1 by mixing with additional 13CO2. Simultaneously, parallel microcosms as controls were also established with 12CO2 labelling. At the end of labelling, the rhizosphere and unplanted soils were sampled as previously described.
Lastly, we conducted a hydroponic cultivation to examine the effect of core synthetic community (SynCom) on plant growth and Cd accumulation in Indian mustard, Sedum alfredii and Solanum nigrum. Four uniform and healthy Cd-accumulating plant seedlings were transferred into axenic nutrient solution either with or without 25 µM Cd(NO3)2. The nutrient solutions were inoculated with the SynCom consisting of six potential core rhizospheric bacterial isolates (based on the results of the first experiment) affiliated with Burkholderia sp. and Sphingomonas sp.. Treatments included Control (axenic nutrient solution), 25 µM Cd2+ treatment (Cd) (equivalent to the minimum value of available Cd concentration in highly polluted field soil), and Cd treatment plus the SynCom (Cd + SynCom). Preparation and inoculation of SynCom in hydroponic experiment was similar to our previously described methods [55]. SynComs were inoculated to a final OD600 of 0.01–0.05 (approximately 107 CFU ml−1 water solution). Each treatment had five independent replicates, and plants were harvested after 45 days of cultivation under a 16 h light cycle, 26/22℃ average day/night temperature and relative humidity of 70%. At harvest, the root and shoot tissues were sampled and were rinsed with tap water and thoroughly washed with deionized water. Afterwards, the roots were immersed in 20 mM Na2-EDTA for 15 min to remove the metal ions attached to the root surface. The root and shoot tissues were oven-dried at 105℃ for 1 h, then at 70℃ to a constant weight and their dry weights were recorded.
Determination of Cd in root and shoot tissues, H2O2 and superoxide anion
Dry root and shoot samples (~0.1 g) were digested with HNO3–HClO4 (5:1, v/v) in a microwave dissolver, and the digestion solution was filtered (0.45-µm pore size) and Cd concentration were determined using Agilent 5100 ICP-OES (Agilent Technologies, CA, USA). In hydroponic experiment, the concentration of H2O2 in roots was determined by H2O2–titanium (Ti) complex reaction [62] and concentration of O2•− was measured by methods with 2−naphthylamine−4−aminobenzenesulfonic acid [63].
DNA extraction and 16S rRNA gene amplicon sequencing
Microbial genomic DNA was extracted from the rhizosphere and unplanted soil samples using the PowerSoil® DNA Isolation Kit (MoBio Laboratories, Carlsbad, CA, USA) according to the manufacturer's instructions. Extracted DNA was subjected to quality-check using NanoDrop 2000 spectrophotometer (Thermo Scientific, Waltham, USA) and gel electrophoresis (1.5% agarose). Amplicon libraries were prepared from two independent PCR reactions. During the first PCR run, the bacterial 16S rRNA gene of V3-V4 region (~480 bp) was amplified using the Illumina iTags primer pairs of 338F (5’-ACTCCTACGGGAGGCAGCA-3’) /806R (5’-GGACTACHVGGGTWTCTAAT-3’) [64]. The PCR conditions were as follow: 95°C for 5 min, followed by 30 cycles at 95°C for 30 s, 53°C for 30 s, and 72°C for 30 s and a final extension at 72°C for 5 min. PCR reactions were carried out in triplicate. In the second PCR run, each sample was tagged with a unique dual-index barcode. Amplicons were purified (Beckman, USA), pooled in equimolar ratios and were combined into one pooled sample and subjected to Illumina Hiseq sequencing (San Diego, CA, USA).
RNA-SIP gradient fractionation
Soil total RNA was extracted from ~2 g of soil sample using the PowerSoil® Total RNA Isolation Kit (MoBio Laboratories, Carlsbad, USA). The extracted rRNA (~500 ng) was mixed well with cesium trifluoroacetate (CsTFA) to achieve an initial buoyant density (BD) of 1.790 g ml− 1 before ultracentrifugation at 130, 000×g for 65 h at 20℃. Fifteen RNA fractions (~380 µl for each) were fractionated according to a previous report [65], and the CsTFA BD of each fraction was measured by determining the refractive index. Then, RNA in each fraction was precipitated with isopropanol as described previously [65]. Precipitation from gradient fractions were washed once with 70% ethanol and re-eluted in 30 µl Tris-NaCl for subsequent determination of total RNA. The rRNA from each fraction of each sample was quantified in triplicate by real-time reverse transcription PCR in an Applied Biosystems7900HT real-time system (Carlsbad, CA USA) with primer Ba519f/Bac907r [65]. Selected density fractions of rRNA were reversely transcribed into complementary DNA using the PrimeScriptTM RT reagent kit (Takara, Dalian, China), and subjected to 16S rRNA gene sequencing similar to the aforementioned protocols.
16S rRNA sequence processing
A total of 4216175 high-quality sequence was obtained with a median read count per sample of 46846 (range: 24369-60990). Raw sequence data was processed using QIIME2 (v.2020.6) pipeline. The DADA2 method [66] was used to trim the primer pairs, perform quality control (QC) of sequences, and assemble quality-filtered reads into error-corrected amplicon sequence variants (ASVs), which represent unique bacterial taxa and reveal cryptic diversity. A minimum of 25 bp of overlap was required for the paired-end reads merging step. The script ‘data2 denoise-paired’ in QIIME2 was used to perform DADA2, which is a pipeline for detecting and correcting (where possible) Illumina amplicon sequence data. The QC process filtered any phiX reads and chimeric sequences. Taxonomic classification of the assembled ASVs was conducted by training the naïve Bayes classifier against the Silva 138 99% OTUs (Operational Taxonomic Units) full-length reference sequences. Next, using the R package ’phyloseq’ [67], we removed any ASVs without a bacterial phylum assignment, or assigned to archaea, or mitochondria. To facilitate downstream composition and differential abundance analyses, we applied a prevalence and abundance threshold for filtered ASVs, where taxa were retained only if they were found in 2% of samples (three samples) and at a frequency of 20 sequence reads per sample. The filtered ASVs comprised 85% of the total number of sequences. Subsequently, we normalized the counts of individual ASV in a sample by dividing the total counts of the filtered ASVs within that sample resulting in relative abundance. Without application of this threshold, we rarefied community profiles to 10000 sequences to eliminate the bias from sequencing depth. Alpha diversity index of inverse Simpson’s diversity was calculated using the rarefied ASV table in ‘phyloseq’ R package, and phylogenetic diversity (Faith’s pd) was calculated using ‘picante’ R package.
Defining the core rhizosphere microbiota
To detect the ecologically robust core microbial members, two methods were applied to identify the core rhizosphere microbiota for each Cd-accumulating plant [17]. We first performed differential abundance analyses with the DESeq2 package to find the ASVs that significantly enriched in the rhizosphere compared with the unplanted soil [68]. We then defined the core root microbiota for each Cd-accumulating plant as the rhizosphere-enriched bacterial ASVs found in more than 75% [11, 19] of a particular sample type (rhizosphere) across the three soils using the rarefaction-normalized abundance table. In parallel, the core microbiota members were also identified using the indicator species analysis with the ‘multipat’ function of the R ‘indicspecies’ package, with indval value > 0.6 and P < 0.05 considering as the strong indicator for an ASV. This method is used to find indicator species and species assemblages characterizing groups of niches [17, 69], e.g. distinct plants or rhizocompartments70, and combines a species relative abundance with its relative frequency of occurrence in the various groups of niches [69]. Next, we compared the core taxa identified by the two analyses, and those that were completely overlapped phylogenetically were referred to as the commonly occurring core rhizosphere microbiota and were retained for subsequent analyses.
Statistical analyses
Differences in alpha-diversity indices between rhizosphere and unplanted soil were determined by Wilcoxon rank tests. Testing for statistical differences in plant parameters, metal concentration, alpha-diversity indices and community similarities among plants or soil types was performed using analysis of variance (ANOVA) with the Tukey’s HSD post hoc test (n = 5 for each group). Hierarchical clustering analysis based on Bray-Curtis dissimilarities was used to visualize the differences in active rhizobiomes, and principal coordinate analysis (PCoA) of the weighted UniFrac distance to characterize the rhizobiomes of different Cd-accumulating plants in the package ‘ape’ in R [71]. Permutational multivariate analysis of variance (PERMANOVA) was performed to test the significance of the biotic (plant species and rhizocompartment) and abiotic (soil type and soil pollution) factors in bacterial community divergences using ‘adonis’ test in ‘vegan’ package [72]. Differences in bacterial community composition between 13C-labelled and 12C-labelled rhizosphere samples and between HP and SP soils in RNA-SIP experiment were tested by analysis of similarities (ANOSIM). The effects of plant species, soil type and rhizocompartment on the differential abundance of bacterial taxa were analyzed using DESeq2 [68], which fits negative binomial generalized linear models to count data (ASVs) and estimates their log2-fold change in abundance across one or more experimental factors.
To reduce rare ASVs in the data set, we only retained ASVs with relative abundance more than 0.01% of the total read count and were found in 20% of the compartment samples of each Cd-accumulating plant. Spearman correlation matrixes were calculated with the ‘WGCNA’ package [73], and all P-values were adjusted for multiple testing using Benjamini and Hochberg false discovery rate (FDR) controlling procedure. Only robust (Spearman’s ρ > 0.6 or ρ < -0.6) and statistically significant (p < 0.05) correlations were kept. The co-occurrence network complexity was evaluated by the topological parameters of average degree, betweenness centrality and assortativity according to a previous study [74], with higher average degree, and smaller betweenness and assortativity representing greater network complexity. The nodes with high average degree (> 50), high closeness centrality (> 0.2) and low betweenness centrality (< 5000) were defined as keystone taxa in co-occurrence networks that were visualized in Gephi [16, 75].
The main microbial predictors for root and shoot biomasses, and root and shoot Cd uptake associated with Cd-accumulating plants were identified by a classification Random Forest (RF) regression analysis using the ‘randomForest’ package [76]. In these RF models, ASVs from the filtered dataset were served as predictors for plant biomass and Cd uptake indexes. The importance of each predictor variable is determined by evaluating the decrease in prediction accuracy (that is, increase in the MSE (mean squared error) of variables: high MSE% values indicate more important variables) when the data for that predictor is randomly permuted. The significance of the importance of each predictor on plant biomass and Cd uptake, and the total variance explained by the predictors was assessed using the ‘rfPermute’ package [77]. Significance of the models was assessed with 5000 permutations of the response variable, by using the “A3” package. Afterwards, a linear regression model was performed to assess the RF analysis outcome, and to validate the relationships between the core bacterial taxa, plant biomass and Cd uptake.