Environmental samples
As depicted in Fig. 1A, this study used two representative types of environmental samples: soil (n = 6) and activated sludge (AS) collected from full-scale wastewater treatment plants (n = 12, five used in PCT calculation) and a laboratory enhanced biological phosphorus removal (EBPR) bioreactor (n = 1). Detailed information on these samples is provided in Table S1. Soil samples were collected and transported to the laboratory within 1 h to 3 d and then sieved. AS samples were transported within 0.5 h to 1 d. Because in-situ media had to be prepared from these samples, a portion of each sample was used for media preparation on the day of arrival. Inoculation onto agar plates was conducted approximately one day after the plates were prepared. AS samples were dispersed using glass homogenizers, whereas soil samples were vortexed (at maximum intensity for 5 minutes) to disperse the cells. For the inoculum, single-cell filtrates were obtained by passing the dispersed suspensions through a 5 µm pore-size filter membrane (Isopore, Merck, Germany). Immediately after the inoculum filtrates were collected, fluorescence staining and microscopic cell counting were carried out to determine the cell density of the inoculum [40]. Appropriate dilutions for plating were then made based on the cell counts. Microbial cells from approximately 50–100 mL of filtrate were harvested on a 0.1 µm pore-size membrane (mixed cellulose esters, Merck, Germany) to represent the bacterial community of the inoculum. Both the original samples and the filtrate inoculum from the soil and AS samples were stored at -20 °C for subsequent DNA extraction.
Medium and culturing conditions for culturomics
For synthetic culture media, microbial cells from soil and AS were cultured on 1% strength Tryptic Soy Agar (TSA) and R2A agar media, respectively. In addition, we prepared in-situ media. For soil samples, 500 g of soil was added to 2 L of an inorganic salt solution (K2HPO4 0.02 g L−1, MgSO4·7H2O 0.02 g L−1, KNO3 0.02 g L−1, pH adjusted to 7.0) and stirred with a magnetic stirrer for 2 h. For AS samples, sewage containing AS was allowed to settle for 1 h before the supernatant was collected. For the sample from the EBPR bioreactor, an additional medium with the composition of the artificial influent was prepared [41]. These suspensions were clarified by sequential filtration through glass filter paper (average pore size 2.7 µm, GF/D, Whatman, UK), 0.45 µm pore-size filter membranes, and 0.1 µm pore-size filter membranes (mixed cellulose esters, Merck, Germany). Sterile filtrates were then prepared using syringe filters (0.22 µm pore size, Millex-GV, Merck, Germany). Because a prior study indicated that sterilizing agar mixed with phosphate-containing media produces reactive oxygen species (ROS) that inhibit microbial growth [24], the synthetic and in-situ media were sterilized separately from the agar. Autoclaved 6% agar (in ddH2O) was then mixed in a 4:1 ratio with prewarmed (50 °C) sterilized in-situ or artificial media before pouring. For plating, three or four dilution gradients (diluted with sterilized 0.85% NaCl solution) were established, with ten replicates per gradient. Plates were incubated at 26 °C for 4 weeks.
Throughout the incubation, various oxygenic conditions, including aerobic (AE, ambient atmosphere), anaerobic (AN, 80% nitrogen, 15% carbon dioxide, and 5% hydrogen in an anaerobic glovebox), and microaerobic (MI, similar to anaerobic but with 0.1–0.3% oxygen, monitored and adjusted within a glovebox), were applied. Not all media and oxygenic conditions were tested for all samples. Detailed cultivation conditions for each sample are presented in Table S1.
DNA extraction
DNA was extracted from the original soil and AS samples using a kit according to the manufacturer's instructions (Fast DNA SPIN kit for Soil, MP Biomedicals, US). Inoculum samples collected on filter membranes were aseptically cut into pieces and then extracted using the same kit. For agar plates containing visible bacterial colonies (typically 100 to 800 colonies per plate), sterile cell scrapers were used to collect the grown cells. To each plate, 3 mL of sterile 0.85% NaCl solution was added, and all visible colonies were scraped and transferred into two 1.5 mL centrifuge tubes. After centrifugation at 10,000 × g for 5 minutes, the supernatant was removed, and DNA was extracted from the combined pellet using the same kit. The concentration and purity of the extracted DNA were assessed using a microspectrophotometer. In most cases, six agar plates for each culture condition and ≥ 3 technical (PCR-level) replicates of both the original samples and the filter membrane samples were independently analyzed (detailed in Table S1).
High throughput sequencing and amplicon data analysis
The extracted DNA underwent barcoded PCR amplification targeting the V4 region of the bacterial 16S rRNA gene, following a published protocol [42, 43]. After purification, the mixed PCR products were sent to a commercial service provider (Novogene, China) for library preparation (n = 16) and sequencing using the PE250 strategy (NovaSeq 6000, Illumina, US). For the clean data received from the service provider, denoising and ASV picking (species-level threshold for the V4 region proposed by Edgar [44]) were carried out using the USEARCH [45] and VSEARCH [46] platforms. Taxonomic annotation of ASVs was conducted on the Mothur platform [42] against the EzBioCloud database, which encompasses 16S rRNA gene sequences from type strains of described species and representative sequences of environmental clones [47]. When determining the novelty of ASVs, only sequences from type strains were used as references. Unless otherwise specified, 10,000 reads were subsampled per sample for subsequent analyses, with deeper sequencing depths used for specific analyses. In total, 515 amplicon datasets (394 from culturomic samples) were obtained in this study (Table S1).
Removal of contaminant ASVs in the amplicon datasets
Contaminants occur widely in high-throughput sequencing, and low-biomass samples are more likely to be affected. Contaminant ASVs would bias our PCT calculation. We therefore employed the Decontam strategy to assess and identify contamination in the amplicon sequencing datasets within our experimental framework [48]. To evaluate potential contamination, we used a bacterial strain, NS6-2 (classified as AF468245_g in Luteibaculaceae within the EzBioCloud database), as the internal standard. NS6-2 was originally isolated from Antarctic seawater and was not expected to be present in our samples. Prior to PCR, we introduced the NS6-2 genome into the DNA extraction blank in a three-fold dilution series (3 × 10^3, 9 × 10^3, 2.7 × 10^4, 8.1 × 10^4, 2.43 × 10^5, and 7.29 × 10^5 per reaction) in triplicate. These internal standards were independently sequenced in two separate libraries containing our samples. In the internal standards of each library, non-NS6-2 ASVs were screened based on a Pearson correlation coefficient > 0.8 between their relative abundances and the dilution rates. Because true ASVs from our samples may cross-talk at low frequency into these internal standard samples [49], we did not directly remove all of these ASVs. Instead, an ASV was considered contamination within our experimental system if its relative abundance in the 3 × 10^3 internal standard sample (the average of triplicates) exceeded its maximum relative abundance across all experimental samples from the same library. All contaminant ASVs (listed in Table S2) were removed from the downstream analysis.
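The two-step screen described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the data layout, and the definition of the "dilution rate" as the highest standard input divided by each input are assumptions made for the example.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

# Input quantities of the NS6-2 standard per reaction (from the text),
# ordered from the most dilute standard to the least dilute one.
STANDARD_INPUT = [3e3, 9e3, 2.7e4, 8.1e4, 2.43e5, 7.29e5]
# Assumption: the "dilution rate" is expressed relative to the highest input.
DILUTION = [max(STANDARD_INPUT) / q for q in STANDARD_INPUT]

def flag_contaminants(standard_abund, sample_max, r_cutoff=0.8):
    """Return the ASVs that pass both screening steps.

    standard_abund: ASV -> mean relative abundance (triplicate average) at each
                    standard level, in the same order as STANDARD_INPUT.
    sample_max:     ASV -> maximum relative abundance across the experimental
                    samples of the same library.
    """
    flagged = []
    for asv, abund in standard_abund.items():
        # Step 1: relative abundance must track the dilution of the standard.
        if pearson(abund, DILUTION) <= r_cutoff:
            continue
        # Step 2: abundance in the most dilute standard must exceed anything
        # seen in real samples; otherwise it may be cross-talk from a true ASV.
        if abund[0] > sample_max.get(asv, 0.0):
            flagged.append(asv)
    return sorted(flagged)
```

With this layout, the screen would be run once per sequencing library, since both the internal standards and the sample maxima are library-specific.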
Metagenomes and MAGs annotation
We performed metagenomic sequencing on 6 inoculum samples and 17 culturomics samples derived from AS (detailed in Table S1) using the HiseqX10 sequencing platform (Illumina, US) with a PE150 sequencing strategy. Following stringent quality-control filtering, the metagenomic reads were assembled using MEGAHIT v1.2.9 [50]. Genome binning and primary dereplication were accomplished using metaWRAP v1.2.2 [51]. Genome quality was evaluated using CheckM v1.0.12, and genomes with less than 50% completeness or over 10% contamination were excluded [52]. To reduce redundancy, metagenome-assembled genomes (MAGs) from the same sample source with an average nucleotide identity (ANI) of over 99% were further dereplicated. Taxonomic annotation of the MAGs was conducted using GTDB-Tk v0.3.2 with GTDB-r207 [53]. Open reading frames were predicted using Prodigal v2.6.3 [54], and functional annotation as clusters of orthologous groups (COGs) was carried out using eggNOG-mapper v2 [55].
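As a small illustration of the quality thresholds stated above, the following sketch filters a list of MAG records by completeness and contamination. The record layout and function names are hypothetical; in practice these values would be read from the CheckM output.

```python
def passes_quality(completeness, contamination,
                   min_completeness=50.0, max_contamination=10.0):
    """Keep a genome unless it has < 50% completeness or > 10% contamination."""
    return completeness >= min_completeness and contamination <= max_contamination

def filter_mags(mags):
    """mags: list of dicts with 'id', 'completeness', and 'contamination' (percent)."""
    return [m for m in mags if passes_quality(m["completeness"], m["contamination"])]
```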
Determination of PCC and PCT
The determination of PCC followed established protocols described in prior studies. Briefly, we quantified the cell density of the inoculum before plating. The number of colony-forming units (CFUs) was counted manually after a 4-week incubation period. PCC was calculated by dividing the number of CFUs by the number of inoculated cells spread on the plate. The same plates used for calculating PCC were also used for PCT determination, allowing us to conduct a paired t-test on PCC and PCT values. To calculate PCT, the number of ASVs detected in the culturomic samples that were also detected in the inoculum was divided by the total number of ASVs detected in the inoculum. Once the ASV table was generated, PCT values could be calculated for each plate (plate-level), for all plates of a specific condition (condition-level), and for all culture conditions of a given sample (culturomics-level).
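The two ratios can be written compactly as below. This is a sketch under stated assumptions: ASVs are represented as sets of identifiers, and the condition-level and culturomics-level PCT are assumed to pool plates by taking the union of their ASVs, which the text does not spell out. All names are illustrative.

```python
def pcc(cfu_count, inoculated_cells):
    """PCC: colony-forming units divided by the number of cells spread on the plate."""
    if inoculated_cells <= 0:
        raise ValueError("inoculated_cells must be positive")
    return cfu_count / inoculated_cells

def pct(inoculum_asvs, plate_asv_sets):
    """PCT: fraction of inoculum ASVs that were also detected in culture.

    plate_asv_sets is a list of ASV sets, one per plate. Pass a single plate
    for the plate-level value, all plates of one condition for the
    condition-level value, or all plates of a sample for the
    culturomics-level value (assumption: plates are pooled by set union).
    """
    inoculum = set(inoculum_asvs)
    if not inoculum:
        raise ValueError("the inoculum must contain at least one ASV")
    cultured = set().union(*plate_asv_sets) & inoculum
    return len(cultured) / len(inoculum)
```

ASVs found on plates but absent from the inoculum do not enter the numerator, matching the definition above.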
Isolation of strain EBPR01
In the amplicon data obtained from the culturomics of the EBPR reactor, we identified an ASV with high taxonomic novelty that was present exclusively under one culture condition (in-situ medium and anaerobic conditions) at high relative abundance. We then selected colonies from the remaining agar plates and performed streak purification under the same culture condition. Guided by sequencing of the 16S rRNA gene, we isolated a strain, named EBPR01, with the identical V4 sequence. The morphology of the strain was observed under a transmission electron microscope, and its draft genome was obtained through next-generation sequencing. GTDB-Tk classified it within an uncultured class of Bacteroidota, prompting us to select representatives of all orders in this phylum, along with outgroup sequences, for phylogenetic reconstruction based on the 120 bacterial universal genes [56] using FastTree (v2.1.10) [57]. To profile the ecological distribution of the EBPR01-represented lineage, we performed a blastn search against the 16S database of the Earth Microbiome Project [58]. Only hits with > 95% similarity to EBPR01 were counted.
Measurement of metabolic activity for cultivable and uncultivable taxa under in-situ and laboratory conditions
To explore potential differences in the metabolic activity of taxa with cultivable and uncultivable phenotypes under both in-situ conditions (i.e., in intact bioflocs and the AS suspension) and laboratory conditions (dispersal and exposure to R2A broth, which simulates the isolation process), we used rRNA copies per cell as a metric of metabolic activity for a given ASV. We conducted the following experiments. An AS sample (XM3) was collected and promptly transported to the laboratory within 0.5 h. One portion of the untreated sample (referred to as "O") represented the in-situ status of the microbiome and was fixed on site with RNALater in triplicate. First, we conducted R2A agar plate cultivation and sequencing to distinguish cultivable and uncultivable ASVs, as described earlier. Second, a fraction of the AS was subjected to dispersion treatment for approximately 30 minutes: it was homogenized with glass grinders and passed through a 2000-mesh sieve (pore size approximately 6 to 7 µm), leaving most cells in single-cell form. Membrane filtration was omitted to reduce operational time and to minimize non-targeted stimuli to bacterial RNA as far as possible. Subsequently, the untreated (U) and dispersed (D) AS samples were exposed to R2A (UR and DR) or to in-situ water filtered through a 0.2 µm pore-size membrane (UI and DI) in triplicate, with each suspension adjusted to ensure similar biomass levels between the U and D treatments (guided by a pre-experiment, OD600 ≈ 0.2). After 1 hour of treatment, 2 mL of each suspension was collected and fixed in RNALater. All fixed samples were stored at -80 °C before simultaneous extraction of RNA and DNA according to an established protocol for AS samples [59]. After extraction, RNA was immediately reverse-transcribed into cDNA (PrimeScript RT reagent Kit with gDNA Eraser, Takara Bio, China). These DNA and cDNA samples were PCR-amplified and sequenced for the V4 region.
Absolute quantitative PCR (qPCR) was conducted to quantify the copy numbers of the ribosomal RNA gene (rDNA) and ribosomal RNA (rRNA, in cDNA form) in the aforementioned samples. Assuming consistent extraction efficiency across samples and taxa (at least no systematic bias between cultivable and uncultivable taxa within the same suspensions), and given the consistent experimental system and sampling volumes, the copy numbers of rDNA and rRNA in each extraction could represent the copy number density (i.e., the number of copies per milliliter) for different ASVs. A standard curve (R^2 = 0.993) ranging from 10^4 to 10^8 copies per reaction was generated for absolute quantification using a cloned 16S rRNA gene from Escherichia coli as the standard template (in pMD19-T vector). Quantification was performed in technical triplicate. In parallel, the amplicon sequencing data allowed us to calculate the proportion of each ASV relative to all bacteria. By multiplying these proportions by the total copy numbers, we obtained the copy number density of 16S rRNA and rDNA for each dominant ASV in each treatment.
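The quantification logic can be sketched as follows, assuming the usual qPCR convention that the standard curve is a linear fit of threshold cycle (Ct) against log10 of the copy number. The text does not state how the curve was fitted, so the fitting routine, the function names, and any Ct values fed to it are illustrative assumptions.

```python
from math import log10

def fit_standard_curve(copies, ct):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    x = [log10(c) for c in copies]
    n = len(x)
    mx, my = sum(x) / n, sum(ct) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, ct))
             / sum((a - mx) ** 2 for a in x))
    intercept = my - slope * mx
    return slope, intercept

def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate the total copy number of a sample."""
    return 10 ** ((ct - intercept) / slope)

def per_asv_copies(total_copies, rel_abund):
    """Split the total copy number among ASVs by their amplicon proportions."""
    return {asv: total_copies * p for asv, p in rel_abund.items()}
```

The same three steps would be applied separately to the rDNA (DNA) and rRNA (cDNA) measurements of each treatment.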
To determine the rRNA copies per cell for the dominant ASVs (those comprising over 0.1% in all rDNA datasets, which were normalized to 20,000 reads), the rDNA copy number of each ASV was normalized by the number of genomic ribosomal RNA operons (rrn) obtained from the rrnDB [60]. The resulting rRNA copies per cell for both cultivable and uncultivable ASVs served as a measure of their metabolic activity.
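A minimal sketch of this normalization is given below. It assumes that the rDNA copy density divided by the rrn operon number estimates the cell density, and that the rRNA copy density is then divided by that estimate; it also reads "over 0.1% in all rDNA datasets" as a requirement that holds in every dataset. Both readings and all names are assumptions for illustration.

```python
def rrna_per_cell(rrna_copies, rdna_copies, rrn_operons):
    """rRNA copies per cell for one ASV in one treatment.

    rrna_copies and rdna_copies are copy number densities (same volume basis);
    rrn_operons is the number of rRNA operons per genome (e.g. from rrnDB).
    """
    cells = rdna_copies / rrn_operons  # estimated cell density
    return rrna_copies / cells

def dominant_asvs(rdna_rel_abund_by_dataset, cutoff=0.001):
    """ASVs whose relative abundance exceeds the cutoff in every rDNA dataset."""
    datasets = list(rdna_rel_abund_by_dataset.values())
    candidates = set().union(*(set(d) for d in datasets))
    return sorted(a for a in candidates
                  if all(d.get(a, 0.0) > cutoff for d in datasets))
```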
Comparative genomics guiding the promotion of cultivation of a specific taxon
Comparative genomic analysis was performed for AS samples, comprising 6 inoculum metagenomes and 12 corresponding culturomic metagenomes (Table S1). MAGs were categorized into cultivable or uncultivable phenotypes (Table S3) as follows. MAGs obtained from culturomic samples were designated as cultivable. For inoculum samples, MAGs whose marker genes (DNA gyrase subunit B and RNA polymerase subunit B) were detected in the culturomic metagenomes of the same sample with > 1× coverage per 10 GB (based on read mapping) were excluded from subsequent analysis (i.e., putatively cultivable although the genome was not recovered). The remaining MAGs from the inoculum samples were classified as uncultivable. In total, the metagenomes yielded 257 uncultivable and 343 cultivable MAGs. To mitigate potential false signals arising from the phylogenetic separation of the cultivated and uncultivated MAGs, our investigation was restricted to the largest order in our genome catalog, Burkholderiales, which comprised 111 cultivable and 29 uncultivable genomes. This order is widely distributed and dominant in global wastewater treatment plants [61, 62].
We conducted a comparative statistical analysis of the presence/absence frequency of COGs (requiring a frequency > 5% among all MAGs) between cultivable and uncultivable MAGs within the order. COGs enriched in either group (P < 0.05, Fisher's exact test) were examined. Based on the functional annotation of these COGs, we aimed to deduce potential conditions related to uncultivability. Of particular interest was the significant underrepresentation of six vitamin B12 synthesis genes in the uncultivable group compared with the cultivable group. This observation prompted us to conduct cultivation experiments to test whether the addition of vitamin B12 could promote the cultivable diversity of this order.
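For a single COG, the enrichment test reduces to a 2×2 table of presence/absence counts in the two groups. The sketch below implements a two-sided Fisher's exact test with the standard library only; it is a generic implementation rather than the authors' script, and the per-COG counts in the usage note are hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    a, b: MAGs with / without the COG in group 1 (e.g. cultivable)
    c, d: MAGs with / without the COG in group 2 (e.g. uncultivable)
    The p-value sums the hypergeometric probabilities of all tables that are
    no more likely than the observed one, with the margins held fixed.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance so that ties are not lost to rounding.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, p)
```

For example, with the 111 cultivable and 29 uncultivable Burkholderiales genomes from the text and a hypothetical COG present in 90 and 5 of them respectively, `fisher_exact_two_sided(90, 21, 5, 24)` would fall well below the 0.05 threshold.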
For the validation experiments, we used R2A medium and compared outcomes with and without the addition of 5 µg/L of vitamin B12. Note that R2A medium contains yeast extract, which may contribute ~0.5 µg/L of vitamin B12 according to Tao et al. [63]. It has also been reported that different microorganisms within a community are selective for different subtypes of vitamin B12, but that high concentrations of vitamin B12 can reduce or even eliminate subtype specificity [64]. The cultivation experiments were conducted on two AS samples (XM4 and ZZ1). Colonies from the agar plates were collected for sequencing and analysis. Because the taxonomic annotation system for 16S rRNA genes differs from the GTDB-Tk system (in which Rhodocyclaceae is a family of Burkholderiales), we assessed the richness and relative abundance of ASVs affiliated with both Burkholderiales and Rhodocyclales (according to the EzBioCloud database) in the different treatments. Both orders were highly diverse in the original and culturomic samples.
Data visualization
The R package dplyr was used for data manipulation [65]. Plots were generated using the R packages ggplot2 and pheatmap [66, 67].
Data availability
Sequencing data have been deposited at NCBI under PRJNA1028341 (amplicons, metagenomes, and the draft genome of EBPR01) and PRJNA1028348 (all MAGs). The sample names can be linked to Table S1. For review purposes, the data can be viewed using the following account in NCBI (account: [email protected], using the link of Microsoft, PW: 917321022.hcl.)