Study areas and sampling
A total of eight photic-layer, four OMZ, and seven bathypelagic samples were taken during different oceanographic cruises in several sampling stations distributed along a wide range of latitudes (Fig. 1). Photic-layer samples (Table 2) were collected in the Atlantic and Indian Oceans during the Tara Oceans expedition in 2009–2013 [47], and from the Arctic Ocean during the ATP09 cruise in 2009 [48]. Additionally, surface seawater samples from the Blanes Bay Microbial Observatory (BBMO, http://www.icm.csic.es/bio/projects/icmicrobis/bbmo) in the NW Mediterranean Sea were collected in May 2015. OMZ samples (Table 2) were taken from the Indian and the Pacific Oceans also during the Tara Oceans expedition in 2009–2013 [47]. Bathypelagic samples (Table 2) from the Atlantic Ocean at ~ 4000 m depth were taken from six different stations during the Malaspina 2010 Circumnavigation Expedition [49]. One of the stations sampled was located in the North Atlantic, whereas the other five stations were located in the South Atlantic; one of them was particularly placed in the Agulhas Ring, where deep waters from the South Atlantic converge and mix with Indian Ocean deep-water masses [50]. In addition, one bathypelagic sample was collected at 2000 m depth in the NW Mediterranean during the MIFASOL cruise in September 2014.
Table 2
Characteristics of the different samples used for isolation of bacteria. Non-redundant isolates stand for the number of different isolates remaining after removing those that were 100% identical in their partial 16S rRNA gene.
Cruise | Station | Sampling date | Oceanic location | Latitude | Longitude | Depth (m) | In situ temperature (ºC) | Nº of sequenced isolates | Nº of non-redundant isolates |
Tara Oceans | ST 39 | March 2010 | Indian Ocean | 19º 2.24’ N | 64º 29.24’ E | 5.5 | 26.2 | 104 | 25 |
ST 39 | March 2010 | Indian Ocean | 18º 35.2’ N | 66º 28.22’ E | 25 | 26.8 | 243 | 53 |
ST 39 | March 2010 | Indian Ocean | 18º 43.12’ N | 66º 21.3’ E | 268.2 | 15.6 | 88 | 18 |
ST 67 | September 2010 | South Atlantic | 32º 17.31’ S | 17º 12.22’ E | 5 | 12.8 | 115 | 49 |
ST 72 | October 2010 | South Atlantic | 8º 46.44’ S | 17º 54.36’ W | 5 | 25 | 71 | 33 |
ST 76 | October 2010 | South Atlantic | 20º 56.7’ S | 35º 10.49’ W | 5 | 23.3 | 89 | 27 |
ST 151 | March 2012 | North Atlantic | 36º 10.17’ N | 29º 1.23’ W | 5 | 17.3 | 76 | 33 |
ST 102 | April 2011 | Pacific Ocean | 5º 16.12’ S | 85º 13.12’ O | 475.6 | 9.2 | 97 | 15 |
ST 111 | June 2011 | Pacific Ocean | 16º 57.36’ S | 100º 39.36’ O | 347.1 | 10.9 | 98 | 35 |
ST 138 | December 2011 | Pacific Ocean | 6º 22.12’ N | 103º 4.12’ O | 444.9 | 8.2 | 79 | 34 |
ATP | AR_1 | June 2009 | Arctic Ocean | 78º 20.00’ N | 15º 00.00’ E | 2 | 6.2 | 13 | 9 |
AR_2 | June 2009 | Arctic Ocean | 76º 28.65’ N | 28º 00.62’ E | 25 | -1.2 | 20 | 9 |
Malaspina | ST 10 | December 2010 | North Atlantic | 21º 33.36’ N | 23º26’ W | 4002 | 2 | 20 | 9 |
ST 17 | February 2011 | South Atlantic | 3º 1.48’ S | 27º 19.48’ W | 4002 | 1.7 | 93 | 24 |
ST 23 | August 2011 | South Atlantic | 15º 49.48’ S | 33º 24.36’ W | 4003 | 1.5 | 94 | 39 |
ST 32 | January 2011 | South Atlantic | 26º 56.8’ S | 21º 24’ W | 3200 | 2.5 | 39 | 16 |
ST 33 | January 2011 | South Atlantic | 27º 33.2’ S | 18º 5.4’ W | 3904 | 1.7 | 5 | 5 |
ST 43 | April 2011 | South Atlantic | 32º 48.8’ S | 12º 46.2’ E | 4000 | 1.2 | 4 | 4 |
MIFASOL | ST 8 | September 2014 | NW Mediterranean | 40º 38.41’ N | 2º 50’ E | 2000 | 13.2 | 127 | 36 |
BBMO | IBSURF | May 2015 | NW Mediterranean | 41º 40’ N | 2º 48’ E | 5 | 17.7 | 86 | 43 |
In each of these stations, seawater was collected using Niskin bottles attached to a rosette sampling system, except at BBMO, where samples were collected with a bucket. Seawater was sequentially filtered through 200 ∝m and 20 ∝m meshes to remove large plankton cells and to keep the free-living bacterial community together with the one attached to particles (< 20 ∝m). Two samples of each station were kept in 2 ml Eppendorf tubes with DMSO 7% final concentration and stored at -80ºC until further processing in the laboratory.
Geographical coordinates of stations, sampling date, sampled depth, in situ temperature and total number of sequenced isolates are listed in Table 2.
Culturing and isolation
Isolates were obtained by plating 100 ∝l of undiluted and 10x diluted seawater from the photic, OMZ and bathypelagic samples, in triplicates, onto Zobell agar plates (i.e. 5 g peptone, 1 g yeast extract and 15 g agar in 750 ml of 30 kDa filtered seawater and 250 ml of Milli-Q water) or Marine Agar 2216 (Difco™) plates, which is based also on the Zobell medium formulation [51]. Our medium culturing strategy was only focused to retrieve heterotrophic marine bacteria that could grow easily under laboratory conditions (nutrient rich medium, standard oxygen concentrations and atmospheric pressure) using two similar culturing media. The only difference between Zobell agar and Marine Agar 2216 plates is the use of natural seawater (Zobell agar), or the addition of the minerals and salts contained in natural seawater to distilled water (Marine Agar 2216). Indeed, we did not observe significant differences (Fisher test analyses, data not shown) in the bacterial isolation between both media.
Photic-layer and OMZ samples were incubated at room temperature (RT, ∼20ºC) while bathypelagic samples were incubated at their in situ temperature, which ranged from ∼4ºC (in the Atlantic Ocean at 4000 m depth) to 12ºC (NW Mediterranean at 2000 m depth) (Table 2 and Supplementary Table S8 in Additional file 2), but also at RT in order to assure bacterial recovery from all stations. In all cases, triplicates of each temperature condition and dilution were incubated in the dark until no more colonies appeared (10–30 days).
A total of 1561 bacterial isolates were randomly selected for DNA amplification and partial sequencing of their 16S rRNA gene (Table 2 and details below). Similar number of isolates were sequenced from photic layers (817; average: 102 isolates per station) and from deep oceans (744; average: 67 isolates per station) with 362 isolates from the OMZ and 382 from the bathypelagic. In most of the bathypelagic samples we collected all colonies appearing in the plates, which ranged from 6 to 129 including all replicates. Colonies were streaked on agar plates in duplicate to ensure their purity and avoid contamination. The isolates were stored in the broth medium used with glycerol 25% in cryovials at -80ºC.
PCR amplification and sequencing of the 16S rRNA gene
Available DNA used for template in Polymerase Chain Reaction (PCR) was extracted from 200 ∝L of isolates liquid cultures placed in 96 well plates, diluted 1:4 and heated (95ºC, 15 min) to cause cell lyses. The partial 16S rRNA gene sequences were PCR amplified using bacterial primers 358F (5’-CCT ACG GGA GGC AGC AG-3’) [52] and 907Rmod (5’-CCG TCA ATT CMT TTG AGT TT-3’) [53]. The complete 16S rRNA gene was amplified in specific strains after DNA extraction using the DNeasy Blood & Tissue kit (Qiagen), following the manufacturer’s recommendations, and using the modified primers from Page et al. [54] 27F (5’- AGR GTT TGA TCM TGG CTC AG -3’) and 1492R (5’- TAC GGY TAC CTT GTT AYG ACT T -3’). Detailed PCR conditions are described in Supplementary methods (Additional file 1). Purification and OneShot Sanger sequencing of 16S rRNA gene products was performed by Genoscreen (Lille, France) with primer 358F for partial sequences, and with both 27F and 1492R for complete sequences. ChromasPro 2.1.8 software (Technelysium) was used for manual cleaning and quality control of the sequences.
Data processing and taxonomic classification
The 16S rRNA sequences of our cultured strains were clustered at 99% sequence similarity [44] in order to define different operational taxonomic units (iOTUs or isolated OTUs) and construct iOTU-abundance tables for the different stations and layers studied (Supplementary Table S6 in Additional file 2) using UCLUST algorithm from the USEARCH software [55]. The different iOTUs were taxonomically classified using the lowest common ancestor (LCA) method in SINA classifier [56], using both SILVA (release 132 in 2017) and RDP (Ribosomal Database Project, release 11) databases. Parallelly, isolates sequences were submitted to BLASTn [57] with two subsets of the RDP database, one including only the uncultured bacteria (Closest Environmental Match, CEM), and another including only the cultured bacteria (Closest Cultured Match, CCM) in order to extract the percentages of similarity with both datasets (Supplementary Tables S9 and S10 in Additional file 2), and to assess whether our isolates were similar to effectively published cultured organisms.
Additionally, a more restrictive clustering at 100% sequence similarity (USEARCH software) was also used to define iOTUs and to detect how many bathypelagic, OMZ and photic-layer bacterial isolates were identical, and thus, to identify bacterial taxa or strains that could distribute along all the water column. Such comparisons were done with: (i) photic and OMZ isolates sequences retrieved from ST39 vertical profile and (ii) the whole isolates dataset.
Phylogenetic analyses
A phylogeny was inferred for the representative isolates of each iOTU defined at 99% and 100% sequence similarity. The closest sequence to each isolated iOTU in SILVA v.132 database was found and collected using BLASTn [57]. Alignment of the isolates and reference sequences was performed with MUSCLE from the Geneious software v.11.0.5 [58]. The alignment was trimmed to the common 16S rRNA gene fragment covered by both sets of sequences. Phylogeny was constructed using maximum-likelihood inference with RAXML-NG 0.9.0 [59] and the GTR evolutionary model with optimization in the among-site rate heterogeneity model and the proportion of invariant sites (GTR + G + I), and 100 bootstrap replicates.
Eventually, some isolates among our culture collection presented partial 16S rRNA sequences with a percentage of similarity below the 97% with public databases. In this case, the complete 16S rRNA gene was sequenced for one isolate, with which two more strains clustered at 100% similarity, and phylogenetic tree was constructed to support its putative novelty. The tree included their complete and partial sequences of the 16S rRNA gene, their best hits from uncultured and cultured microorganisms, extracted from local alignments against RDP 11, SILVA LTP (Living Tree project), and National Center for Biotechnology Information (NCBI) databases, and the reference 16S rRNA genes from their related genera. Details on the phylogenetic tree constructions are explained in Supplementary methods (Additional file 1).
Comparisons between layers and statistical analyses
All data treatment and statistical analyses were conducted in the R statistical software version 3.4.3 [60] and packages stats, vegan version 2.5-3 [61], ape version 5.1 [62], picante version 1.6-2 [63] and EcolUtils [64]. In general, analyses were performed using the non-rarefied iOTU-abundance tables, but for specific analyses such as the detection of iOTUs present along all the water column, the iOTU-abundance table constructed with the sequences clustering at 99% was sampled down to the lowest sampling effort (362 isolates in the OMZ). In this manner, the rarefied or subsampled iOTU table was obtained using the function rrarefy.perm with 1000 permutations from the R package EcolUtils [64].
Rarefaction curves were performed with the package vegan to estimate the sampling effort in each studied layer. We also calculated bacterial richness/diversity metrics from each depth using two approaches: an OTU-based approach (i.e. considering the iOTUs as unrelated biological entities), and a phylogenetic approach (i.e. considering the evolutionary relationships among iOTUs with the complete computed phylogeny). The number of iOTUs, the Chao extrapolative richness estimator [65] and the Shannon entropy index [66] were computed as OTU-based metric using the non-rarefied iOTU abundance table, while the Faith’s phylogenetic diversity (PD) [67], the PD divided by the number of iOTUs (PD/iOTUs), and the mean nearest taxon distance (MNTD) [68] were used as phylogenetic measures for diversity. Differences between photic, OMZ and bathypelagic for richness/diversity measures were tested using ANOVA test followed by the Tukey’s post hoc test, as data normality was assured. To assess significance, the statistical analyses were set to a conservative alpha value of 0.01.
The coverage (C) for each of the depths was also calculated by the equation
where N is the number of iOTUs being examined and n1 represents the number of iOTUs occurring only once or singletons [69].
Comparison to environmental 16S rRNA Illumina sequences
Isolates were compared to zOTUs (zero-radius OTUs, i.e. OTUs defined at 100% sequence similarity) denoised [55] from high-throughput sequencing (HTS) of the 16S rRNA sequences (16S iTAGs) obtained from Tara Oceans and Malaspina Expedition datasets which covered surface, mesopelagic and bathypelagic layers. Further description of those datasets, sample collection, DNA extraction, sequence processing and data treatment are described in Supplementary Methods (Additional file 1). All isolates sequences were compared to zOTUs sequences at 100% similarity respectively, by running global alignment using the usearch_global option from the USEARCH v10.0.240 [55]. The results were filtered by coverage of the alignment at 100% and in those cases where isolates had more than one hit, only the ones with the higher percentage of identity were kept. Primers used to obtain the 16S rRNA genes of the isolates were different from the ones used to obtain the 16S rRNA iTAGs, but both amplified the V4 and V5 hypervariable region of the gene, so comparisons could be done by this method. For each dataset compared we calculated the mean percentage of reads or iTAGs, and zOTUs of the bacterial community, that matched at 100% similarity with the 16S rRNA sequences of the strains isolated by traditional culture techniques. These percentages were calculated from the rarefied zOTU-abundance tables.
Nucleotide sequences accession number
The 16S rRNA gene sequences of the bacterial isolates retrieved in this study were deposited in GenBank. Sequences from all isolates, except those coming from the OMZ regions and those from the surface Indian Ocean, were deposited under the following accession numbers MH731309 - MH732621. Notice that among these accession numbers other isolates are included, originated from the same locations but isolated with another culture medium and not included in this study. Isolates retrieved from the OMZ and those from the surface Indian Ocean are deposited under the following accession numbers MK658870-MK659428.