Cheese samples and experimental design
The present work explores the microbial community associated with a typical raw goat milk cheese. Caprino Nicastrese cheese is employed as the reference sample kind of a traditional raw goat milk cheese. Following collection, the goat raw milk is coagulated for 60 minutes at 36°C using 0.4 g/L goat rennet and without the addition of any starter culture. The resulting curd is manually cut into rice-sized pieces and formed and stored at room temperature for 48 hours to drain out the residual whey. Cheese wheels are then salted for 24 hours in brine with 30% (w/v) NaCl. Finally, the cheese is ripened in wooden axes in the storage basement of the cheese farms at 10–15°C and 70–85% humidity.
Samples from the surface and inner mass (i.e. rind and core, respectively) of the cheese wheels are aseptically collected with a sterile knife from 30, 60 and 90 days-ripened cheese wheels. Biological replicates were sampled as defined in Fig. 1 and brought on ice to the laboratory for the subsequent isolation and analysis of the harboured microbiota.
Bacterial fraction enrichment
To avoid alterations of the microbiota composition and/or activity, all the steps of the bacterial fraction enrichment were performed at 4°C, keeping under strict control the temperature. Briefly, independent aliquots of 0.5 g of each biological replicate per sample kind were finely grated and homogenized with 15 mL buffer containing 50 mM Na2HPO4, 0.1% Tween 80, pH 8.0. Samples were then shaken on an orbital shaker at 1600 rpm for 10 minutes. Following, the samples were centrifuged for 20 minutes at 2500× g. The supernatant was collected in a new tube and subjected to four more rounds of shaking/centrifuge/resuspension; whereas the pellet from each step was gently resuspended and collected in a single clean “pool vial”. The “pool vial” was finally centrifuged at 12,000× g for 20 min, resulting in a bacterial pellet collected from an original amount of 0.5g cheese aliquots [1, 12, 13].
The enriched bacterial fractions represented the common starting point for the two analytical approaches employed in the present study: 16S rRNA gene sequencing and metaproteomics (Fig. 1).
16S rRNA gene sequencing and metataxonomic analysis
DNA extraction and library preparation. Cheese DNA was extracted from 9 rind and 9 core samples, respectively, 3 for each ripening time point (Fig. 1) according to EZ1 DNA Tissue protocol (Qiagen, Germany). Starting from 40 mg, 190 µl of Buffer G2 and 10 µl of proteinase K solution were added to each sample aliquot, incubating at 56°C in an Eppendorf® Thermomixer until complete sample lysis and vortexing 2–3 times per hour to disperse the sample. Two hundred µl of supernatant were transferred to a new 2 ml sample tube and the automated EZ1 extraction was finalized. Amplification of the V3–V4 variable region from the bacterial 16S rRNA gene (∼460 bp) was carried out using the primers 16S_F 5′-(TCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG CCT ACG GGN GGC WGC AG)-3′ and 16S_R 5′-(GTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GGA CTA CHV GGG TAT CTA ATC C)-3′, according to MiSeq rRNA Amplicon Sequencing protocol (Illumina, San Diego, California, USA). The PCR reactions were set up using the 2× KAPA Hifi HotStart ready Mix kit (KAPA Biosystems Inc., Wilmington, Massachusetts, USA). DNA amplicons were cleaned-up by CleanNGS kit beads (CleanNA, Coenecoop 75, PH Waddinxveen, Netherlands). A second amplification step was performed to obtain a unique combination of Illumina Nextera XT dual-indices for each sample. The final libraries were cleaned-up using CleanNGS kit beads, quantified by Quant-iT PicoGreen dsDNA Assay Kit (Thermo Fisher Scientific, Waltham, Massachusetts, USA) and normalized to 4 nM. To generate paired-end 250x2 bp-length reads, normalized libraries were pooled together and run on the Illumina MiSeq platform, according to manufacturer's specifications.
Biocomputational and Statistical Analysis for cheese microbiota Profile Analysis
QIIME2 was used to analyze the Paired-end sequencing reads [14]. Quality control, denoising, chimera removal, trimming and construction of the Amplicon Sequence Variant (ASV) table were performed by the means of DADA2 implemented as a plugin in QIIME2 [15]. The taxonomy was assigned by using a Naive Bayes model pre-trained on SILVA through the QIIME2 plugin q2-feature classifiers [16]. Alpha- and beta-diversity were computed by skbio.diversity using analysis of variance (ANOVA test) and permutational analysis of variance (PERMANOVA test), respectively; the latter was applied on phylogenetically informed weighted and unweighted Unifrac and Bray-Curtis distance matrices [17] with 9999 permutations to perform paired comparison of rind and core samples at different time points. Principal coordinate analysis (PCoA) plots were used to illustrate the beta diversity of samples. The ASV table was normalized using the Cum Sum Scaling (CSS) methodology [18], hence Kruskal-Wallis test was applied to compare taxonomic differences at phylum level (L2), family (L5), and genus (L6) level. Python 3.7 was used to perform ecological statistical analyses. Three different levels of statistical significance was identified based on different p-values (p ≤ 0.001) and false discovery rate (FDR) thresholds (p ≤ 0.05, p ≤ 0.001) [19]. Phylogenetic Investigation of Communities by Reconstruction of Unobserved States (PICRUSt) [20], exploiting the Kyoto Encyclopedia of Genes and Genomes (KEGG) orthologs (KO) database was used to determine ASVs and their microbiome functions. In addition, LEfSe (Linear discriminant analysis Effect Size) were independently used to determine the features most likely explaining differences between rind and core of the cheese wheel at 30-, 60- and 90-day ripening times.
Metaproteomics analysis
Metaproteome extraction and quantification
Bacterial pellets from the above protocol of bacterial fraction enrichment were resuspended in protein extraction buffer (7M UREA, 2M Thiourea, 4% CHAPS) and subjected to 6 cycles of 1 min bead beating (Minilys, Bertin Tecnologies, FR) interspersed by 1-minute rest on ice. Bead beating steps were performed by shaking each sample at 4000 rpm with an equal amount (1:1 v/w) of 0.1 mm zirconium-silica beads. Following bead-beating, the samples were heated up to 60°C for 10 minutes and centrifuged for 20 min at 12,000× g at 4°C. The supernatant containing the extracted metaproteome was collected in a clean tube and further processed for the metaproteomic analytical workflow.
Extracted proteins were quantified with the Bio-Rad Protein Assay Dye Reagent Concentrate (Bio-Rad, Hercules, USA) following the manufacturer’s instructions. Approximately 50 µg of the extracted proteins were precipitated by incubation (30 min at 4°C) with precooled 20% trichloroacetic acid (TCA) and kept for further processing.
Trypsin Digestion and Mass Spectrometry Analysis
Precipitated proteins were digested in solution. Briefly, 50 µg of total proteins for each sample was treated for disulfide bond reduction with 10 mM DTT for 1 h at + 37° C and alkylated with 20 mM IAA at + 37° C for 1 h in the dark. Iodoacetamide excess was removed by incubation of the sample with 1.61 mM DTT at + 37° C for 20 min. Sample digestion was carried out overnight at + 37° C using trypsin in 1: 50 (w/w) ratio with respect to the protein content. Enzymatic digestion was stopped by addition of 0.1% FA (v/v).Tryptic peptides were purified and desalted by using self-assembled C18 Stage Tips [21]. Tips containing the C18 membranes with the bounded peptide mixture were eluted with 5% acetonitrile (5% ACN/ 0.1% TFA), dried at the vacuum centrifuge, and stored at -20°C until mass spectrometry measurements.
Prior MS/MS measurement, the dried peptide mixture was suspended in 0.1% FA and loaded onto a precolumn Acclaim PepMap100 C18, 5 µm, 100 Å, 300 µm i.d. x 5 mm (Thermo Scientific, San Jose, CA). Following 5 minutes of trapping, operating at 10 µL/min in eluent A, peptides were separated by a column Easy-Spray PepMap C18 (2 µm 100 Å 15 cm x 50 µm ID) with a Thermo Scientific Dionex UltiMate 3000 RSLC nano system (Sunnyvale, CA).
Analyses were performed using aqueous solution of FA (0.1%, v/v) as eluent A and ACN/FA (99.9 : 0.1, v/v) as eluent B in the following gradient elution: (i) 5% of eluent B (7 min), (ii) from 5 to 35% of eluent B (113 min), (iii) from 35 to 99% of B (15 min), (iv) 99% of B (10 min), (v) from 99 to 5% of B (2 min), (vi) 5% of B for column conditioning (13 min). The column was kept at 35°C and operated at a flow rate of 300nL/min; the injection volume was set at 5.0 µL.
Peptides were directly eluted into Orbitrap Elite nanoESI-MS/MS (Thermo Fisher Scientific, Waltham, Massachusetts, USA). Tandem mass spectrometry measurements were performed in positive Full Scan acquisition mode in the 350-2,000 m/z range and with a resolution power of 60,000. The nanoESI tuning parameters were capillary temperature 250°C, source voltage 1.5 kV, sheath gas 0, auxiliary gas 0, and S-lens RR level 50%. MS/MS analyses were performed in data-dependent scan (DDS) mode by selecting and fragmenting the twenty most intense multiple-charged ions of the collected Full Scan spectra by using collision-induced dissociation (CID, 35% normalized collision energy) with a resolution power of 60,000. Only precursors with a charge state 2–7 and intensity above the threshold of 5x103 were collected for MS/MS. The DDS set parameters were the following: exclusion mass width relative reference mass in the low and high range 10 ppm, minimum signal threshold (counts) 500, default charge state 2, activation time 10 ms [22].
Bioinformatics data analysis and data integration
Protein identification and quantification
MS raw spectra were processed through Proteome Discoverer and MaxQuant software following a two-step database-dependent search (DDS) approach as reported previously [23]. Briefly, raw files were, at first, processed by Thermo Proteome Discoverer software (v.2.2) and searched against the UniProt KB bacteria database. Methionine oxidation was set as variable modification and carbamidomethylation of cysteine as fixed modification. The SequestHT nodes thresholds were set to “Automatic”, and a filter considering only entries with at least one peptide per protein was chosen. All other filters and settings of the software were kept as default, including protein grouping with peptide confidence set on “high” and delta Cn of 0.1. The Percolator node supporting a strict maximum parsimony principle was activated with a false discovery rate of 1%.
The first DDS enable the assessment of the microbial community composition at the family level, leading to the construction of a smaller in-house database accounting for the bacterial families identified in both the metaproteomics and 16S rRNA gene sequencing-based investigations. The customized database was employed in the second DDS of the MS raw data performed onto MaxQuant (v 1.6.17.0) set on LFQ modality for peptide identification and protein inference and quantification. Cysteine carbamidomethylation was set as fixed modification and methionine oxidation as variable modification. Two missed cleavage sites were allowed for in silico protease digestion and peptides had to be fully tryptic. All other parameters of the software were set as default, including peptide and protein FDR < 1%, at least 1 peptide per protein, precursor mass tolerance of 4.5 ppm after mass recalibration and a fragment ion mass tolerance of 20 ppm. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the dataset identifier PXD032280.
Ecological and functional characterization of the microbiota by metaproteomics
Information on the taxonomic composition of the microbiota, as assessed by the identified protein repertoire, was gathered from the protein annotation of the UniProt KB database, whereas the quantitative microbiota composition was determined based on the LFQ intensities relative to each bacterial member, on a family basis. Logarithmic transformation of the cumulative intensities on a family basis was accomplished while comparing the microbiota composition in the diverse sample groups.
Identified protein repertoires were functionally categorized into biological processes and molecular functions of the Gene Ontology (GO) data repository. Abundance profiles of the identified proteins (LFQ values) were subjected to statistical investigation using Primer7 v.7 statistical software (PRIMER-E, Plymouth, UK). Principal Component analysis (PCO) was calculated on the square root transformation of the protein LFQs. Statistical differences across the samples were calculated by performing ANOVA and PERMANOVA. Parametric T-test assessing the discriminating role of the bacterial families on microbiota composition is calculated and visualized into iMetalab using shiny apps (https://shiny.imetalab.ca/). Linear discriminant analysis Effect Size (LEfSe) was calculated in the galaxy platform (https://usegalaxy.org). Heat maps visualizing microbial community composition across the samples and functional classification of the identified proteins were drawn using heatmap.2 provided by the gplots package implemented in R v.4.2.0 software (http://www.R-project.org). Correlation analysis was performed through the corrplot package implemented in R v.4.2.0 software (http://www.R-project.org).