Study design
We employed a multi-centric study of patients undergoing allogeneic stem-cell transplantation to assemble an integrated, longitudinal understanding of the transkingdom taxonomic and metabolomic shifts in each individual. We prospectively collected stool samples at calendar-driven and event-driven timepoints for characterization of the intestinal bacterial, fungal and viral community composition. The bacteriome and fungome were characterized via 16S and ITS amplicon sequencing, while for virome, we first purified viral particles via combination of filtration and ultracentrifugation steps, followed by deep viral whole genome shotgun sequencing.62 In parallel, each sample was assayed by mass-spectrometry for expression of microbiota-derived metabolites including short chain fatty acids, indole derivatives or bile acids (>44 different BAs including primary, secondary and conjugated BAs). Upon this extensive data set we employed multi-omics factor analysis (MOFA). Bio samples were referenced with extensive clinical metadata, including GvHD, anti-infective medication and laboratory findings.
Patient stool samples were collected at pre-defined calendar- and event-driven time-points in accordance with IRB-approved study protocols (see “Patient characteristics” below). Measurements were taken from distinct stool samples of individual patients. The stool samples comprised a unique bio-sample collection and have been completely used for the analyses performed in this study. No additional material is available. The data sets generated and analyzed during the current study are available in the European Nucleotide Archive (https://www.ebi.ac.uk/ena/browser/home).
Patient characteristics
Stool samples were collected at the University Hospital of the Technical University of Munich and the University Hospital Regensburg. At both centers, 78 patients undergoing allogeneic stem-cell transplantation were enrolled (Munich: n= 25 between 2019-2021, Regensburg: n=53 between 2018-2021) in a prospective, observational study after informed consent and in accordance with IRB-approved study protocols (Munich: “Ethikkommission der TU München” IRB 295/18 S; Regensburg: “Ethikkommission bei der Universität Regensburg” IRB 02/220, 14-101-0047, 17-619-101). Sampling occurred at pre-determined time-points (calendar-driven) or in response to clinical occurrences (event-driven) (Figure 1A). Sampling was carefully coordinated between both centers and performed weekly during the peri-transplant period. Samples were obtained on or one to two days around the specified time-points and stored at -80 °C within 4 hours until further processing. Calendar-driven time-points were:
- at in-patient admission to the bone marrow transplantation unit for initiation of conditioning therapy (Day -7)
- at the day of the allogeneic stem-cell transplantation (Day 0)
- at one week after allo-SCT (Day +7)
- at two weeks after allo-SCT (Day +14)
- and weekly thereafter until patients were discharged or up until Day +35 after allo-SCT, whichever occurred first
Event-driven time-points were the occurrence of acute gastrointestinal GvHD, weekly during occurrence of GI-GvHD and after resolution of acute GI-GvHD.
Patient characteristics including age, gender, diagnosis and survival were similarly distributed amongst both cohorts (Figure 1B).
Graft-versus-Host Disease
Gastrointestinal GvHD occurred in approximately 50% of patients. A third of these patients developed severe GI-GvHD, defined as GI-GvHD grade 2 to 4. At occurrence of GvHD, patients received i.v. corticosteroid therapy, the standard-of-care first line treatment at both centers. Steroid resistance was defined according to MAGIC criteria.63 Stool samples of patient with active GI-GvHD were labeled as such, and compared to time-point matched samples from patients without GvHD (control allo-SCT patients).
Antibiotics
Patient samples were grouped according to antibiotic status: Samples categorized as “no ABX” had no exposure, but once a patient was started on antibiotics the first and all subsequent samples were classified as “ABX”. In addition, all samples were annotated according to concomitant anti-infective prophylaxis (site-specific standards, see Supp. Table 3), as well as antifungal and antiviral therapy (Supp. Table 2) at the time of sampling.
Survival Analysis
After follow-up of one year, 68 patients were alive (Figure 1). Two patients died from relapse, 13 died from transplantation related complications (mainly GvHD and infections). Overall survival (OS) and cumulative incidences of transplant-related mortality (TRM) and GvHD were estimated and presented as hazard ratios with 95% confidence interval. Patients were stratified into higher-diversity and lower-diversity groups according to the center-specific median diversity value (inverse Simpson index for bacteriome, fungome and virome, respectively) observed in the samples obtained at days +7 until +21. When a patient had more than one sample, the mean value was used for analysis. In the same way patients were stratified into higher and lower metabolite levels according to the center-specific median intestinal metabolite level observed between days +7 and +21. IBM SPSS 25 was used for analysis.
Individual case reports
For clinical cases of individual patients (Supp. Figure 10), we identified the top and bottom patients with the highest and lowest mean alpha diversity for both MUC and REG centers. Patients with a baseline (Day -7) sample and minimum of 3 additional longitudinal samples were stratified into high or low bacterial diversity by mean Effective Shannon score across all time-points. We characterized Patients 1 and 2, with highest and lowest alpha diversity, respectively. Patient samples were aligned with clinical metadata including body temperature, white blood count (WBC), C-reactive protein (CRP), administration and timing of anti-infective medication including antibiotics as well as of immunosuppressive therapy. Age, underlying disease, donor status, conditioning regimen, incidence of GI-GvHD and additional clinical information is provided in Supp. Table 4.
Bacteriome Analysis
Extraction and quantification of bacterial and fungal nucleic acids
For microbiome analysis stool samples were collected in magiX PB1 microbiome preservation buffer (microBIOMix GmbH, Regensburg) and transferred to the laboratory within four hours (MUC) or within one day (REG). Immediately after receipt, stool suspensions were frozen and stored at –80 °C until further processing. To extract fungal and bacterial nucleic acids, stool suspensions were thawed and ~ 50 mg stool were mixed with a pool of three spike bacteria Alcanivorax borkumensis, Agrobacterium radiobacter, and Alicyclobacillus acidiphilus to 2e+7, 1.6e+7 and 6e+8 16S rDNA copies per sample, respectively. These species served as internal process controls.64 Cells were lysed by applying a repeated bead-beating protocol using 0.5 mm yttria-stabilized zirconium oxide beads (MP Biomedicals, Eschwege, Germany) on a TissueLyser II instrument (Qiagen) followed by Proteinase K digestion (Roche, Mannheim, Germany) and freezing and thawing. DNA was purified by means of the MagNA Pure 96 instrument (Roche) using the MagNA Pure 96 DNA and Viral NA Large Volume Kit (Roche).
Amplification and semiconductor-based sequencing of V1-3 16S rDNA and fungal Internal Transcribed Spacer 1 Regions (ITS1) regions
V1 to V3 hypervariable regions of bacterial 16S rRNA genes were amplified from a total of 1e+7 bacterial 16S rDNA copies for each sample using the forward primer S-D-Bact-0008-c-S-20 containing sample-specific barcode and sequencing adaptor sequences together with reverse primer S-D-Bact-0517-a-A-18 containing a 3’-P1 adapter sequence. Fungal Internal transcribed spacer regions 1 (ITS1) were amplified from 5 ng DNA using primer pairs ITS1-30F/ITS1-217R65 containing barcode and sequencing adaptors. The resulting amplicons were purified with MagSi-NGSPREP Plus beads (Steinbrenner Laborsysteme, Wiesenbach, Germany). Amplicon copy numbers were determined using the KAPA Library Quantification IonTorrent Kit (Roche). DNA library was prepared by pooling equimolar concentrations of adaptor-labeled amplicons. Re-amplification of the final sequencing library by isothermal amplification, chip loading and high-throughput sequencing which was carried out using the IonChef™ instrument and the IonTorrent™ Genestudio S5 Plus sequencer (Thermo Fisher Scientific, Waltham, USA).
Bacteriome and Fungome Quantification
To allow for comparisons across different kingdoms we performed absolute quantification of bacteria and fungi. Bacterial and fungal DNA was quantified by real-time PCR using pan-bacterial (16S) and pan-fungal (28S) specific primers, respectively, and amplicon copy numbers were normalized to gram of stool. Quantification of 16S rDNA copies by qPCR was carried out as described earlier.66 Fungal load was quantified by using primers NL1 and 260R and a 28S rDNA-specific FAM-labelled hydrolysis probe.67 DNA fragments of Penicillium roqueforti 28S rDNA amplified with NL-1/NL-4 primers which were cloned into the pJET1.2 vector (Thermo Fisher Scientific) were used as quantification standard.
Bacteriome and Fungome Data Analysis
Raw sequencing reads were subjected to Trimmomatic68 (0.39) for sliding window trimming applying a windows size of 25 and a quality cutoff of 20. Cutadapt69 (3.4) was used for demultiplexing and removal of sequencing adapters and 16S-specific PCR primer sequences. Amplicon sequence variants (ASV) were generated in R (4.1.0) from demultiplexed reads using a dada2 (1.22)-based workflow70. Standard parameters were used, except gap penalty, band size and OMEGA_A values, which were set to –1, 32 and 1e-30 respectively. Taxonomy of ASVs was predicted by the IDTAXA algorithm of the DECIPHER71 package after matching to the SILVA72 138.1 release 16S reference database (bacteria). The February 2020 release of the Unite database73 (fungal ITS) was used to predict fungal taxonomy. Alpha- and beta-diversity analyses were conducted applying the phyloseq74 package version 1.36. Species Richness was represented by summarized ASV. Effective Shannon index was calculated by the exponent of the Shannon entropy75. Means of groups were compared with the Wilcoxon Rank Sum test (two-sided) adjusted for multiple comparisons with the Bonferroni method. Beta diversity was calculated by weighted Unifrac and distances between time-points or groups (No ABX versus ABX; No GI-GvHD versus GI-GvHD) were projected in a principal coordinate analysis (PCoA). Principal coordinates were calculated from weighted UniFrac distances after cumulative sum scaling of ASV counts. Significance of group differences were analyzed by PERMANOVA using the pairwiseAdonis76 (0.4) package. For Pearson correlation analyses, the correlation coefficient and p-value were calculated with the ggpubr package. Linear discriminant analysis based on the LEfSe and generation of cladograms were carried out with the microbiomeMarker package77 (1.0.1) after total sum scaling and applying alpha level and log2 fold change cutoffs of 0.01 and 2 respectively. Heatmaps for most abundant fungal and bacterial taxa were generated from summarized ASV reads after centered log-ratio normalization. Dominant taxa were identified by summarizing bacterial and fungal reads with a maximum relative genus-level abundance above 50 percent and plotted by ggplot2 (3.3.5). Plots for alpha diversity and bacterial and fungal loads were generated by ggpubr (0.4). P-values are shown either as exact value (Figure Legends) or as P-value summary (Figures): ns…not significant (p > 0.05); *…p <0.05; **…p <0.01; ***…p <0.001; ****…p <0.0001.
Virome Analysis
Virome sequence processing
Low-quality bases and adaptors of raw sequences were removed using fastp (v0.20.1)78. The remaining reads were deduplicated by dedupe.sh from bbmap suite (v38.76)79, and clean reads were assembled into contigs by metaSPAdes (v3.14.0) with default settings.80 Then contigs longer than 1kbp were pooled from all samples, and redundant contigs were filtered using dedupe.sh. Afterward, viral contigs were predicted with the combination of VirSorter (v1.0.6)81, CAT (v5.0.4)82, and DeepVirFinder(v1.0)83. The viral contigs were clustered by CD-HIT84; viral contigs that shared more than 95% identity over 80% of the contig length were considered identical and the longest viral contigs in each cluster were used as representative viral contigs for further analysis.
Virome data analysis
Viral richness (Ace), diversity (Effective Shannon Index) and Principal Coordinates Analysis (PCoA) based on “Bray-Curtis” similarities were conducted in R (Version 3.2, package vegan). The significant difference was determined by Permutational Multivariate Analysis of Variance (PERMANOVA). Means of groups were compared with the Wilcoxon Rank Sum test (two-sided) adjusted for multiple comparisons with the Bonferroni method. The corr.test function in Psych R package v2.1.9 was used to compute the Pearson correlation and multiple comparisons were corrected using Holm method (alpha=0.05). Graphs were generated in Prism 9 (GraphPad), Origin 2020b, Microsoft Excel, and R (Version 3.3.3, ggplot2 package). P-values are shown either as exact value (Figure Legends) or as P-value summary (Figures): ns…not significant (p > 0.05); *…p <0.05; **…p <0.01; ***…p <0.001; ****…p <0.0001.
Metabolome Analysis
Background
Metabolite screening was preceded by a comprehensive testing phase, and we determined that for multi-centric comparisons the use of unprocessed, fresh-frozen native stool samples was essential. Clinical experience showed us that stool consistency is widely divergent, especially in patients suffering from GI-GvHD with frequent bouts of watery diarrhea. Therefore, we measured metabolite levels in µmol per gram of dry feces to enable cross-sample comparisons regardless of stool consistency. Our methodology was accurate enough to detect reproducible and significant differences in metabolite levels by time-point or diagnosis down to the nmol per gram range.
Sample preparation for targeted analysis
Approximately 100 mg of stool was weighed in a 15 mL bead beater tube (CKMix50 15 mL, Bertin Technologies, Montigny-le-Bretonneux, France) filled with 2.8 mm and 5.0 mm ceramic beads i.d. 5 mL of methanol-based dehydrocholic acid extraction solvent (c=1.3 µmol/L) was added as an internal standard for work-up losses. The samples were extracted with a bead beater (Precellys Evolution, Bertin Technolgies) supplied with a Cryolys cooling module 3 times each for 20 seconds with 15 seconds breaks in between at 10,000 rpm. 1 mL of the fecal suspension was dried in a vacuum centrifuge (Eppendorf Vacufuge) to determine the dry weight.
Targeted short chain fatty acid (SCFA), lactic acid, desaminotyrosine and indole-3-carboxyaldehyde measurement (Panel 1)
The 3-NPH method was used for the quantitation of SCFAs as well as lactic acid, desaminotyrosine and indole-3-carboxyaldehyde85. Briefly, 40 µL of the fecal extract and 15 µL of 50 µM isotopically labeled standards were mixed with 20 µL 120 mM EDC HCl-6% pyridine-solution and 20 µL of 200 mM 3-NPH HCL solution. After 30 min at 40°C and shaking at 1000 rpm using an Eppendorf Thermomix (Eppendorf, Hamburg, Germany), 900 µL acetonitrile/water (50/50, v/v) was added. After centrifugation at 13000 U/min for 2 min the clear supernatant was used for analysis. The same system as described above was used. The electrospray voltage was -4500 V, curtain gas 35 psi, ion source gas 1 55 psi, ion source gas 2 65 psi and the temperature 500°C. The MRM-parameters were optimized using commercially available standards. The chromatographic separation was performed on a 1.7 μm, 100 × 2.1 mm, 100 Å Kinetex C18 column (Phenomenex, Aschaffenburg, Germany) column with 0.1% formic acid (eluent A) and 0.1% formic acid in acetonitrile (eluent B) as elution solvents. An injection volume of 1 µL and a flow rate of 0.4 mL/min was used. The gradient elution started at 23% B which was held for 3 min, afterward the concentration was increased to 30% B at 4 min, with another increase to 40%B at 6.5 min, at 7 min 100% B was used which was hold for 1 min, at 8.5 min the column was equilibrated at starting conditions. The column oven was set to 40°C and the autosampler to 15°C. Data acquisition and instrumental control were performed with Analyst 1.7 software (Sciex, Darmstadt, Germany). The data was analyzed with MultiQuant 3.0.3 (Sciex, Darmstadt, Germany) and Metaboanalyst86. Features with more than 70% of missing values were removed. Missing values were replaced by LoDs (1/5 of the minimum positive value of each variable). Heatmaps were generated after normalization by log transformation (base 10). Means of multiple groups (time-points) were compared with the Kruskal-Wallis test followed by Dunn’s multiple comparisons test. Comparisons between two groups (No ABX versus ABX; No GI-GvHD versus GI-GvHD) were performed with the Mann-Whitney test (two-sided). Graphs and statistics were created in Prism 8 (GraphPad). P-values are shown either as exact value (Figure Legends) or as P-value summary (Figures): ns…not significant (p > 0.05); *…p <0.05; **…p <0.01; ***…p <0.001; ****…p <0.0001.
Targeted bile acid measurement (Panel 2)
Bile acid analysis was performed according to Reiter et al. 202087. Briefly, 20 µL of isotopically labeled BAs (ca. 7 µM each) were added to 100 µL of fecal extract. Targeted bile acid measurement was performed using a QTRAP 5500 triple quadrupole mass spectrometer (Sciex, Darmstadt, Germany) coupled to an ExionLC AD (Sciex, Darmstadt, Germany) ultrahigh performance liquid chromatography system. A multiple reaction monitoring (MRM) method was used for the detection and quantification of the BAs (Reiter et al., 2020). An electrospray ion voltage of −4500 V and the following ion source parameters were used: curtain gas (35 psi), temperature (450 °C), gas 1 (55 psi), gas 2 (65 psi), and entrance potential (−10 V). The MS parameters and LC conditions were optimized using commercially available standards of endogenous BAs and deuterated BAs, for the simultaneous quantification of selected 34 analytes. For separation of the analytes a 100 × 2.1 mm, 100 Å, 1.7 μm, Kinetex C18 column (Phenomenex, Aschaffenburg, Germany) was used. Chromatographic separation was performed with a constant flow rate of 0.4 mL/min using a mobile phase consisted of water (eluent A) and acetonitrile/water (95/5, v/v, eluent B), both containing 5 mM ammonium acetate and 0.1% formic acid. The gradient elution started with 25% B for 2 min, increased at 3.5 min to 27% B, in 2 min to 35% B, which was hold until 10 min, increased in 1 min to 43% B, held for 1 min, increased in 2 min to 58% B, held 3 min isocratically at 58% B, then the concentration was increased to 65% at 17.5 min, with another increase to 80% B at 18 min, following an increase at 19 min to 100% B which was hold for 1 min, at 20.5 min the column was equilibrated for 4.5 min at starting. The injection volume for all samples was 1 μL, the column oven temperature was set to 40 °C, and the auto-sampler was kept at 15 °C. Data acquisition and instrumental control were performed with Analyst 1.7 software (Sciex, Darmstadt, Germany). The data was analyzed as indicated in Panel 1.
Multi-omics Factor Analysis (MOFA)
MOFA accepts all multi-omics data from the same set of samples, and infers a set of factors that explain sample heterogeneity or variance. Although these factors are derived from all available multi-omics modalities (16S, ITS, virome and metabolomics), the contribution of each omics modality to total variance and each factor can be distinguished.88
We opted for an unbiased approach, and decided against inputting clinical and sample meta-data such as diagnosis or time-point for model generation, in order to identify whether factors inferred only from multi-omics data could explain heterogeneity in our study population.
To generate a model by which MOFA could identify factors, we integrated bacterial and fungal ASVs as well as assembled viral contigs at genus level with Panel 1 and Panel 2 metabolite data as published45. The input consisted of bacterial 16S rRNA ASVs, fungal ITS1 rRNA ASVs, viral sequences and metabolite concentrations. All inputs were normalized by centralized log normalization (metagenomic data) or log transformation (base 10, metabolomic data). After model fitting, we selected for the top 10 factors with minimum of 1.14% average variance explained across all modalities and with varying degrees of variance explained across the different kingdoms (Figure 3B), ordered by total variance explained (i.e., Factor 1 contributed the most, Factor 10 the least to total variation). For each factor, MOFA learns a weight for every feature contributing to that factor, which can be interpreted as a measure of feature importance (Figure 3D). Larger weights (approaching 1.0) indicate a higher correlation with that factor, while the positive or negative sign indicates the directionality of that variation (a positive weight represents a positive association and vice versa).
Fecal microbiota transplantation
We performed fecal microbiota transplantation (FMT) within compassionate use in accordance to protocols approved by Bavarian authorities. Regulations in Germany allow for the treatment of individuals within compassionate use („individueller Heilversuch“). This requires approval of the local government. We have obtained this approval (General Administration of the Free State of Bavaria, reference number „2677.Ph_3-748-1“, issued to PD Dr. med. Christian Schulz) as well as patient informed consent to treatment and informed consent to publication (according to recommendations by the Committee on Publication Ethics).
FMT was performed as last-line treatment, of severe grade IV GI-GvHD that was refractory to steroids, ruxolitinib and immunosuppressive therapy. The recipient patient (Patient 3, Supp. Table 4) received two separate FMTs at Days 0 and +8 from the same healthy, unrelated FMT donor. At Day +13, we were able to discharge the patient in good condition. She is alive and well at the time of writing.
Prior to transplantation, the FMT donor was screened according to site-specific standard operating procedures at the Ludwig-Maximilians University Munich and the University Hospital Regensburg in compliance with regulations issued by the German Federal Institute for Drugs and Medical Devices (Supp. Table 5). Of note, SARS-CoV-2 screening was not performed as the FMT occurred before the COVID-19 pandemic. The FMT product consisted of a homogenized suspension of freshly-sampled feces in sterile saline solution and was applied via colonoscopy at the MUC endoscopy suite. During and following the procedure, the patient’s vital signs (ECG, blood pressure, pulse, temperature, peripheral oxygen saturation) were monitored.
The following criteria were used to assess response to FMT:
- CR (complete response) was defined as complete resolution of GI-GvHD.
- VGPR (very good partial response) was defined as improvement of at least 2 stages in the severity of GI-GvHD except improvement to stage 0.
- PR (partial response) was defined as improvement of one stage in the severity of GI GvHD except improvement to stage 0.
Histology
For confirmation of acute GI-GvHD via histopathology, Patient 3 underwent colonoscopy and intestinal biopsies were obtained from the sigma at Day -5 (Before FMT) and Day + 289 (after FMT). Intestinal biopsies were formalin-fixed and embedded in paraffine. Slides were cut with a rotary microtome and stained with hematoxylin-eosin and IHC was performed for CD3 (DCS, clone SP7) and Ki67 (Dako, clone MIB1) (Supp. Table 6). The grading of GvHD was performed according to Lerner (1974)33. All slides were scanned with a slide scanner (Leica, AT2) and images were acquired using Aperio ImageScope (v. 12.3.1.5011).
ChipCytometry
ChipCytometry of human FFPE biopsies was performed according to the procedure described by Jarosch et al.89. Briefly, tissue sections were rehydrated on coverslips and antigen retrieval was performed using TRIS-EDTA buffer (pH 8.5). Sections were then transferred to CellSafe Chips (ZELLKRAFTWERK) and cyclic immunofluorescence with photobleaching was performed on the chip (Supp. Table 6).