All experimental procedures were approved by the Ethics Committee of Chongqing Medical University (Chongqing, China; 2017013).
FMT experiments
For the FMT experiments, male GF mice (aged 5–6 weeks, weight 30–40 g) were obtained from the NHC Key Laboratory of Diagnosis and Treatment on Brain Functional Diseases (Chongqing, China) and kept in flexible film gnotobiotic isolators before the behavioral tests. During the study, the mice were fed autoclaved chow and water ad libitum. As described previously [15], fecal samples from randomly selected subsets of five patients (drug-naive MDD, 3 females, age 57.2 ± 8.64; HAMD scores ≥ 17) and five health controls (HC, 3 females, age 49.8 ± 7.36) were used to colonization. These participants were recruited from the psychiatric department and medical examination center of the First Affiliated Hospital of Chongqing University with all providing written informed consent. Diagnosis of MDD was assessed by the Structured Psychiatric Interview using DSM-5-TR by experienced psychiatrists.
Briefly, the fecal samples were suspended with sterile PBS (15 mL/g), equal volumes combined for each group. Then these inoculum samples were vortexed for 5 min, followed by 5-min standing to precipitate particles. Each GF mouse was gavaged with 200 mL of fecal suspensions derived from patients or HC. After colonization, the mice remained in a gnotobiotic environment under the same feeding conditions as before. Last, mice underwent behavioral testing 2 weeks after microbiota transplantation.
CSDS experiments
For the CSDS experiments, male C57BL/6J mice (6–8 weeks of age) and CD1 mice (18–20 weeks of age) were obtained from the Animal Center of Chongqing Medical University (Chongqing, China) and had free access to food and water. The room environment for all mice were same (temperature 22 ± 2 ◦C, humidity 50 ± 15%).The CSDS experimental was performed according to the standard protocol [16]. Briefly, the C57BL/6J mice in stressed group were exposed to daily defeats by a novel CD1 mice for 5 minutes over a period of 10 days. Control mice were housed in pairs, with a transparent plastic sheet separating them from each other, and the pairs rotated daily. The detailed experimental procedure has been described in our previous study [13].
Behavioral testing
The depressive-like behaviors of mice were accessed by the measurements of sucrose preference test (SPT), social interaction test (SIT), open filed test (OFT), and force swimming test (FST). Briefly, mice were single housed with two drinking bottles (one tap with water and another with 1% sucrose solution). Sucrose preference was calculated using liquid consumption in 24 h. When the SIT was performed, the mice were placed into a test filed with or without targeted a CD1 mouse. We evaluated social avoidance behavior of mice by recording their movement tracks in different zones (Ethovision 10.0, Noldus, Wageningen, the Netherlands). OFT was performed using an open filed experiment box. Noldus software was used to record and analyze trajectories of mice. The total distance, time spent, and frequency of entrances in the central area were calculated. Finally, we placed experimental mice in a transparent acrylic cylinder with tap water (23 ± 1 ◦C) and recorded the immobility time, respectively. We didn’t perform SPT and SIT on GF mice due to the limitations of the gnotobiotic isolators.
RNA isolation and sequencing
After the behavior tests, brain tissues were rapidly dissected, frozen quickly with liquid nitrogen, and stored at − 80 ◦C. Total RNA was isolated from six mice in each group using TRIzol (Life Technologies, CA, USA). The quality and quantity of extracted RNA were measured using NanoPhotometer (IMPLEN, CA, USA) and Qubit2.0 Flurometer (Life Technologies, CA, USA), respectively. RNA integrity (RIN) was assessed using the RNA Nano6000 Assay Kit of the Bioanalyzer (Agilent, CA, USA). Sequencing libraries were constructed using NEBNext UltraTM RNA Library Prep Kit for Illumina (NEB, USA) following the manufacturer’s recommendations. Illumina NovaSeq6000 platform was used to generated 150 bp paired end reads.
Acquisition of published data CUMS model and human brain
RNA-seq studies of ACC from CUMS model and MDD patients were searched in the Gene Expression Omnibus database (GEO, https://www.ncbi.nlm.nih.gov/gds). A completement transcriptomic dataset were obtained with the accession number GSE102556, including 46 original fastq files (12 MDD cases, 15 HCs, 10 control mice and 9 depressive-like mice induced by CUMS) [11]. The detailed experimental design, sample processing, library preparation, and sequencing procedures were described in the GEO project homepage and the original published manuscript. The information of samples is described in Table S1.
Transcriptomics data processing
To avoid methodological bias, both newly sequenced and public files were processed uniformly. Fastp software (v0.21.0) was used to trim adapters and low-quality reads [17]. Hisat2 tools (v2.2.1) were used to build genome index (Mus_musculus_Ensemble_102 and Homo_sapiens_Ensemble_108) and performed the read alignment [18]. Quality control of alignment was performed by principal component analysis (PCA) using sequencing metrics generated by Picard Tools (v1.131) [19]. The read counts of per gene were calculated using featureCounts (v1.5.0-p3) [20]. Genes expressed with count > 5 in at least 80% of the samples in at least one dataset were kept. Counts were normalized to the fragments per kilobase of transcript per million mapped reads (FPKM) and adjusted for sequencing metrics. Outliers were detected based on sample network connectivity (Z scores), and samples with Z < − 2 were removed [21]. All genes were converted to human one-to-one orthologues for species-cross comparison using the biomaRt package [22].
Global expression pattern analysis
To compare the global transcriptomic effects of microbiome and chronic stress in brain, we separately performed threshold-free analyses at levels of gene expression and function using RRHO and GSEA.
Rank-rank hypergeometric overlap (RRHO) test: RRHO test is developed for assessing the similarity of gene expression trends across different datasets [23]. All genes were ranked by − log10 (p-value) multiplied by the sign. A one-sided test was used to determine over-enrichment. Heatmaps represent the degree of significant overlap between the two ranked gene lists.
Gene set enrichment analysis (GSEA): GSEA is a powerful threshold-free algorithm for determining whether a set of genes is differentially expressed in different groups [24]. We used WebGestaltR package (ver. 0.4.4) to perform GSEA based on Gene Ontology (GO) terms with 1,000-time permutation tests. Terms with absolute value of normalized enrichment score (NES) greater than 1 and false positive discovery rate (FDR) less than 0.25 were considered to be significant [25]. The rrvgo package (ver. 1.2.0) was used to classify and visualize results by remove redundance based on semantic similarity [26].
Differentially expressed genes (DEGs) analysis
Identification and functional analysis of DEGs: To explore the distinct impacts of different models on gene quantity and functions, DEGs were identified using linear model via limma-voom package [27]. For human data, models were adjusted for gender, age, RNA integrity number (RIN), brain pH and postmortem interval (PMI). The classification of DEGs was performed using the PANTHER tools (ver. 17.0, https://pantherdb.org/) [28]. The over-representation analyses of biological processes and pathways were implemented by EnrichR (ver. 3.2) package [29]. SimplifyEnrichment package (ver. 1.7.2) was utilized for clustering and visualizing functional terms [30]. Given that the transcriptomic feature of depression is formed by the accumulation of many small genetic alterations, p-value of 0.05 was adopted for nominal significance in this section, facilitating the identification of a broader spectrum of changes [13, 31].
Cell type enrichment and deconvolution: To investigate the specific neural cell types impacted, we conducted one-sided Fisher’s exact test-based enrichment analysis of DEGs utilizing cell type gene markers from published single-cell RNA-seq (scRNA-seq) studies with the threshold of p-value < 0.05 [32]. Furthermore, we integrated the scRNA-seq data and our bulk-Seq data to survey the alterations of cell type composition. Briefly, we generated a cell type signature matrix using smart-seq data from ALLEN BRAIN MAP (https://portal.brain-map.org/), including six cell types (including GABAergic neuron, glutamatergic neuron, astrocyte, oligodendrocyte, microglia, and endothelia) [33]. Subsequently, the cell type signature matrix and bulk-Seq expression matrices were imported to CEBERSORT R script to perform deconvolution [34].
Multiscale embedded gene co-expression network analysis (MEGENA)
Constructions of gene co-expression network: To investigate the impact of chronic stress and gut microbiota dysbiosis on brain gene interactions, we constructed hierarchical gene co-expression networks for mouse models and MDD patients using MEGENA package (ver. 1.3.7) [35]. The parameters were set with minimum module size = 50 and permutation test = 1,000. The modules of MDD network were labeled by M#, and modules of mouse networks were labeled by F# (FMT), S# (CSDS) and U# (CUMS). We accessed the association between modules and depressive phenotype by performing enrichment analysis to DEGs for each module. One-sided Fisher’s exact test was applied with the significance of p-value < 0.05.
Preservation analysis of network: As described in our previous study [13], we performed network preservation analysis between mouse models and MDD patients using NetRep package (ver. 1.2.4) to ensure that the identified mouse modules are strongly linked to the disease [36]. Briefly, we alternated between using the mouse network and the MDD network as the reference dataset and the test dataset to compute seven preservation statistics for each module. Permutation tests were conducted 10,000 times to make the results robust. The module which was strongly preserved with the p-values of all seven statistics were less than 0.001.
Module different connectivity (MDC) analysis: MDC analysis was used to estimate the alterations of connectivity among co-expressed genes between different groups. The significance of MDC was calculated based on the maximum FDR of 50-time permutation of samples and genes in each module [37].
Protein-protein interaction (PPI) network analysis: The genes of a given module were imported into STRING database (https://cn.string-db.org/) to construct PPI networks. Markov Cluster Algorithm (MCL) was performed with default parameters to divide a large network into several functional clusters [38].
Statistical analyses
The results of behavioral tests and the demographic information of public human dataset are expressed as the mean ± standard error. All analyses and data visualization were performed using R (ver. 4.2.1). The Shapiro–Wilk test was used to determine the normality of data. Student’s t–test was used to analyze normally distributed data, while the Wilcoxon–rank sum test for non–normally distributed data. Fisher’s exact test was used to assess categorical variables.