Patients and specimen collection. Twenty-three patients were enrolled in this study, which was conducted in the pediatric and adult intensive care units (ICUs) of the 7th and 8th Medical Centers of the Chinese PLA General Hospital between June 2022 and September 2023. Detailed participant information is presented in Table S1. The inclusion criteria were ICU admission and a diagnosis of severe ARDS owing to viral pneumonia, with the participants being children under 15 years of age or adults > 45 years of age. The exclusion criteria included respiratory failure due to other acute or chronic pulmonary diseases, refusal of resuscitation or end-of-life status, and severe comorbidities such as immunosuppression, organ transplantation history, and disseminated cancer. All the patients were managed using a standardized clinical protocol. Sex was not considered as a biological variable.
Peripheral blood samples were divided into two groups: ARDS and control. The ARDS group included six children and six adults diagnosed with ARDS based on the Berlin Definition for ARDS and Pediatric Acute Lung Injury Consensus Conference criteria (44, 45)(Pediatric Acute Lung Injury Consensus Conference, 2015; Ranieri et al., 2012). The control group consisted of five adults from health screening and six children undergoing elective surgery, all free of infection or acute inflammatory response (Table S1).
For patients with ARDS, blood samples were collected within 24 h of initiating invasive mechanical ventilation or ECMO (T1) and again when their condition improved sufficiently to discontinue these supports (survival-T2). If life support was unsuccessful, a final sample (death-T2) was collected before terminating the extracorporeal life support.
The participants were categorized into seven groups: child control, adult control, PARDS-T1 (subgroup: PARDS-T1-survival; PARDS-T1-death), AARDS-T1, PARDS-T2 (subgroup: PARDS-T2-survival; PARDS-T2-death), and AARDS-T2. Data on patient demographics, laboratory tests, microbiological findings, and medical interventions were meticulously obtained from electronic medical records and cataloged using Microsoft Excel.
Processing of blood samples for scRNA-seq. Venous blood samples (3 mL) were collected from each participant in Vacutainer EDTA tubes (BD, Cat#: 367861). The collected samples were incubated at 20°C for 20 minutes before being diluted with Hank's Balanced Salt Solution (Solarbio, Catalog #: H1025). Red blood cells were lysed using 5 mL of buffer at 26°C for 2 min. The reaction was stopped by adding 20 mL of phosphate-buffered saline (PBS) containing 10% Fetal Calf Serum, followed by centrifugation at 300 g for 10 min. The resultant cells were then resuspended in 1 mL of PBS kept at 0°C, achieving a final concentration between 7–12×10⁵ cells/mL with viability over 80%. scRNA libraries were constructed using the 10x Genomics® 5' kit v2 (10x Genomics, Cat#: PN-1000263).
scRNA-seq data analysis. For scRNA-seq data, raw datasets were aligned and quantified against the GRCh38 human reference genome utilizing Cell Ranger software (version 4.0.0) to obtain gene-cell matrices. Doublet cells were removed using DoubletFinder (version 2.0.3). Subsequent data processing steps, including quality control, handling of batch effects, dimension reduction, and clustering, were managed with the Seurat package (version 4.3.0). Quality control ensured total RNA counts between 200 and 5,000 and mitochondrial gene percentages < 10%. The top 2,000 most variable genes were determined via the "FindVariableFeatures," "ScaleData," and "RunPCA" functions. Batch effects were addressed by using the Harmony package (version 0.1.1). Clusters were identified using “FindNeighbors,” “FindClusters,” and “RunUMAP” and visualized with UMAP. Cell types were determined using canonical markers.
For differential expression analysis, genes for cell types with > 20 cells were identified using the "FindMarkers" function in Seurat, and visualized with "VlnPlot" (adjusted p-value using Bonferroni correction < 0.05). Heatmaps illustrating gene expression were created using Complex Heatmap software (version 2.12.0). Biological processes were enriched based on GO analysis with p-values < 0.01 using Metascape11 (version 3.5, https://metascape.org/). GO terms were visualized and scored using the Seurat package.
All analyses were performed using the R statistical software.
Cell–cell communication analysis. CellChat was employed to deduce and scrutinize intercellular communication networks using data derived from both our study and public datasets, leveraging its ligand-receptor interaction databases (http://www.cellchat.org/).Interactions were discerned and assessed based on the expression levels of ligands and receptors differentially overexpressed in each cell group (P < 0.05). Comparisons of communication probabilities in ligand-receptor pairs regulated by specific cell populations were made using the "netVisual_bubble" function, configured with the compare parameter.
BCR analysis. Single-cell V(D)J sequencing was executed following protocols provided with the 10X Genomics Chromium Single Cell Immune Profiling Solution. Data were processed using the Cell Ranger pipeline (version 6.0.0, 10X Genomics), then analyzed using the Loupe V(D)J Browser v 4.0.0. V(D)J sequences were assembled, and paired clonotypes were called using Cellranger vdj with the reference set to refdata-cellranger-vdj-GRCh38-alts-ensembl-6.0 for each sample. The scRepertoire package (version 3.12) was used for data analysis. Diversity was quantified through four indices: Shannon, Inverse Simpson, Chao1, and Abundance-based Coverage Estimator (ACE), where the former two estimated baseline diversity, and Chao1 and ACE indices assessed sample richness. The clonal homeostasis function distinguished the relative proportions of clone sizes (single, small, medium, large, and hyperexpanded).
Single-cell-wise gene signature scoring based on scRNA-seq data. Gene sets for scoring were sourced from the Gene Set Enrichment Analysis (GSEA, www.gsea-msigdb.org/gsea/msigdb/collections.jsp#H). The UCell R package was utilized to estimate gene signature enrichment scores (UCell scores) in single-cell datasets by employing the Mann–Whitney U test on the count matrix derived from scRNA-seq data to evaluate per-cell enrichment for each gene set.
External validation by public datasets. Two public datasets (GSE157789 and GSE163668) were used to validate our data. These datasets included adults with ARDS induced by viral infections, featuring survivors and deceased individuals, and single-cell sequencing data, including neutrophils. In GSE157789, we included five controls (G1, G2, G3, G4, and G5), seven survivors (C1, C3, C4, C6, C7, C8, and C9) and one deceased individual (C5). In GSE163668, we included 14 controls (ICC_0003, ICC_0004, ICC_0005, ICC_4096, ICC_4319, ICC_4923, ICC_3231, ICC_1117, ICC_0001, ICC_1084, ICC_4444, ICC_4799, ICC_4117, and ICC_2867), 9 survivors (1001, 1003, 1002, 1010, 1031, 1038, 1046, 1055, 1060), and 1 deceased individual (1047). After merging these datasets and removing batch effects, we analyzed them using the same methods as those used in our single-cell transcriptome sequencing. The scRepertoire package (version 3.12) was used for data analysis. Diversity was quantified through four indices: Shannon, Inverse Simpson, Chao1, and Abundance-based Coverage Estimator (ACE), where the former two estimated baseline diversity, and Chao1 and ACE indices assessed sample richness. The clonal homeostasis function distinguished the relative proportions of clone sizes (single, small, medium, large, and hyperexpanded).
Statistical analysis. Statistical analyses were conducted using R software (version 4.3.0) and GraphPad Prism (version 9). To compare the means or medians of continuous variables, Wilcoxon rank-sum test and t-test were employed. For multiple group comparisons, P-values were adjusted utilizing the Bonferroni post hoc test. Metascape (version 3.5, https://metascape.org/) served for the enrichment analysis of biological processes based on GO analysis with a significance threshold set at p < 0.01 (cumulative hypergeometric distribution). The gene set module in the Seurat package was used to visualize and score the GO terms.