Predictive proteomic profile for severe COVID-19 is correlated with inflammatory factors among healthy individuals
First, we derived and validated a blood proteomic risk score (PRS) related to progression to severe COVID-19. In a prior serum proteomic profiling study of COVID-19 patients (18 non-severe cases and 13 severe cases), 22 proteomic biomarkers contributed to the prediction of progression to severe COVID-19 [11]. Using this cohort, we constructed the PRS among the 31 COVID-19 patients based on 20 proteomic biomarkers (Table S2). We used only 20 of the 22 proteins because 2 proteins were unavailable in our large proteomics database of non-infected participants, which was needed for the further analyses. Among the COVID-19 patients, Poisson regression analysis indicated that each 10% increment in the PRS was associated with a 57% higher risk of progressing to the clinically severe phase (RR, 1.57; 95% CI, 1.35–1.82; Fig. 2A), supporting the PRS as a valid proxy for the predictive biomarkers of severe COVID-19.
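To make the reported effect size easier to read, the following minimal sketch converts the published RR and 95% CI back to the log scale on which a Poisson model is fitted. It assumes a Wald-type interval (so the CI is symmetric on the log scale) and a log-linear dose-response; the helper names `log_scale_summary` and `rr_for_increment` are illustrative only, and the extrapolation to a 20% increment is an assumption of the sketch, not a result reported here.

```python
import math

# Values reported in the text: RR per 10% PRS increment and its 95% CI.
rr, ci_low, ci_high = 1.57, 1.35, 1.82


def log_scale_summary(rr, ci_low, ci_high, z_crit=1.96):
    """Recover the log-scale coefficient, its standard error and a Wald z
    statistic from a reported ratio and its CI (assumes a Wald interval)."""
    beta = math.log(rr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    return beta, se, beta / se


def rr_for_increment(rr_per_step, n_steps):
    """Under a log-linear model, risk ratios multiply across increments."""
    return rr_per_step ** n_steps


beta, se, z = log_scale_summary(rr, ci_low, ci_high)

# For a Wald interval the point estimate is the geometric mean of the bounds,
# which gives a quick internal-consistency check on the reported numbers.
ci_midpoint = math.sqrt(ci_low * ci_high)

print(f"log(RR) = {beta:.3f}, SE = {se:.3f}, z = {z:.2f}")
print(f"geometric mean of CI bounds = {ci_midpoint:.3f}")
print(f"implied RR for a 20% increment (assumption) = {rr_for_increment(rr, 2):.2f}")
```

The geometric mean of the CI bounds lands very close to the reported point estimate, which is what one would expect if the interval was computed on the log scale.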
To explore the potential implications of the PRS among non-COVID-19 individuals, we constructed the PRS using the same set of 20 blood proteins in a cohort of non-infected participants with data on both proteomics and inflammatory markers (n = 990). The blood proteomic data were based on the baseline serum samples of the cohort (Figure S1). We investigated the correlation between the PRS and the blood inflammatory markers IL-1β, IL-6, TNF-α and hsCRP. The PRS was significantly and positively correlated with serum concentrations of hsCRP and TNF-α (p < 0.001 and p < 0.05, respectively), but not with the other markers (Fig. 2B). As age and sex are important factors related to susceptibility to SARS-CoV-2 infection, we performed subgroup analyses stratified by age (< 58 years vs. ≥ 58 years, with 58 years being the median age of this cohort) and by sex. Interestingly, a higher PRS was significantly correlated with higher serum concentrations of all the aforementioned inflammatory markers among older individuals (≥ 58 years, n = 493), but not among younger individuals (< 58 years, n = 497) (Fig. 2B and 2C). The association of the PRS with the inflammatory markers did not differ by sex (Figure S2). Whether the identified proteomic changes causally induce immune activation or are consequences of the immune response is not clear at present, but the finding supports the hypothesis that the PRS may act as a biomarker of an unbalanced host immune system, especially among older adults.
Core microbiota features predict COVID-19 proteomic risk score and host inflammation
To investigate the potential role of the gut microbiota in the susceptibility of healthy individuals to COVID-19, we next explored the relationship between the gut microbiota and the above COVID-19-related PRS in a sub-cohort of 301 participants with both gut microbiota (16S rRNA) and blood proteomics data (Figure S1). Gut microbiota data were collected and measured during a follow-up visit of the cohort participants. A cross-sectional subset of the individuals (n = 132) had blood proteomic data at the same time point as the stool collection, and an independent prospective subset (n = 169) had proteomic data at the next follow-up visit, ~3 years after the stool collection.
In the cross-sectional subset, using a machine learning method (LightGBM) with a conservative tenfold cross-validation strategy, we identified the 20 most predictive operational taxonomic units (OTUs). This subset of core OTUs was strongly predictive of the PRS (cross-validated Pearson’s r = 0.59, p < 0.001 across ten cross-validations). The predictive capacity of the core OTUs for the PRS substantially exceeded that of demographic characteristics and laboratory tests, including age, BMI, sex, blood pressure and blood lipids (Pearson’s r = 0.154, p = 0.087) (Fig. 3A). The list of these core OTUs, along with their taxonomic classification, is provided in Table S3. These OTUs were mainly assigned to the Bacteroides genus, Streptococcus genus, Lactobacillus genus, Ruminococcaceae family, Lachnospiraceae family and Clostridiales order.
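The cross-validated correlation can be illustrated with a minimal, dependency-free sketch. It makes several assumptions that are not taken from the analysis above: a univariate least-squares line stands in for LightGBM, the data are synthetic, and out-of-fold predictions are pooled across the ten folds before computing a single Pearson's r (averaging per-fold correlations would be an equally plausible reading). All function names are hypothetical.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def kfold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and deal the indices into k disjoint test folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Stand-in learner (NOT LightGBM): univariate least-squares regression."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((a - mx) ** 2 for a in xs)
    slope = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    return lambda x: intercept + slope * x


def cross_validated_r(x, y, k=10, seed=0):
    """Pearson's r between pooled out-of-fold predictions and observed values."""
    preds = [None] * len(x)
    for test_idx in kfold_indices(len(x), k, seed):
        held_out = set(test_idx)
        train = [i for i in range(len(x)) if i not in held_out]
        model = fit_line([x[i] for i in train], [y[i] for i in train])
        for i in test_idx:
            # Each sample is predicted by a model that never saw it.
            preds[i] = model(x[i])
    return pearson_r(preds, y)


# Synthetic data sized like the cross-sectional subset (n = 132); the feature
# and the outcome are simulated and carry no biological meaning.
rng = random.Random(1)
x = [rng.gauss(0, 1) for _ in range(132)]
y = [0.6 * a + rng.gauss(0, 0.8) for a in x]
r_cv = cross_validated_r(x, y, k=10)
print(f"cross-validated Pearson r = {r_cv:.2f}")
```

Because every prediction comes from a model trained without the corresponding sample, the resulting correlation is not inflated by in-sample fitting, which is the point of reporting a cross-validated rather than an apparent r.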
Additionally, we used co-inertia analysis (CIA) to test the co-variance between the 20 identified core OTUs and the 20 predictive proteomic biomarkers of severe COVID-19, yielding an RV coefficient (ranging from 0 to 1) that quantifies their closeness. The results indicated a significant association between these OTUs and the proteomic biomarkers (RV = 0.12, p < 0.05) (Figure S3A). When this analysis was repeated with stratification by age, a significant association was observed only among older participants (age ≥ 58, n = 66; RV = 0.22, p < 0.05) (Figure S3B and S3C).
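For readers unfamiliar with the RV coefficient, the sketch below computes it for two tables measured on the same samples and attaches a row-permutation p-value. This is the plain RV on column-centered tables; a full co-inertia analysis typically applies its own table transformations and weights first, so the sketch is an approximation under that stated simplification, and the function names and toy data are hypothetical.

```python
import math
import random


def center_columns(m):
    """Subtract each column mean so that columns have mean zero."""
    n = len(m)
    means = [sum(row[j] for row in m) / n for j in range(len(m[0]))]
    return [[row[j] - means[j] for j in range(len(row))] for row in m]


def cross_product(a, b):
    """Return A^T B for two matrices with the same number of rows."""
    return [[sum(a[i][p] * b[i][q] for i in range(len(a)))
             for q in range(len(b[0]))]
            for p in range(len(a[0]))]


def frobenius_sq(m):
    """Squared Frobenius norm of a matrix."""
    return sum(v * v for row in m for v in row)


def rv_coefficient(x, y):
    """RV = ||X'Y||_F^2 / (||X'X||_F * ||Y'Y||_F) on centered tables."""
    xc, yc = center_columns(x), center_columns(y)
    num = frobenius_sq(cross_product(xc, yc))
    den = math.sqrt(frobenius_sq(cross_product(xc, xc))
                    * frobenius_sq(cross_product(yc, yc)))
    return num / den


def rv_permutation_test(x, y, n_perm=199, seed=0):
    """Permute the rows of y to break the sample pairing; return (RV, p)."""
    observed = rv_coefficient(x, y)
    rng = random.Random(seed)
    rows = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(rows)
        if rv_coefficient(x, rows) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Toy tables: 6 samples, 2 "OTU" columns and 3 "protein" columns.
otus = [[1, 2], [2, 1], [3, 5], [4, 3], [5, 6], [6, 4]]
proteins = [[2, 1, 0], [1, 3, 1], [4, 2, 2], [3, 5, 1], [6, 4, 3], [5, 6, 2]]
rv, p_value = rv_permutation_test(otus, proteins)
print(f"RV = {rv:.3f}, permutation p = {p_value:.3f}")
```

An RV of 1 means the two tables carry identical sample configurations up to rotation and scaling, and 0 means no shared structure, so modest values can still be statistically significant when assessed by permutation.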
Importantly, the above results from the cross-sectional analyses were replicated in the independent prospective subset of 169 individuals, which showed a Pearson’s r of 0.18 between the PRS predicted from the core OTUs and the actual PRS (p < 0.05), again outperforming the predictive capacity of the above demographic characteristics and laboratory tests (Pearson’s r = 0.08, p = 0.31) (Fig. 3A). These findings suggest that changes in the gut microbiota may precede changes in the blood proteomic biomarkers, pointing to a potential causal relationship.
To further verify the reliability of these core OTUs, we examined, in another larger independent sub-cohort of 366 participants (Figure S1), the cross-sectional relationship between the core OTUs and 10 host inflammatory cytokines: IL-1β, IL-2, IL-4, IL-6, IL-8, IL-10, IL-12p70, IL-13, TNF-α and IFN-γ. We found that 11 microbial OTUs were significantly associated with the inflammatory cytokines (Fig. 3B). Specifically, OTUs assigned to the Bacteroides genus, Streptococcus genus and Clostridiales order were negatively correlated with most of the tested inflammatory cytokines, whereas those assigned to the Ruminococcus genus, Blautia genus and Lactobacillus genus showed positive associations.
Fecal metabolome may be the key to link the PRS-related core microbial features and host inflammation
We hypothesized that the influence of the core microbial features on the PRS and host inflammation was driven by specific microbial metabolites. We therefore assessed the relationship between the core gut microbiota and the fecal metabolome among 987 participants, whose fecal metabolomics and 16S rRNA microbiome data were collected and measured at the same time point during the follow-up visit (Figure S1). After correction for multiple testing (FDR < 0.05), a total of 183 fecal metabolites were significantly correlated with at least one selected microbial OTU. Notably, 45 fecal metabolites, mainly within the categories of amino acids, fatty acids and bile acids, showed significant associations with more than half of the selected microbial OTUs (Fig. 4A); these metabolites might play a key role in mediating the effect of the core gut microbiota on host metabolism and inflammation.
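The multiple-testing step can be sketched as follows. The text states only that an FDR threshold of 0.05 was applied; the Benjamini-Hochberg procedure used here is an assumption (it is the most common FDR method), and the p-values in the example are made up for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity so that
    # an adjusted value never exceeds that of a larger raw p-value.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four metabolite-OTU correlations.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
significant = [i for i, q in enumerate(adj) if q < 0.05]
print(adj, significant)
```

In an analysis of this shape, the adjustment would be applied across all metabolite-OTU pairs tested, and a metabolite would be retained if any of its adjusted p-values fell below the threshold.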
Based on these key metabolites, we performed metabolic pathway analysis to elucidate possible biological mechanisms. The results showed that these 45 fecal metabolites were mainly enriched in three pathways: aminoacyl-tRNA biosynthesis; arginine biosynthesis; and valine, leucine and isoleucine biosynthesis (Fig. 4B). Fifteen fecal metabolites were involved in the aminoacyl-tRNA biosynthesis pathway, which is responsible for adding amino acids to nascent peptide chains and is a target for inhibiting cytokine-stimulated inflammation (Fig. 4C). Additionally, 4 metabolites were associated with the arginine biosynthesis pathway, and 3 metabolites were enriched in the biosynthesis pathway of valine, leucine and isoleucine (known as branched-chain amino acids, BCAAs) (Fig. 4C).
Host and environmental factors are correlated with the PRS-related core microbial OTUs
As demographic, socioeconomic, dietary and lifestyle factors may all be closely related to the gut microbiota, we explored how much of the variance in the identified core OTU composition was explained by these host and environmental factors. A total of 40 items belonging to two categories (i.e., demographic/clinical factors and dietary/nutritional factors) were tested (Fig. 5), which together explained 3.6% of the variation in inter-individual distance of the core OTU composition (Bray-Curtis distance). Among the demographic/clinical factors, which explained 2.4% of the variation, we observed associations of 9 items (i.e., sex, education, physical activity, diastolic blood pressure, blood glucose, blood lipids and medicine use for type 2 diabetes) with inter-individual distances in the core OTU composition (PERMANOVA, p < 0.05; Fig. 5). In the dietary/nutritional category, which explained 1.1% of the variance, only dairy consumption contributed significantly to the variance of the core OTU composition.
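The variance-partitioning logic behind PERMANOVA on Bray-Curtis distances can be sketched in a few lines. This is a simplified illustration for a single categorical factor with label permutations; the analysis above covers 40 items, including continuous ones, and its exact software settings are not stated here. The function names and the toy abundance table are hypothetical.

```python
import random


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    total = sum(a) + sum(b)
    if total == 0:
        return 0.0
    return sum(abs(p - q) for p, q in zip(a, b)) / total


def distance_matrix(samples):
    n = len(samples)
    return [[bray_curtis(samples[i], samples[j]) for j in range(n)]
            for i in range(n)]


def permanova_stats(dist, labels):
    """Return (R^2, pseudo-F) for one categorical factor."""
    n = len(labels)
    ss_total = sum(dist[i][j] ** 2
                   for i in range(n) for j in range(i + 1, n)) / n
    groups = {}
    for i, g in enumerate(labels):
        groups.setdefault(g, []).append(i)
    ss_within = 0.0
    for members in groups.values():
        ss_within += sum(dist[i][j] ** 2
                         for pos, i in enumerate(members)
                         for j in members[pos + 1:]) / len(members)
    ss_between = ss_total - ss_within
    k = len(groups)
    pseudo_f = (ss_between / (k - 1)) / (ss_within / (n - k))
    return ss_between / ss_total, pseudo_f


def permanova(dist, labels, n_perm=999, seed=0):
    """Permutation p-value: share of shuffled labelings with F >= observed."""
    r2, f_obs = permanova_stats(dist, labels)
    rng = random.Random(seed)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if permanova_stats(dist, shuffled)[1] >= f_obs:
            hits += 1
    return r2, f_obs, (hits + 1) / (n_perm + 1)


# Toy OTU counts for 10 samples in two hypothetical groups.
counts = [[10, 1, 1], [9, 2, 1], [11, 1, 2], [10, 2, 2], [8, 1, 1],
          [1, 10, 5], [2, 9, 6], [1, 11, 4], [2, 10, 5], [1, 8, 6]]
groups = ["A"] * 5 + ["B"] * 5
r2, pseudo_f, p_value = permanova(distance_matrix(counts), groups)
print(f"R^2 = {r2:.2f}, pseudo-F = {pseudo_f:.1f}, p = {p_value:.3f}")
```

The R² returned here is the share of total squared distance attributable to the factor, which is the quantity reported as "variance explained"; the toy groups are deliberately well separated, so their R² is far larger than the few percent typical of real host factors.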