Study overview
The workflow of our study is summarized in Figure 1. Starting with the 34,519 GWAS compiled by the IEU Open GWAS project, we focused on the 14,422 that are based on European-descent samples, in order to match the major ancestry in the GWAS of COVID-19 and to avoid false positives resulting from population differences in genetic effects (Supplementary Tables 2 and 3). Genetic instruments were selected from each GWAS as independent genetic variants reaching genome-wide significance, and their effect alleles were harmonized with the outcome GWAS. Because statistical tests of pleiotropic effects require three or more genetic instruments, exposures with fewer instruments were excluded. For the univariable MR analysis of each exposure-outcome pair, we first applied the inverse variance-weighted (IVW) method with a multiplicative random-effects model [30]. We then evaluated the possible presence of pleiotropic effects with Cochran’s Q test of heterogeneity and the MR-Egger intercept test for directional pleiotropy [31-33]. We excluded all exposures with indications of pleiotropy in their genetic instruments in order to satisfy the key assumptions underlying MR analysis. We retained 6,442 GWAS for the discovery analysis with the HGI A2 study (Supplementary Table 4), 6,407 GWAS for the replication analysis with the HGI B2 study (Supplementary Table 5), and 6,248 GWAS for the replication analysis with the NEJM study (Supplementary Table 6). The false discovery rate (FDR) approach was used to correct for multiple testing across exposures (Supplementary Tables 7-9).
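The per-exposure tests described above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in the study: the function names, the input arrays (`bx`, `by`, `se_y` for the instrument-exposure effects, instrument-outcome effects, and outcome standard errors), and the choice to floor the overdispersion factor at 1 are all assumptions made for the example.

```python
import numpy as np
from scipy import stats


def ivw_mre(bx, by, se_y):
    """IVW estimate with a multiplicative random-effects standard error.

    Returns (beta, se, p, Q, Q_p). Assumption: the residual variance
    scale is floored at 1, so the SE is never smaller than fixed-effect.
    """
    bx, by, se_y = (np.asarray(a, dtype=float) for a in (bx, by, se_y))
    w = 1.0 / se_y**2                          # inverse-variance weights
    beta = np.sum(w * bx * by) / np.sum(w * bx**2)
    q = np.sum(w * (by - beta * bx) ** 2)      # Cochran's Q
    df = len(bx) - 1
    phi = max(1.0, q / df)                     # overdispersion factor
    se = np.sqrt(phi / np.sum(w * bx**2))
    p = 2 * stats.norm.sf(abs(beta / se))
    q_p = stats.chi2.sf(q, df)                 # heterogeneity p-value
    return beta, se, p, q, q_p


def egger_intercept(bx, by, se_y):
    """MR-Egger intercept, its SE, and p-value (needs >= 3 instruments)."""
    bx, by, se_y = (np.asarray(a, dtype=float) for a in (bx, by, se_y))
    sign = np.where(bx < 0, -1.0, 1.0)         # orient all bx to be positive
    bx, by = bx * sign, by * sign
    w = 1.0 / se_y**2
    X = np.column_stack([np.ones_like(bx), bx])
    xtwx = X.T @ (w[:, None] * X)
    coef = np.linalg.solve(xtwx, X.T @ (w * by))
    df = len(bx) - 2
    resid = by - X @ coef
    scale = max(1.0, np.sum(w * resid**2) / df)
    se_int = np.sqrt(scale * np.linalg.inv(xtwx)[0, 0])
    p_int = 2 * stats.t.sf(abs(coef[0] / se_int), df)
    return coef[0], se_int, p_int
```

Under this sketch, an exposure would be flagged for exclusion when either the Q test or the Egger intercept test is nominally significant; the exact thresholds used in the study are not stated in this section.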
Based on these three sets of analyses, we defined two sets of results: 1) the significant and replicated results, which have a q-value < 0.05 in the discovery analysis and a nominal p-value < 0.05 in at least one of the two replication studies (Supplementary Table 10); and 2) the suggestive and replicated results, which have a nominal p-value < 0.05 in the discovery analysis and a nominal p-value < 0.05 in at least one of the two replication studies (Supplementary Table 11). A total of 49 significant and replicated traits were identified. Among them, 17 were replicated in both replication datasets (Table 1, Supplementary Table 12).
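The two result tiers can be expressed as a small classification rule. The sketch below is illustrative only: the section does not name the specific FDR procedure, so Benjamini-Hochberg adjustment is assumed here, and the function names `bh_qvalues` and `classify` are hypothetical.

```python
def bh_qvalues(pvals):
    """Benjamini-Hochberg adjusted p-values (assumed FDR procedure)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        q[i] = running_min
    return q


def classify(q_discovery, p_discovery, p_replications, alpha=0.05):
    """Label one exposure according to the two tiers defined in the text.

    p_replications holds the replication p-values; None marks an
    exposure that was not tested in that replication dataset.
    """
    replicated = any(p is not None and p < alpha for p in p_replications)
    if not replicated:
        return None
    # A q-value below alpha implies a nominal p-value below alpha, so the
    # stricter label is checked first and reported when both apply.
    if q_discovery < alpha:
        return "significant and replicated"
    if p_discovery < alpha:
        return "suggestive and replicated"
    return None
```

For example, `classify(0.2, 0.03, [0.04, None])` returns the suggestive label, because the discovery result passes only the nominal threshold while one replication is nominally significant.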
BMI-related traits
In the univariable MR study, eight BMI-related traits are positively associated with severe COVID-19 in our discovery analysis and also in both of our replication analyses (Table 1). A genetically predicted one-standard-deviation (SD) increase in BMI is associated with a higher risk of severe COVID-19 (OR: 1.89, 95% CI: 1.51–2.37, p = 1.78 × 10⁻⁶). Consistent with BMI, genetically instrumented higher hip circumference (OR: 1.46, 95% CI: 1.15–1.85, p = 0.0017) and waist circumference (OR: 1.82, 95% CI: 1.36–2.43, p = 6.20 × 10⁻⁵) are associated with a higher risk of severe COVID-19. The univariable MR study also provided strong evidence that weight and fat mass in the left arm, right arm, left leg, right leg, trunk, and whole body are positively associated with severe COVID-19 (Supplementary Tables 10 and 11).
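MR estimates for a binary outcome are obtained on the log-odds scale and reported as odds ratios per SD increase in the exposure. The following sketch shows that conversion under a normal approximation; the function name and the input values are hypothetical and are not taken from the study's tables.

```python
import math


def or_summary(beta, se, z=1.959963984540054):
    """Convert a log-odds estimate and its SE to (OR, CI low, CI high, p).

    Assumes a normal approximation; z is the 97.5th percentile of the
    standard normal distribution, giving a 95% confidence interval.
    """
    odds_ratio = math.exp(beta)
    ci_low = math.exp(beta - z * se)
    ci_high = math.exp(beta + z * se)
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(beta / se) / math.sqrt(2))
    return odds_ratio, ci_low, ci_high, p


# Hypothetical input: log-odds of ln(2) per SD with a standard error of 0.1.
print(or_summary(math.log(2.0), 0.1))
```

The interval is symmetric on the log scale, which is why reported confidence limits are not equidistant from the odds ratio itself.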
To pinpoint the different aspects of BMI-related traits, we investigated the roles of fat mass and fat-free mass indices in severe COVID-19 (Supplementary Table 13). In the multivariable MR analysis controlling for fat-free mass, there is strong evidence for direct causal effects of fat mass measured at different body parts, including the whole body, left and right arms, left and right legs, and the trunk. The evidence is consistent across the three GWAS of COVID-19 (Fig. 2, Supplementary Table 13). On the other hand, there is no evidence for direct causal effects of fat-free mass (Fig. 3, Supplementary Table 13). The multivariable MR analysis results indicate that the causal effects of BMI-related traits on severe COVID-19 are mainly driven by fat mass.
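The idea behind the multivariable analysis can be sketched as a weighted regression of the instrument-outcome effects on the instrument effects for both exposures at once, so that each coefficient is a direct effect conditional on the other exposure. This is a minimal IVW-style illustration under assumed inputs (a matrix `bx` with one column per exposure, for instance fat mass and fat-free mass); it is not the implementation used in the study.

```python
import numpy as np


def mvmr_ivw(bx, by, se_y):
    """Multivariable IVW: direct effects of several exposures.

    bx   : array of shape (n_instruments, n_exposures)
    by   : instrument-outcome effects, shape (n_instruments,)
    se_y : standard errors of by, shape (n_instruments,)
    Returns (beta, se), each of shape (n_exposures,).
    """
    bx = np.asarray(bx, dtype=float)
    by = np.asarray(by, dtype=float)
    w = 1.0 / np.asarray(se_y, dtype=float) ** 2
    # Weighted least squares through the origin.
    xtwx = bx.T @ (w[:, None] * bx)
    beta = np.linalg.solve(xtwx, bx.T @ (w * by))
    resid = by - bx @ beta
    df = bx.shape[0] - bx.shape[1]
    # Assumption: residual variance scale floored at 1 (random effects).
    scale = max(1.0, np.sum(w * resid**2) / df)
    se = np.sqrt(np.diag(scale * np.linalg.inv(xtwx)))
    return beta, se
```

In this framing, a fat mass coefficient that stays away from zero while the fat-free mass coefficient does not is what the text describes as evidence for a direct effect of fat mass.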
White blood cell traits
In the univariable MR analyses, we identified a group of five white blood cell traits that are negatively associated with the risk of severe COVID-19. Specifically, suggestive associations were found for neutrophil count (OR: 0.76, 95% CI: 0.61–0.94, p = 0.013), sum basophil neutrophil counts (OR: 0.71, 95% CI: 0.57–0.87, p = 0.001), sum neutrophil eosinophil counts (OR: 0.76, 95% CI: 0.61–0.95, p = 0.015), myeloid white cell count (OR: 0.77, 95% CI: 0.62–0.96, p = 0.0197), and granulocyte count (OR: 0.75, 95% CI: 0.601–0.93, p = 0.009) (Fig. 4). For all five traits, causal estimates are broadly concordant between the weighted median (WM) and weighted mode methods, and consistent directions of effect were also found with the MR-Egger method (Supplementary Table 11). Taking neutrophil count as an example, consistent estimates of a protective effect were found with WM (OR: 0.61, 95% CI: 0.42–0.88, p = 0.009) and weighted mode (OR: 0.59, 95% CI: 0.39–0.91, p = 0.017). Overall, our findings support the causal roles of white blood cells, especially neutrophils, in reducing the risk of developing severe COVID-19.
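The weighted median is useful as a sensitivity analysis because it remains consistent when up to half of the instrument weight comes from invalid instruments. The sketch below computes only the point estimate, using first-order inverse-variance weights and linear interpolation at the 50th weighted percentile; the function name and inputs are assumptions, and the confidence interval (usually obtained by bootstrap) is omitted.

```python
import numpy as np


def weighted_median_estimate(bx, by, se_y):
    """Weighted median of per-instrument ratio estimates (needs >= 2)."""
    bx, by, se_y = (np.asarray(a, dtype=float) for a in (bx, by, se_y))
    ratio = by / bx                      # Wald ratio for each instrument
    weights = (bx / se_y) ** 2           # first-order inverse-variance weights
    order = np.argsort(ratio)
    ratio, weights = ratio[order], weights[order]
    # Weighted percentile of each ordered estimate, centred on its weight.
    cum = (np.cumsum(weights) - 0.5 * weights) / np.sum(weights)
    below = np.max(np.nonzero(cum < 0.5)[0])
    # Interpolate linearly between the two estimates that bracket 50%.
    frac = (0.5 - cum[below]) / (cum[below + 1] - cum[below])
    return ratio[below] + (ratio[below + 1] - ratio[below]) * frac
```

With four concordant instruments and one strong outlier of equal weight, the estimate stays at the concordant value, whereas an IVW estimate would be pulled toward the outlier.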
Circulating proteins
Our univariable MR analyses revealed evidence of causal effects for several circulating proteins. There are six proteins whose effects on severe COVID-19 are significant in the discovery MR analysis (q-value < 0.05) and also replicated in both replication analyses (p-value < 0.05) (Table 1, Supplementary Table 12). Three of them are negatively associated with the risk of severe COVID-19: interleukin-3 receptor subunit alpha (OR: 0.87, 95% CI: 0.79–0.94), interleukin-6 receptor subunit alpha (OR: 0.88, 95% CI: 0.83–0.94), and prostate-associated microseminoprotein alpha (OR: 0.71, 95% CI: 0.58–0.86). The other three are risk-increasing: zinc-alpha-2-glycoprotein (OR: 1.37, 95% CI: 1.14–1.66), C1GALT1-specific chaperone 1 (OR: 1.20, 95% CI: 1.19–1.21), and corneodesmosin (OR: 1.12, 95% CI: 1.09–1.16). Another six circulating proteins have significant and replicated effects on severe COVID-19, although they are replicated in only one replication analysis (Supplementary Table 10): inter-alpha-trypsin inhibitor heavy chain H1 (OR: 1.08, 95% CI: 1.04–1.12), alpha-2-macroglobulin receptor-associated protein (OR: 1.14, 95% CI: 1.07–1.23), resistin (OR: 1.09, 95% CI: 1.07–1.11), reticulon-4 receptor (OR: 0.86, 95% CI: 0.79–0.93), C-C motif chemokine 23 (OR: 0.88, 95% CI: 0.83–0.92), and collectin-10 (OR: 0.83, 95% CI: 0.76–0.901). Additionally, our suggestive and replicated results revealed another 14 proteins associated with the risk of severe COVID-19 (Supplementary Table 11). Overall, our MR analyses prioritized 26 circulating proteins (12 significant and replicated, 14 suggestive and replicated) that are likely causal in the development of severe COVID-19.