In the present article, we set out to describe the spectrum of risk variants and genes identified by different approaches and their possible relationship to COVID-19 severity and/or mortality. We investigated the frequencies of these variants and evaluated their possible role in a cohort of 27 Slovak patients who died of COVID-19. In our previous studies, we described the reuse of NIPT data for genome-scale, population-specific frequency determination of small DNA variants [45], CNVs [46], and variants associated with colorectal cancer and Lynch syndrome [47]. We therefore assumed that NIPT data could serve as a control group in this population study of COVID-19. As a second control group, we chose the genetic data of NFE, which contains a total of 125,748 exomes and 71,702 genomes. To our knowledge, the present study is the first population analysis of COVID-19 variants, both worldwide and in the Slovak population, that applies different approaches to the analysis of genetic variants in WES data from patients who died of COVID-19.
Over the past two years, GWAS have offered the opportunity to uncover genetic susceptibility factors for COVID-19 and to provide insights into the biological basis of SARS-CoV-2 etiology. To date, the GWAS approach has identified a large number of risk variants and genes that have been linked to COVID-19 susceptibility and severity [32–35, 48]. However, after merging all COVID-19 risk variants from the GWAS Catalog with the WES data of our deceased patients, we identified only three variants in common that lie in coding sequence: a missense variant rs11147040, a synonymous variant rs72472161, and a coding sequence variant rs8176719. The other identified variants were intron, non-coding, or UTR variants. Because of this limitation, we found that risk variants from the GWAS Catalog are of little use for analyzing and comparing our WES data.
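The overlap step described above can be illustrated with a minimal sketch. It assumes the catalog is available as a mapping from rsID to a consequence term and the WES calls as a set of rsIDs. The function name, the set of terms treated as coding, and every `placeholder_*` entry are hypothetical; only the three rsIDs and their consequences come from the text.

```python
from collections import Counter

# Consequence terms treated as coding here (an assumption for this sketch).
CODING = {"missense_variant", "synonymous_variant", "coding_sequence_variant"}


def shared_by_consequence(catalog, wes_ids):
    """Return shared coding rsIDs and a count of shared variants per consequence."""
    # Keep only catalog variants that were also called in the WES data.
    shared = {rsid: cons for rsid, cons in catalog.items() if rsid in wes_ids}
    coding = sorted(rsid for rsid, cons in shared.items() if cons in CODING)
    return coding, Counter(shared.values())


# Hypothetical inputs: the three rsIDs are those named in the text,
# the placeholder entries are invented for illustration only.
catalog = {
    "rs11147040": "missense_variant",
    "rs72472161": "synonymous_variant",
    "rs8176719": "coding_sequence_variant",
    "placeholder_intron": "intron_variant",
    "placeholder_utr": "3_prime_UTR_variant",
    "placeholder_absent": "intergenic_variant",
}
wes_ids = {
    "rs11147040",
    "rs72472161",
    "rs8176719",
    "placeholder_intron",
    "placeholder_utr",
}

coding, per_consequence = shared_by_consequence(catalog, wes_ids)
print(coding)
print(per_consequence)
```

Grouping the shared variants by consequence makes the limitation visible at a glance: exome capture mostly covers coding sequence, so catalog variants outside it contribute little to a WES-based comparison.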
In 2021, a large consortium published highly anticipated studies. The COVID-19 HGI presented results from three genome-wide association meta-analyses of up to 49,562 COVID-19 patients from 46 studies across 19 countries [49] and reported 13 genome-wide significant loci. The 3p21.31 region appeared to be associated with infection susceptibility, which was also confirmed in the study by Ellinghaus et al. That study also confirmed a potential involvement of the ABO blood-group system [32]. Similar results were found in a study conducted by the company 23andMe using its biobank. After Bonferroni correction in our analysis of COVID-19 missense risk variants taken from the COVID-19 HGI Browser, we identified five variants with a significantly different representation: a missense variant rs3130984 located in the CDSN gene and four variants (rs8176747, rs8176746, rs8176743, and rs7853989) all located in the ABO gene. Two missense variants, rs8176747 and rs8176746, were found in the comparison between WES and NFE data both in the C2 HGI group and in group B of the literature-search analysis. Recently, GWAS found a COVID-19 association signal at locus 9q34.2, coincident with the ABO blood group (rs8176747, rs41302905, rs8176719), in Italian and Spanish patients with severe COVID-19 and respiratory failure [32, 50]. Another study proposed a genetic hypothesis on the role of RAS-pathway genes (ACE1 and ACE2) and the ABO locus (rs495828, rs8176746) in COVID-19 prognosis, suspecting that inherited genetic predispositions predict COVID-19 severity [51].
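To make the Bonferroni step concrete, the following is a minimal sketch of a per-variant allele-count comparison between a case cohort and a control cohort. It assumes a two-sided Fisher's exact test on a 2x2 table of alternate and reference allele counts; this section does not state which test was actually used, so the choice is an assumption. The function names, the variant labels, and all allele counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are cohorts (cases, controls); columns are allele counts
    (alternate, reference).
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x alternate alleles among the cases.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(col1, row1)
    # Sum the probabilities of every table at most as likely as the observed one.
    p_value = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-7)
    )
    return min(1.0, p_value)


def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical allele counts: (case_alt, case_ref, control_alt, control_ref).
# 27 patients give 54 alleles per variant; the control counts are invented.
counts = {
    "variant_1": (20, 34, 150, 850),
    "variant_2": (5, 49, 100, 900),
}
raw = [fisher_exact_two_sided(*table) for table in counts.values()]
for name, p_raw, p_adj in zip(counts, raw, bonferroni(raw)):
    print(f"{name}: p = {p_raw:.3g}, Bonferroni-adjusted p = {p_adj:.3g}, "
          f"significant = {p_adj < 0.05}")
```

With only 54 case alleles per variant, the adjusted threshold becomes strict as the number of tested variants grows, which is consistent with only a handful of variants remaining significant after correction.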
Since 2017, the QIAGEN Clinical Insight (QCI™) Analyze software has been offered as an integrated clinical decision support solution for the annotation, interpretation, and reporting of NGS data. QCI seeks to extract clinical significance and actionability from sequencing data [52]. Our literature search for studies using QCI Analyze found only a few, mainly focused on cancer research [52–54]. As expected, a comparison of the QCI results showed that risk variants found in the QCI critical group overlapped with variants in groups B and C, the publications associated with COVID-19 severity and mortality. However, we did not identify any overlap with group A, the publication covering cellular host factors for SARS-CoV-2 infection. In the B1 HGI group, no variants overlapped with the QCI infection or critical groups, which may be due to the relatively small number of missense variants found (only 13 risk variants). In the C2 HGI group, we expected overlap with both QCI groups but identified it only with the QCI critical group. Although the use of QCI Analyze requires further optimization, the software may be a promising diagnostic, prognostic, or predictive tool in future COVID-19 research.
Our study has several key limitations. First, although we used different approaches to analyze the WES data of patients who died of COVID-19, we identified only 10 risk variants whose allele distribution differed significantly (after Bonferroni correction) between the WES data and the Slovak NIPT and NFE data. Second, the sample of deceased patients was relatively small. Third, NIPT data are strongly biased towards the healthy female population. Finally, the total number of analyzed risk variants identified by the COVID-19 HGI was also relatively small because we focused only on missense variants. We focused on missense variants to identify as many risk variants as possible, since most of the variants identified by the HGI were non-coding, a class that the GWAS Catalog analysis had shown to be unusable with our WES data.