Datasets
Our experiments employed a ‘Full’ dataset and an ‘Increased-Certainty’ dataset. The full dataset comprised 75,801 variants with ≥1 star, limited to ≥1 CLIA-certified laboratory submitters and without conflicting interpretations, spanning 3,115 different genes. Given that ClinVar does not represent a gold-standard database, a subset dataset of Increased-Certainty variants was also interrogated. Comprising 3,993 variants with ≥2 stars, across 638 different genes, variants in the Increased-Certainty dataset had ≥2 submitters with no conflicts or were EP-reviewed (https://www.ncbi.nlm.nih.gov/clinvar/docs/details/).
Variant classification concordance
Variant classification results are presented according to the five categories suggested by the ACMG guidelines, i.e., B, LB, VUS, LP, and P, as well as with application of VUS subclassification (to VUS-LB or VUS-LP) when sufficient supporting data were available. We assessed concordance using a two-tier classification based on medical importance, i.e., P/LP vs. B/LB/VUS, and also when excluding ClinVar’s VUS variants, as this category does not reflect the true classification of a variant.
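The two-tier comparison described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' implementation: the function names are hypothetical, and it assumes that the VUS-LB and VUS-LP subclasses are counted as VUS (and therefore fall in the B/LB/VUS tier) when tiers are collapsed.

```python
# Illustrative sketch; names are hypothetical and not taken from the aiVCE.
PATHOGENIC_TIER = {"P", "LP"}
# Assumption: VUS subclasses are treated as VUS when collapsing to two tiers.
NON_PATHOGENIC_TIER = {"B", "LB", "VUS", "VUS-LB", "VUS-LP"}


def two_tier(label):
    """Collapse a five-category (or VUS-subclassified) label into two tiers."""
    if label in PATHOGENIC_TIER:
        return "P/LP"
    if label in NON_PATHOGENIC_TIER:
        return "B/LB/VUS"
    raise ValueError(f"unknown classification: {label!r}")


def concordance(pairs, exclude_clinvar_vus=False):
    """Fraction of (aivce_label, clinvar_label) pairs that agree on tier.

    With exclude_clinvar_vus=True, pairs whose ClinVar label is VUS are
    dropped before the fraction is computed.
    """
    kept = [
        (aivce, clinvar)
        for aivce, clinvar in pairs
        if not (exclude_clinvar_vus and clinvar == "VUS")
    ]
    if not kept:
        raise ValueError("no variants left to compare")
    agreeing = sum(two_tier(a) == two_tier(c) for a, c in kept)
    return agreeing / len(kept)
```

Under this scheme, a variant called VUS-LP by the aiVCE and LB in ClinVar counts as concordant, because both labels fall in the B/LB/VUS tier.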
When employing two-tier classification, the aiVCE and ClinVar classifications were highly concordant, i.e., 97.29% (95% CI: 96.79%–97.79%) and 93.78% (95% CI: 93.61%–93.95%) for the Increased-Certainty and Full datasets, respectively (Tables 1 and 2). Excluding variants classified as VUS in ClinVar did not strongly impact results, with 97.36% (95% CI: 96.70%–98.02%) and 93.26% (95% CI: 93.01%–93.51%) concordance in the Increased-Certainty and Full datasets, respectively.
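As a worked check of the intervals for the complete datasets, the sketch below computes a normal-approximation (Wald) 95% confidence interval for a proportion. The interval method is an assumption, since the text does not state which one was used; it is shown here because, with the dataset sizes given above (3,993 and 75,801 variants), it reproduces the reported bounds.

```python
import math


def wald_ci(proportion, n, z=1.96):
    """Normal-approximation (Wald) confidence interval for a proportion.

    Assumption: the source does not name its interval method; the Wald
    interval is used here only because it matches the reported bounds.
    """
    standard_error = math.sqrt(proportion * (1.0 - proportion) / n)
    margin = z * standard_error
    return proportion - margin, proportion + margin


# Increased-Certainty dataset: 97.29% concordance over 3,993 variants.
ic_low, ic_high = wald_ci(0.9729, 3993)

# Full dataset: 93.78% concordance over 75,801 variants.
full_low, full_high = wald_ci(0.9378, 75801)

print(f"Increased-Certainty: {ic_low:.2%} to {ic_high:.2%}")
print(f"Full: {full_low:.2%} to {full_high:.2%}")
```

The denominators for the analysis excluding ClinVar VUS variants are not given in this section, so those intervals are not recomputed here.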
Details of VUS subclassification discordance
The aiVCE’s ability to further classify VUS variants as VUS-LB or VUS-LP provides prioritization that could prove useful for the variant scientist. Specifically, the VUS-LB and VUS-LP variants would be those considered VUS according to the ACMG/AMP guidelines, but that have additional evidence available for consideration. Of the 58,067 Full dataset variants classified as VUS by the aiVCE, 7,282 (12.5%) were subclassified as either VUS-LP or VUS-LB. Discordance was observed for 91/7,282 (1.2%) variants called as VUS-LP by the aiVCE but that were LB in ClinVar and only 4/7,282 (0.05%) variants called as VUS-LB by the aiVCE but that were LP in ClinVar (Table 1). When assessed using the Increased-Certainty dataset, in which case 314/2,961 (10.6%) VUS variants were subclassified by the aiVCE, only 1/314 (0.3%) subclassified variants demonstrated discordance with ClinVar (i.e., VUS-LP by aiVCE but LB in ClinVar) (Table 2).
Variant classification by effect
When categorized by variant effects, missense variants comprised the largest group in both datasets (Full: 47.07%, Increased-Certainty: 47.02%), followed by synonymous variants (Full: 23.18%, Increased-Certainty: 30.14%); null variants including frameshift, stop gain, and splice donor/acceptor (Full: 13.70%, Increased-Certainty: 11.94%); splice region variants (Full: 8.25%, Increased-Certainty: 7.28%); intronic not located in a splice region and untranslated region (UTR) variants (Full: 6.34%, Increased-Certainty: 2.62%); and non-frameshift indel variants (Full: 1.69%, Increased-Certainty: 0.97%) (Tables 3 and 4). As expected, the majority of the P/LP variants were null variants, the majority of VUS were missense variants, and the majority of B/LB variants were synonymous variants.
When concordance between the aiVCE and ClinVar classifications was assessed by variant effect, strong agreement was attained across all groupings, even in variants typically considered difficult to classify, e.g., missense and splice region variants. Specifically, respective levels of concordance in the Full and Increased-Certainty datasets were 85.74% and 99.51% for null variants, 90.27% and 95.30% for missense variants, 99.91% and 99.92% for synonymous variants, 98.90% and 100% for intronic not located in a splice region/UTR variants, and 96.58% and 97.59% for variants located in a splice region.
Variant discordance
In the Full dataset, 90.46% (21,914/24,223) of the ClinVar LB variants were classified as VUS by the aiVCE, and 21,103 (96.30%) of these variants were very rare (BA1/BS1/BS2 rules not applied) synonymous variants or variants located in non-coding regions, but not within the splice region. Given that additional case-specific evidence, based on the ACMG/AMP guidelines, would be required to classify such variants as LB rather than VUS, we further examined specific variants with conflicting classifications to delineate mechanisms of differentiation between ClinVar and the aiVCE. Utilizing the Increased-Certainty dataset to minimize the likelihood of misclassified variants and to limit the overall number of variants being assessed, we evaluated instances of discordance between the aiVCE and ClinVar using three tiers of classification (P/LP, VUS, B/LB). Reassuringly, no ‘Strong’ conflicts, defined as cases where a B/LB ClinVar variant was classified as P/LP by the aiVCE, or where a P/LP ClinVar variant was classified as B/LB by the aiVCE, were observed.
‘Moderate’ conflicts, defined as cases in which a ClinVar P/LP variant was classified as VUS by the aiVCE, or vice versa, were observed for 107/3,993 (2.67%) of Increased-Certainty variants. Specifically, 49 variants were classified as LP by the aiVCE but as VUS in ClinVar, and 58 variants were classified as VUS by the aiVCE but as P/LP in ClinVar (see Additional file 1).
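The ‘Strong’ and ‘Moderate’ conflict definitions used in this subsection can be expressed as a small lookup over the three classification tiers. The sketch below is illustrative only: the function and constant names are hypothetical, VUS-LB and VUS-LP are assumed to count as VUS, and the "other" label for B/LB vs. VUS disagreements is introduced here because the text does not name that case.

```python
# Illustrative sketch; names are hypothetical and not taken from the aiVCE.
# Assumption: VUS subclasses are treated as VUS for three-tier comparison.
THREE_TIER = {
    "P": "P/LP",
    "LP": "P/LP",
    "VUS": "VUS",
    "VUS-LP": "VUS",
    "VUS-LB": "VUS",
    "LB": "B/LB",
    "B": "B/LB",
}


def conflict_level(aivce_label, clinvar_label):
    """Grade the disagreement between an aiVCE call and a ClinVar call."""
    tiers = {THREE_TIER[aivce_label], THREE_TIER[clinvar_label]}
    if len(tiers) == 1:
        return "none"      # same tier in both sources
    if tiers == {"P/LP", "B/LB"}:
        return "strong"    # pathogenic in one source, benign in the other
    if tiers == {"P/LP", "VUS"}:
        return "moderate"  # pathogenic in one source, VUS in the other
    return "other"         # B/LB vs. VUS; not named in the text
```

For example, a variant called LP by the aiVCE and VUS in ClinVar is graded "moderate", as is a variant called VUS-LP by the aiVCE and P in ClinVar.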
For the moderately conflicted variants classified as LP by the aiVCE but that were VUS in ClinVar, 46/49 were missense variants and were categorized as LP on the basis of several lines of supporting evidence, including PM1 (46/46 variants), PM2 (46/46), PM5 (20/46), PP2 (41/46), and/or PP3 (43/46) (Figure 2). When the aiVCE evidence was compared with that provided in ClinVar, few instances of disagreement occurred for the PM2 (extremely low frequency in population databases) and PP3 (in silico predictions) rules. However, while the aiVCE detected these variants in mutational hotspots (PM1) in all cases, ClinVar annotations typically did not include a PM1 assessment. Additionally, a different amino acid change of a known P variant (PM5) was detected by the aiVCE, but commonly not reported by the ClinVar submitters (20/46 variants). As an example, for the variant NM_000257.3:c.4807G>A (p.Ala1603Thr) in the MYH7 gene, the PM1 rule was met, as the aiVCE identified 64 P/LP variants in that region located in the Myosin_tail_1 domain without a B variant. PM5 was met, as the variant c.4807G>C (p.Ala1603Pro) was previously classified as P in ClinVar; however, the evidence (PM5) was not annotated as such by ClinVar submitters.
For the alternate scenario of moderately conflicted variants, i.e., when a variant was classified as VUS by the aiVCE but was P/LP in ClinVar, 48/58 variants were missense variants (n = 41) or variants located in a splice region (n = 7) (Additional file 1). Manual examination of several variants, for which detailed classification information was available in ClinVar, indicated these variants were classified as P/LP based on additional evidence that was not available to the aiVCE, including patient-level data extracted manually from the literature (e.g., clinical information from the affected patient, de novo variant, segregation data) (Additional file 1). For example, the variant NM_004863.3:c.547C>T in the SPTLC2 gene (p.Arg183Trp) was called as P, as it has been reported to segregate with autosomal dominant hereditary sensory and autonomic neuropathy type 1C in two families (https://www.ncbi.nlm.nih.gov/clinvar/variation/487224/). Given that this information is not available to, and associated rules are not applied automatically by, the aiVCE, the classification remained VUS. Of note, the aiVCE subclassified 34/58 variants as VUS-LP, suggesting a greater likelihood the variant is P/LP.
In addition, seven of the 58 moderately conflicted variants classified as VUS by the aiVCE but as P/LP in ClinVar were null variants. Five of these null variants were splice donor/acceptor variants called as PVS1_Moderate based on the recent ClinGen EP recommendations [16], as the reading frame was not disrupted and the altered region was not known to be critical to protein function. For the remaining two null variants, although they were called as PVS1_Very Strong, their frequency was above their specific gene threshold to meet the PM2 rule. Specifically, for NM_012144.3:c.389-1G>C (p.Gly134Arg), the aiVCE suggested a threshold of 0.00111 for applying PM2 in the DNAI1 gene, yet due to a frequency of 0.00128 in the gnomAD (https://gnomad.broadinstitute.org) ‘Other’ population, the PM2 rule was not met. For NM_199292.2:c.457C>T (p.Arg153Ter), the suggested frequency for applying PM2 for the TH gene exceeded 0.0005, yet it appeared at a frequency of 0.0007 in the gnomAD East Asian population. The observed frequency for both variants was very close to their gene-specific suggested PM2 threshold. Although these variants did not meet the PM2 rule, they remained below the suggested thresholds for applying BS1: the aiVCE maintains an uncertain frequency region between the PM2 and BS1 thresholds, within which neither rule is applied.
Of particular note, five discordant variants in the MSH2 and MLH1 genes, which were classified as VUS by the aiVCE and LP in ClinVar, were later (based on ClinVar [13] version 2019/1) reclassified in ClinVar by the International Society for Gastrointestinal Hereditary Tumors (https://www.insight-group.org/) from LP to VUS (https://www.ncbi.nlm.nih.gov/clinvar/23203244/).
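The interplay between the PM2 and BS1 thresholds described above can be illustrated with a short sketch. It is a hedged approximation rather than the aiVCE's actual logic: the function name is hypothetical, the treatment of frequencies exactly at a threshold is an assumption, and the BS1 threshold used in the example is a made-up placeholder, because the text reports only the PM2 threshold for DNAI1.

```python
def frequency_rule(population_af, pm2_threshold, bs1_threshold):
    """Return 'PM2', 'BS1', or None for a population allele frequency.

    Assumptions (not confirmed by the source): frequencies at or below the
    PM2 threshold meet PM2, frequencies at or above the BS1 threshold meet
    BS1, and anything in between falls in the uncertain region where
    neither rule is applied.
    """
    if pm2_threshold >= bs1_threshold:
        raise ValueError("PM2 threshold must be below the BS1 threshold")
    if population_af <= pm2_threshold:
        return "PM2"
    if population_af >= bs1_threshold:
        return "BS1"
    return None  # uncertain frequency region


# DNAI1 example from the text: PM2 threshold 0.00111, observed 0.00128.
# The BS1 threshold below is a hypothetical placeholder value.
HYPOTHETICAL_BS1_THRESHOLD = 0.01
dnai1_rule = frequency_rule(0.00128, 0.00111, HYPOTHETICAL_BS1_THRESHOLD)
print(dnai1_rule)  # no frequency rule applied under these assumptions
```

Keeping a gap between the two thresholds means that a variant only slightly too common for PM2 is left without a frequency rule rather than being pushed directly toward a benign classification.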
Variants and ACMG rules
The distribution of ClinVar variants, according to the aiVCE application of ACMG rules and ClinVar submitter classifications, is shown in Figure 3. A significant difference (P<0.0001) was observed for use of P-supporting rules, as well as for application of B-supporting rules, to P/LP vs. B/LB variants.
When considering gene-specific rules for missense variants (Figure 4a), which are among the most difficult variants to classify, the aiVCE differentially applied P-supporting (PS1, PM1, PM5, PP2, PP3) rules to P/LP variants, and B-supporting (BP1) rules to B/LB variants (P<0.00001 for each rule). Further, application of the PVS1 rule was significantly different between P/LP vs. B/LB LOF variants (P<0.00001) (Figure 4b).
The aiVCE effectively differentiated between P/LP vs. B/LB variants for both missense variants (Figure 4a) and variants located in splice regions (Figure 4c). The PP3 rule was applied for 62.18% of the missense and 76.19% of the splice region P/LP variants but for only 7.0% and 1.6% of the missense and splice region B/LB variants, respectively (P<0.00001). Similarly, the BP4 rule was applied for 60.7% and 66.7% of the B/LB missense and splice region variants, respectively, but for only 9.1% and 4.7% of the corresponding P/LP variants (P<0.00001). As expected, the BP7 rule was applied to all synonymous variants (Figure 4d). While B-supporting frequency rules were applied to more B/LB than P/LP variants, the difference was not significant.
When assessing the aiVCE’s application of ACMG frequency-related rules (BA1, BS1, BS2, PM2), significant differences were observed between ClinVar P/LP vs. B/LB missense, LOF and splice region variants (P<0.00001 for each rule), thus supporting the gene-specific thresholds selected for these rules. Given that more rare (and thus difficult to classify) than common variants are submitted to ClinVar, it is not surprising that the aiVCE applied the PM2 rule to >50% of variants within each type of variant effect and for 99.67% and 83.18% of the ClinVar P/LP vs. B/LB variants, respectively (Figure 4).
aiVCE classification performance across disease categories
To examine the robustness of the aiVCE across different disease groups, the following six gene panels from the Genomics England PanelApp (https://panelapp.genomicsengland.co.uk/) were employed: RASopathies, Hereditary Ataxia, Familial Breast Cancer, Hereditary Neuropathy, Hearing Loss, and Confirmed FA/BS. Utilizing only variants of genes in each panel from the Full dataset, concordance was consistently high across the disease categories evaluated, i.e., 92.67%–98.40% (Table 5).
Also utilizing ClinVar P/LP variants of genes from each panel in the Full dataset, we assessed the performance of frequency-related rules across varying disease mechanisms. The PM2 rule (extremely low frequency in population databases) was met in >99.7% of the P/LP variants, the BS1 rule was not met or met for only 1–2 (<0.1%) of the P/LP variants, and the BA1 rule was never met for P/LP variants (Figure 5), indicating that gene-specific and varied gene frequencies can be incorporated into the AI.
Specific to PM2 thresholds used for the genes in the different disease panels (Figure 6), RASopathy genes with a dominant inheritance model had a low threshold (mean [SD]: 0.0005 [0.0]), while genes with a recessive inheritance model and/or lower penetrance had much higher thresholds, e.g., FA/BS mean (SD): 0.0021 (0.0023) and Familial Breast Cancer mean (SD): 0.0019 (0.003). On average, the PM2 threshold was <0.0015 across disease categories.
When assessing the distribution of ClinVar P/LP variants by aiVCE rule application, differences across disease categories were observed for application of missense/LOF-related rules (i.e., PVS1, PS1, PM1, PM5, PP2, PP3), as expected, owing to the different disease mechanisms represented by each gene panel. For example, while the PVS1 rule was met for 96.50% of the variants in the FA/BS panel and 93.11% of Familial Breast Cancer variants, it was met for only 72.00% of RASopathy, 80.45% of Hearing Loss, 76.87% of Hereditary Neuropathy, and 81.86% of Hereditary Ataxia variants. Conversely, the PS1, PM5, and PP2 rules were met for significantly higher proportions of RASopathy variants when compared with other gene groups (all P<0.00001), and the PM1 rule was met for 47.69% of Hereditary Neuropathy but only 11.39% of FA/BS variants (P<0.00001) (Figure 5).