This retrospective study involves a categorical investigation of genetic alterations in 22 MDS cases from the second most populated region of South Asia (Pakistan) through deep massively parallel DNA sequencing using the targeted TruSight Myeloid sequencing panel. This panel detects somatic variants in genes commonly mutated in myeloid malignancies. The targeted coding and non-coding regions were covered almost equally: the median depth of coverage was 4999x for non-coding variants and 4920x for coding variants. Low-quality variants (QUAL < 50, DP < 30, and GQ < 20) were filtered out to minimize potential sequencing artifacts. As a result, 265 variants in 44 genes were obtained, with an average of 77.09 variants (SD ± 7.39) and a median of 75.5 variants per sample.
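The quality filter described above can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: it assumes that a call is discarded when it fails any one of the three thresholds, and the `VariantCall` record, the function names, and the example calls are hypothetical.

```python
from dataclasses import dataclass

# Thresholds quoted in the text; calls below any of them are treated as low quality.
MIN_QUAL = 50   # variant quality score (QUAL)
MIN_DP = 30     # read depth at the site (DP)
MIN_GQ = 20     # genotype quality (GQ)


@dataclass
class VariantCall:
    """Hypothetical minimal record for one variant call in one sample."""
    variant_id: str
    qual: float
    dp: int
    gq: int


def passes_quality(call: VariantCall) -> bool:
    """Assumption: a call is kept only if it meets all three thresholds."""
    return call.qual >= MIN_QUAL and call.dp >= MIN_DP and call.gq >= MIN_GQ


def filter_calls(calls):
    """Return the calls that survive the quality filter, in input order."""
    return [c for c in calls if passes_quality(c)]


# Illustrative values only, not data from the study.
example_calls = [
    VariantCall("var_a", qual=812.0, dp=4950, gq=99),  # passes
    VariantCall("var_b", qual=42.0, dp=5100, gq=99),   # fails QUAL
    VariantCall("var_c", qual=300.0, dp=25, gq=60),    # fails DP
    VariantCall("var_d", qual=120.0, dp=800, gq=12),   # fails GQ
    VariantCall("var_e", qual=50.0, dp=30, gq=20),     # passes at the boundary
]
kept = filter_calls(example_calls)
kept_ids = [c.variant_id for c in kept]
```

If the thresholds were instead applied jointly (a call removed only when it fails all three), `passes_quality` would use `or` in place of `and`; the text does not state which reading was used.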
The genomic locations and functional impact of the identified mutations were obtained by annotation with ANNOVAR (detailed in Table 2). Mutated non-synonymous (nonsyn) sites outnumbered mutated synonymous (syn) sites, and the nonsyn/syn ratio was 1.15, which is higher than the previously reported ratio of germline missense to silent variants in South Asian populations [24]. For normalization and comparison, the nonsyn/syn ratio was also determined in healthy PJL (Punjabi in Lahore, Pakistan) individuals of the 1000 Genomes Project, using the genetic variants within the same genomic regions as sequenced in this study. The ratio in healthy individuals was 0.88, which is 0.765 times the ratio in the MDS cases of the present study. Further analysis showed that a higher proportion of novel/rare nonsynonymous SNVs in the MDS cases than in the healthy individuals of the 1000 Genomes Project was responsible for the higher nonsyn/syn ratio in the study genes. There were 37 nonsynonymous SNVs that were either absent from or had an alternate allele frequency < 0.1% in the 1000 Genomes and gnomAD_exome projects, whereas this number was 19 for synonymous SNVs. The higher nonsyn/syn ratio in MDS patients is consistent with previous reports [6, 25].
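The ratio comparison in this paragraph is simple arithmetic, reproduced below as a check. The two cohort ratios and the counts of novel/rare SNVs are taken from the text; the function and variable names are illustrative, and the ratio among novel/rare SNVs is a derived figure that the text itself does not report.

```python
def nonsyn_syn_ratio(n_nonsyn: int, n_syn: int) -> float:
    """Ratio of non-synonymous to synonymous variant counts."""
    if n_syn == 0:
        raise ValueError("synonymous count must be non-zero")
    return n_nonsyn / n_syn


# Ratios reported in the text.
mds_ratio = 1.15   # MDS cases of the present study
pjl_ratio = 0.88   # healthy PJL individuals, same genomic regions

# Healthy ratio expressed as a fraction of the MDS ratio.
relative_ratio = pjl_ratio / mds_ratio

# Novel/rare SNVs (< 0.1% alternate allele frequency or absent): 37 nonsyn vs 19 syn.
rare_ratio = nonsyn_syn_ratio(37, 19)

print(f"PJL / MDS ratio: {relative_ratio:.3f}")
print(f"nonsyn/syn among novel/rare SNVs: {rare_ratio:.2f}")
```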
To explore the potentially deleterious impact of the identified variants, emphasis was given to rare variants, given that MDS is a rare disorder. Variants that were either absent from or had an alternate allele frequency < 1% in all populations of public databases, including the 1000 Genomes Project and gnomAD_exome, were retained. This resulted in 120 rare mutational events (average 13.318 (SD ± 4.07) mutations per patient) in 38 genes, comprising 2 stopgain, 42 nonsynonymous, 21 synonymous, 1 canonical splicing, 1 downstream, 3 3’ untranslated region (UTR), and 34 intronic SNVs, as well as 7 frameshift insertions, 1 non-frameshift insertion, 1 non-frameshift deletion, and 7 intronic deletions (Supplementary Table S1). After excluding the intronic, intergenic, synonymous, upstream/downstream, and UTR mutations, 54 rare non-silent mutations remained in 29 genes. Three patients had one non-silent mutation and nineteen patients had more than one, with an average of 3.82 (SD ± 2.08) non-silent mutations per patient (Fig. 1).
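As a consistency check, the category counts listed above can be tallied to recover the reported totals of 120 rare variants and 54 non-silent variants. The sketch below also shows one possible form of the rarity filter. The counts come from the text, while the `is_rare` helper, its argument layout, and the population keys in the usage note are hypothetical.

```python
RARE_AF_THRESHOLD = 0.01  # < 1% alternate allele frequency, as stated in the text


def is_rare(population_freqs: dict) -> bool:
    """Keep a variant if it is absent (None) or below the threshold in every population."""
    return all(f is None or f < RARE_AF_THRESHOLD for f in population_freqs.values())


# Counts of rare variants per functional category, as listed in the text.
rare_counts = {
    "stopgain": 2,
    "nonsynonymous": 42,
    "synonymous": 21,
    "canonical splicing": 1,
    "downstream": 1,
    "3' UTR": 3,
    "intronic SNV": 34,
    "frameshift insertion": 7,
    "non-frameshift insertion": 1,
    "non-frameshift deletion": 1,
    "intronic deletion": 7,
}

# Categories retained after excluding intronic, synonymous,
# upstream/downstream and UTR variants.
NON_SILENT = {
    "stopgain",
    "nonsynonymous",
    "canonical splicing",
    "frameshift insertion",
    "non-frameshift insertion",
    "non-frameshift deletion",
}

total_rare = sum(rare_counts.values())
total_non_silent = sum(n for cat, n in rare_counts.items() if cat in NON_SILENT)

print(total_rare, total_non_silent)
```

For example, `is_rare({"1000G_SAS": None, "gnomAD_exome_ALL": 0.004})` returns `True`, whereas a variant at 2% frequency in any single population would be rejected.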
Given that NGS was performed on DNA isolated from peripheral blood containing both normal leukocytes and blast cells, we applied a somewhat stringent uniform cut-off of 0.35 on the variant allelic fraction (VAF) for all patients to discriminate probable somatic mutations from germline variants. This cut-off separated the 54 rare non-silent variants into 37 somatic non-silent mutations in 22 genes (Supplementary Table S2) and 17 germline non-silent mutations in 15 genes (Supplementary Table S3), representing multiple underlying mechanisms involved in the pathophysiology of MDS in this cohort. Among the somatic mutations, 8 were recurrent, being found in more than one patient. Six MDS cases contained one somatic mutation and 16 cases contained more than one. Strikingly, the non-synonymous somatic mutation rs752628932 in a highly conserved region (exon 8) of RAD21 (c.T815G; p.M272R) was present in 13 out of 22 cases (59% of cases in this small cohort). The VAF of this substitution ranged from 0.172 to 0.262, suggesting a slightly variable time of origin among the patients. The other recurrent somatic mutations included the nonsynonymous SNV c.A1564T:p.I522F in a highly conserved region of STAG2, observed in four cases; the nonsynonymous SNV c.G1580T:p.C527F in the same conserved region of STAG2, observed in three cases; and the nonsynonymous SNV c.G226A:p.A76T in CDKN2A, observed in four cases. The two STAG2 mutations (p.I522F and p.C527F) were observed in different patients. Among the germline mutations, two were recurrent, each found in three cases: a protein-truncating SNV c.C1894T:p.R632X in a highly conserved region of ASXL3, and a nonsynonymous SNV c.T1604C:p.M535T in a highly conserved region of KIT.
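The VAF-based separation can be illustrated with a short sketch. It assumes that variants with VAF below the 0.35 cut-off are labelled probable somatic and those at or above it probable germline; the handling of the exact boundary value is an assumption, as are the function name and labels. The two RAD21 VAF values are the range endpoints quoted in the text.

```python
VAF_CUTOFF = 0.35  # uniform cut-off applied to all patients, as stated in the text


def classify_by_vaf(vaf: float) -> str:
    """Label a rare non-silent variant from its variant allelic fraction.

    Assumption: VAF < cut-off -> probable somatic; otherwise probable germline.
    """
    if not 0.0 <= vaf <= 1.0:
        raise ValueError("VAF must lie between 0 and 1")
    return "probable somatic" if vaf < VAF_CUTOFF else "probable germline"


# Endpoints of the VAF range reported for RAD21 p.M272R.
rad21_labels = [classify_by_vaf(v) for v in (0.172, 0.262)]

# Share of the cohort carrying the RAD21 mutation (13 of 22 cases).
rad21_percent = round(100 * 13 / 22)

print(rad21_labels, rad21_percent)
```

A fixed cut-off is a pragmatic choice when matched normal tissue is unavailable; it cannot distinguish a high-burden somatic clone from a heterozygous germline variant, so the labels are best read as probable rather than definitive.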
Thirteen cases had both germline and somatic non-silent mutations; however, no statistically significant correlation was observed between the number of predisposing germline mutations and the number of somatic mutations within the cases (P > 0.05). For example, cases MDS2 and MDS16 contained four and seven non-silent somatic mutations respectively, but neither contained a predisposing germline non-silent mutation. Likewise, MDS3 and MDS4 had 3 and 4 predisposing germline mutations respectively, whereas both cases had 3 somatic mutations each.
Filtering the variants against the ClinVar database highlighted four pathogenic variants associated with hematological neoplasms. These included a recurrent missense SNV rs121913250 (p.G12S) in a highly conserved region of NRAS, associated with acute myeloid leukemia and juvenile myelomonocytic leukemia, found in three cases; a frameshift insertion p.L160fs in NPM1, associated with myelodysplastic syndrome progressed to acute myeloid leukemia, found in one case; a missense SNV p.R730H in a highly conserved region of DNMT3A, associated with acute myeloid leukemia, myelodysplastic syndrome, lung adenocarcinoma, and inborn genetic diseases, found in one case; and a splicing SNV in CBL (exon8:c.1096-2A>T), found in one case. Furthermore, filtering against the PharmGKB database showed the presence of a missense SNV rs1042522 [G>C] in TP53, where the GG genotype was found in two cases and the GC genotype in eleven cases. The GG and GC genotypes are associated with decreased response to cisplatin and paclitaxel chemotherapy.