The study was designed to determine the carrier frequency of single-gene disorders other than β-thalassemia, for which a carrier frequency of about 3-4% has been shown in many studies in India [52]. Disorders such as spinal muscular atrophy (SMA), fragile X syndrome (FXS) and Duchenne muscular dystrophy (DMD), which are common in all populations including South Asians, were also excluded as these are difficult to detect with NGS [53]. Recently, we showed the carrier frequency of SMA in North India to be 2.25% [23]. However, the carrier frequency of other single-gene recessive disorders is not known, and significant differences in prevalence and pathogenic variants have been seen between populations [54].
CFTR (Cystic fibrosis transmembrane conductance regulator) pathogenic and likely pathogenic variants
Nine disease-causing variants were identified in the CFTR gene in this cohort. Of these, only one was the common p.Phe508del pathogenic variant, i.e. 11% (n=1/9). Two of the pathogenic variants detected in the CFTR gene in this study have been observed before in our laboratory (p.Arg75Ter and p.Ser549Asn). The remaining six pathogenic variants have not been reported in Indians earlier (Table 2). The variants p.Ser549Asn, p.His199Tyr and p.Arg1070Gln have been described by multiple authors, and functional studies have been carried out that classify them as pathogenic as per ACMG criteria. The other four variants, p.Ile1366Phe, p.Cys491Phe, p.Phe1337Val and p.His620Leu, have been documented to be associated with disease; however, they lack adequate functional studies and cannot be classified as pathogenic, although they meet the ACMG criteria for likely pathogenicity (Table 2).
The CFTR c.3854C>T, p.Ala1285Val variant was identified in three individuals. Although it has been reported in the literature in association with congenital bilateral absence of the vas deferens (CBAVD) [55], it is more likely to represent a common polymorphism, given its high frequency in NGS data from the Indian population (0.5% minor allele frequency in South Asians in gnomAD exomes). This variant was classified as a VUS and was not included in the list of disease-associated variants.
Studies on the genetic profile of cystic fibrosis patients in India show high variability: many rare and new variants have been observed, while only a few pathogenic variants (p.Arg1162Ter, p.Met1Thr, c.1161delC, p.Ser549Asp and c.1525-1G>A) have been reported more than once [56, 57, 58]. This suggests a lack of founder or common mutations in the CFTR gene and thus emphasises the need for sequencing of all coding regions of the CFTR gene in suspected cases in the Indian population. In the present study, no pathogenic variant other than p.Phe508del was present in the ACMG panel for cystic fibrosis [59]. In view of the heterogeneity in pathogenic variants, Mandal et al. also suggested that a single panel of pathogenic variants cannot be used for diagnosis or carrier testing of CF in India [28]. Archibald et al. also observed that the pathogenic variants in cystic fibrosis vary according to ethnic origin [53]. Lim et al. reported, on the basis of the ExAC database, that the pathogenic variants in CFTR in non-Europeans are different from those in people of European descent. They noted that none of the current genetic screening panels or existing CFTR pathogenic variant databases cover a majority of deleterious variants in any geographical region outside of Europe [60].
Among the nine disease-causing variants identified in the CFTR gene in the present cohort, only one was the common p.Phe508del pathogenic variant, i.e. 11% (n=1/9). Kapoor and Kabra et al. studied cord blood samples of 955 newborns and reported a p.Phe508del carrier frequency of one in 238 (0.42%) [21]. They estimated the frequency of homozygous p.Phe508del as 1/228,006. However, this cannot be considered to represent the true prevalence of cystic fibrosis in India as it took into account only one pathogenic variant. Comparison of the p.Phe508del allele frequency with that reported from the West shows that Indians have a lower frequency (19-44%) of the p.Phe508del pathogenic variant [61, 62, 63]. Cystic fibrosis was thought to be extremely rare in India. However, a growing number of publications in the last two decades have suggested a higher prevalence [28, 63, 64]. This indicates that CF may be much more common in the Indian population, with the majority of cases being missed or undiagnosed. CFTR-related pathogenic variants may be rarely recognized in Indians in view of the different phenotypes (including cystic fibrosis and congenital absence of the vas deferens), variable clinical severity, lack of availability of sweat testing, and absence of newborn screening.
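The homozygote estimate quoted above follows from the carrier frequency under the usual assumption of random mating: both parents must be carriers, and each such couple has a one-in-four chance of a homozygous child. The short Python sketch below reproduces that arithmetic. The function name and the count of 4 carriers among 955 newborns are assumptions made for illustration; the count is inferred from the reported 0.42% and is not stated in the text.

```python
def homozygote_frequency(carrier_frequency: float) -> float:
    """Expected frequency of homozygous births under random mating.

    Both parents must be carriers (carrier_frequency ** 2), and each
    carrier couple has a 1-in-4 chance of a homozygous child.
    """
    if not 0.0 <= carrier_frequency <= 1.0:
        raise ValueError("carrier_frequency must lie between 0 and 1")
    return carrier_frequency ** 2 / 4.0


# Assumption: 4 carriers among 955 newborns (inferred from the reported 0.42%).
carrier_freq = 4 / 955
one_in = 1 / homozygote_frequency(carrier_freq)

print(f"carrier frequency: {carrier_freq:.2%}")
print(f"expected p.Phe508del homozygotes: 1 in {one_in:,.0f}")
```

With these inputs the estimate comes to about 1 in 228,006, in line with the figure quoted above. Because only one pathogenic variant enters the calculation, the result is a lower bound on disease frequency rather than an estimate of overall CF prevalence.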
GJB2 c.231G>A, p.Trp77Ter and c.71G>A, p.Trp24Ter
Biallelic variants in the GJB2 gene, or deletion of the gene, cause congenital nonprogressive mild to profound sensorineural hearing impairment. The pathogenic variants identified in GJB2 have been previously reported in Indian subjects. Ram Shankar et al. studied the pathogenic variants in the GJB2 gene in Indian patients with deafness and found p.Trp24Ter to be the most common pathogenic variant in India [22]. In addition, they documented two other common pathogenic variants, p.Trp77Ter and IVS1+1G>A. These differ from the common pathogenic variants identified in Western (c.35delG) [65], Japanese (c.235delC) and Korean (p.Val37Ile) populations [66, 67].
SLC26A4 related hearing loss
Hearing loss due to SLC26A4 has been reported as the third most common cause of hearing loss in a study of a pan-ethnic population [68]. It occurs due to an enlarged vestibular aqueduct and temporal bone abnormalities, which can be appreciated on imaging. In addition to hearing loss, these individuals may have euthyroid goitre (Pendred syndrome). In this study, two of the four disease-causing variants reported have been previously described in individuals of Indian ethnic origin: p.Arg409Pro [69, 70] and p.Ile490Leu [71]. The other variants found in our study are p.Gly334Val, which has been described chiefly in people of Mediterranean origin [72], and p.Phe335Leu, which is a common variant reported worldwide [73].
Carrier screening and prenatal diagnosis for a disorder like hearing loss, which impairs quality of life, can be perceived differently by families in different countries. Nahar et al. have previously described parental perceptions of congenital hearing loss in the Indian setting, where resources are scarce [74]. While some families are interested in using the information to help in management, planning and emotional adjustment to the birth of a child with deafness, others opt for terminating a pregnancy with an affected fetus, especially if financial resources are scarce.
GBA c.1448T>C, p.Leu483Pro and c.866G>C, p.Gly289Ala
Biallelic variants in the GBA gene cause a deficiency of acid β-glucosidase, resulting in Gaucher disease, the most common lysosomal storage disorder in the world [75]. The variants p.Gly289Ala and p.Leu483Pro were observed in one individual in the present cohort. Ankleshwari et al. studied 33 Indian patients with Gaucher disease and identified p.Leu483Pro as the most common pathogenic variant (60.6%, n=20/33). In addition, they reported p.Gly289Ala as a novel pathogenic variant in a patient with type I disease [76]. Homozygosity for the p.Leu483Pro variant is associated with neuronopathic involvement (type III) ranging from mild oculomotor apraxia to more severe involvement, as well as lethal cases with the collodion baby phenotype [77, 78]. The variant most commonly observed in Western populations (p.Asn370Ser), which is associated with type I Gaucher disease, is observed less commonly in India [77, 79].
GAA c.1933G>A, p.Asp645Asn variant
Biallelic pathogenic variants in the GAA gene cause deficiency of acid α-glucosidase, resulting in Pompe disease. We observed three individuals to be carriers of the p.Asp645Asn variant in the GAA gene. This variant was reported for the first time in 1998 by Huie et al., who demonstrated low enzyme activity with this pathogenic variant in vitro and in vivo [80]. Subsequently, this pathogenic variant has been reported in patients affected with infantile-onset Pompe disease in several studies [81]. The variant lies in exon 14, which has been reported to be a hot spot for this gene [81]. However, a study of patients of Indian ethnicity reported no hot spots for this gene [82].
OCA2 c.1580T>G, p.Leu527Arg variant
Oculocutaneous albinism type II (tyrosinase positive) is caused by biallelic pathogenic variants in the OCA2 gene. Affected individuals acquire small amounts of pigment with age and tend to have less severe visual abnormalities. The p.Leu527Arg variant was observed in the heterozygous state in two individuals in our cohort. It was reported for the first time by Jowerek et al. in a Pakistani family with some pigmentation of hair [83]. They reported that this pathogenic variant affects a highly conserved amino acid residue in the transmembrane 8 domain of the protein and that it segregated with affected family members.
AGXT c.302T>C, p.Leu101Pro variant
Primary hyperoxaluria occurs due to deficiency of the liver peroxisomal enzyme alanine:glyoxylate aminotransferase, encoded by the AGXT gene. We observed one carrier (belonging to the Punjabi community) of the p.Leu101Pro variant in our cohort. This variant was reported for the first time by Williams et al. [84], who demonstrated that the mutant protein had less than 1% of normal activity in vitro. Subsequently, Chanchlani et al. documented three patients with primary hyperoxaluria type 1 who had the p.Leu101Pro variant in the homozygous state [85]. All three patients belonged to north India or Pakistan. They suggested the possibility of this being a founder pathogenic variant in India, although larger studies and haplotype analysis are required.
ASPA c.902T>C, p.Leu301Pro
The ASPA gene encodes the enzyme aspartoacylase, deficiency of which results in Canavan disease. One individual was found to be a carrier of the p.Leu301Pro variant. This variant has been reported by our group previously in a patient of Indian ethnicity with classical Canavan disease and raised urine N-acetyl aspartate [86]. On the basis of the reported literature, this variant has been classified as likely pathogenic using ACMG criteria.
ACADM c.811G>A, p.Gly271Arg
Biallelic pathogenic variants in ACADM affect mitochondrial fatty acid β-oxidation due to deficiency of the enzyme medium-chain acyl-coenzyme A dehydrogenase. The p.Gly271Arg variant is a well-reported pathogenic variant in the ACADM gene worldwide. It was observed in one individual in this study. The c.985A>G pathogenic variant commonly seen in the West, believed to be a founder pathogenic variant in Caucasians originating from an ancient Germanic tribe, was not observed in the present cohort [87].
Disorders such as autosomal recessive polycystic kidney disease, methylmalonic acidemia, galactosemia, Smith-Lemli-Opitz syndrome, oculocutaneous albinism type II, cystic megalencephalic leukoencephalopathy, phenylketonuria and junctional epidermolysis bullosa can be expected to be common in the Indian population, as at least two carriers of each were detected among the 200 individuals screened.
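To give a sense of how much uncertainty attaches to a carrier frequency estimated from two carriers among 200 individuals, the Python sketch below computes a 95% Wilson score interval for a binomial proportion. The choice of the Wilson method, the function name, and the example count of exactly two carriers are assumptions made for illustration and are not part of the study's analysis.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    successes: number of carriers observed
    n:         number of individuals screened
    z:         normal quantile (1.96 corresponds to roughly 95% coverage)
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p_hat = successes / n
    denom = 1.0 + z ** 2 / n
    centre = (p_hat + z ** 2 / (2 * n)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2)) / denom
    )
    return centre - half_width, centre + half_width


# Illustrative input: exactly two carriers among the 200 individuals screened.
low, high = wilson_interval(2, 200)
print(f"point estimate: {2 / 200:.2%}")
print(f"95% Wilson interval: {low:.2%} to {high:.2%}")
```

The interval is wide relative to the 1% point estimate, so a count of two carriers is better read as a signal that a disorder merits attention than as a precise carrier frequency.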
Other investigators and our group have identified a number of disorders with founder mutations in the Agarwal community [88, 89]. Carriers of only two of these were identified with the current panel of genes: calpainopathy and megalencephalic leukodystrophy with cysts. The mutations detected are not the common ones noted in the Agarwal community. However, there were only 28 individuals in the cohort belonging to the Agarwal community, and larger studies are needed to determine their carrier frequency.