Array comparative genomic hybridization (aCGH) is a molecular cytogenetic technique capable of locating copy number variants (CNVs) related to submicroscopic chromosomal gains or losses. Its use has grown steadily, and it has become a valuable diagnostic tool, displacing conventional methods as the first-line study for patients with congenital anomalies and neurodevelopmental disorders, and even in the study of male infertility, because it provides faster detection of microdeletion and microduplication syndromes (Table 4) (6, 14, 15).
Table 4. Uses of comparative genomic hybridization (2–4)
| Chromosomal microarray | Type of alterations | Resolution | Examples of clinical indications |
| --- | --- | --- | --- |
| Comparative genomic hybridization (aCGH) | Aneuploidies, chromosomal rearrangements, copy number variants at the level of genes or exons associated with unbalanced structural changes | Based on design, usually single exon resolution for genes of interest | As part of a phenotype-specific panel; as a complement to exome sequencing |
aCGH, a molecular cytogenetic technique capable of locating CNVs related to submicroscopic gains or losses at the chromosomal level.
Fig. 3. Gene3D: 3.30.40.10 Zinc/RING finger domain, C3HC4 (zinc finger) of the NSD1 gene (Nuclear receptor SET domain-containing protein 1), UniProt A0A3G1LEI2, with 541 amino acids; family and domain databases.
Fig. 4. STITCH: stronger associations are represented by thicker lines. Protein-protein interactions are shown in grey, chemical-protein interactions in green, and interactions between chemicals in red. Network nodes represent proteins: splice isoforms or post-translational modifications are collapsed, i.e., each node represents all the proteins produced by a single, protein-coding gene locus. Small nodes indicate proteins of unknown 3D structure, while large nodes indicate that some 3D structure is known or predicted.
Fig. 5. GeneMANIA reported Networks: Co-expression: 100%, showing 20 related genes, with 26 total genes, 0 attributes, and 33 total links.
Deletions of varying sizes that span a large genomic region often pose a challenge in understanding the precise role of specific loci among multiple genes in the onset of observed phenotypes (16). The deletion of one gene may be compensated for or exacerbated by the deletion of another gene, resulting in unpredictable phenotypic effects. A microdeletion could confer a severe phenotype due to the involvement of causal mechanisms such as haploinsufficiency of more than one gene (16, 17).
The expression of nearby or related genes can be altered directly or through regulatory mechanisms, which can have a cascading effect on the affected biological pathways. Changes in the expression of genes flanking the microdeletion are an informative indicator of the positional effect of the studied deletions (17, 18). The homozygosity of these regulatory genes can contribute to or modify the phenotypic characteristics (16).
Due to the complexity of genetic and molecular interactions, the deletion of multiple genes can manifest very variably in different individuals, even if they share the same deletion (19, 20). The deletion of multiple genes can expose unexpected genetic interactions because of the complex relationship between genes and regulatory elements within deletions. Individual genetic backgrounds may modulate the final clinical outcome of deletions in a specific region (16).
This pleiotropy could be due to the activity of proteins encoded by the remaining alleles or by compensatory mechanisms. Additionally, it is possible that the encoded genes do not equally contribute to the phenotypic characteristics or that the genes contained in the deletion lower the threshold for the expression of genetic variation in other parts of the genome (21).
According to the guidelines of the American College of Medical Genetics and Genomics (ACMG) and the Clinical Genome Resource (ClinGen), alterations detected by chromosomal microarray analysis (CMA) are classified into five categories: pathogenic, if associated with disease; likely pathogenic, if the evidence suggests an association with disease but additional evidence would better clarify pathogenicity; of uncertain significance, if there is insufficient information to consider the variant benign or pathogenic, in which case parental studies are useful to help elucidate pathogenicity; likely benign, if current information does not suggest an association with disease but more evidence would be needed to confirm this; and benign, if not related to any disease (4, 14, 22).
Regarding neurological disorders of unknown etiology, aCGH has a reported diagnostic yield of between 15% and 20%, which varies depending on the symptoms, the cohort, and the type of test used (6, 23, 24). Particularly for epilepsy with global developmental delay, intellectual disability, and autism spectrum disorder (ASD), aCGH is increasingly useful for establishing a diagnosis and detecting new susceptibility regions (4, 25). These results are strongly associated with comorbidities: patients with syndromic conditions are more likely to present a CNV than those without them, and cardiovascular and craniofacial defects contribute most to the likelihood of finding a CNV in patients with epilepsy or ASD (6).
In these neurological disorders, the interaction of genetic and environmental factors poses a significant challenge for etiological diagnosis. The transcriptional component is known to be a key player in neurodevelopment, and a genetic etiology accounts for 25 to 50% of cases. This includes single nucleotide variations, structural variants, and CNVs; the latter comprise deletions, insertions, and duplications due to errors in DNA replication or repair, which are the main causes of these disorders. They can generate different neurodevelopmental phenotypes depending on their size, the number of affected genes, or the compromised breakpoints, and an estimated quarter of neurological clinical manifestations are explained by CNVs of more than 400 kb (5, 6, 26).
Variants and deletions in the 5q35 region have been found to trigger different polymalformative syndromes, which vary depending on how much of the terminal zone of the long arm of chromosome 5 is involved. These syndromes manifest with alterations in cognitive functioning, adaptive behavior, and behavior, as well as excessive growth, advanced bone maturation, and neurological involvement with hyperreflexia and hypotonia (27, 28). Additional involvement of the nervous system (agenesis of the corpus callosum), the cardiovascular system (persistent ductus arteriosus and atrial septal communication), and the urinary system (hydronephrosis and vesicoureteral reflux) was reported only in those with deletions (29).
Loeza et al. (30) reported on a 4-year-old patient with Sotos syndrome with a 1.851 Mb deletion in 5q35.2-q35.3 ((175,571,962–177,422,761)x1) containing 43 genes. González-Rodríguez et al. (31) reported a heterozygous 1.806 Mb deletion in 5q35.2-q35.3 (175,580,042–177,386,153) in a patient with Sotos syndrome and nephrocalcinosis. Lin et al. (32) reported the case of a 5-year-old girl with macrocephaly, a high and broad forehead, and developmental delay, with a 1.86 Mb deletion in 5q35.2-q35.3 ((175,559,343_177,422,760)x1) associated with Sotos syndrome.
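As a quick consistency check on the three reported deletions, the sizes can be recomputed from the quoted coordinates, along with the region the three intervals share. This is a minimal sketch: the function names are illustrative, and because the text does not state the genome build, it assumes all three reports use the same one.

```python
# Reported 5q35.2-q35.3 deletions as (start, end), coordinates as quoted in the text.
# Assumption: all three reports use the same genome build (not stated in the text).
deletions = {
    "Loeza et al. (30)": (175_571_962, 177_422_761),
    "Gonzalez-Rodriguez et al. (31)": (175_580_042, 177_386_153),
    "Lin et al. (32)": (175_559_343, 177_422_760),
}


def size_mb(start, end):
    """Distance between the two reported coordinates, in megabases."""
    return (end - start) / 1_000_000


def shared_interval(intervals):
    """Region common to all intervals, or None if they do not all overlap."""
    start = max(s for s, _ in intervals)
    end = min(e for _, e in intervals)
    return (start, end) if start <= end else None


sizes = {name: round(size_mb(*coords), 3) for name, coords in deletions.items()}
common = shared_interval(list(deletions.values()))

for name, mb in sizes.items():
    print(f"{name}: {mb} Mb")
print("Shared region:", common)
```

Under that assumption, the recomputed sizes agree with the reported 1.851, 1.806, and 1.86 Mb, and the interval reported by González-Rodríguez et al. lies entirely within the other two.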
Deletions in chromosome 5q35.2-q35.3 lead to a complete loss of the NSD1 gene (Nuclear Receptor-Binding SET Domain Protein 1), and haploinsufficiency of this gene is the main cause of Sotos syndrome (OMIM #117550), a disease with a prevalence of 1 in 15,000. It is characterized by prenatal and postnatal overgrowth with advanced bone age, a characteristic facial gestalt (round face, hypertelorism, prominent forehead with frontotemporal alopecia, high-arched palate, downward slanting palpebral fissures, large ears, and pointed chin), macrodolichocephaly, and learning difficulties (30, 33, 34). Other characteristics include neurodevelopmental delay, psychomotor difficulties, seizures, behavioral alterations, neonatal jaundice, cardiorenal malformations, and scoliosis (30).
While this syndrome has an autosomal dominant inheritance pattern, approximately 95% of patients have de novo variants (30). These patients also have an increased risk of neoplasms such as Wilms tumor, acute lymphoblastic leukemia, neuroblastoma, and sacrococcygeal teratoma. This risk may be related to the alteration of NSD1, a member of the NSD family of proteins, which participate in chromatin integrity and whose variants are associated with a variety of cancers. It may also be due to the involvement of the FGFR4 gene (fibroblast growth factor receptor 4, OMIM #134935), which has been linked to cancer progression and the presence of metastasis (35, 36).
In the present case, the deleted region contains 12 genes with associated pathologies. Five of them (DDX41, F12, NSD1, SLC34A1, SNCB) have an autosomal dominant inheritance mechanism, and the FGFR4 gene has no defined mechanism but is related to cancer progression and metastasis. It is therefore necessary to examine the possible gene interactions contributing to the patient's phenotype.
Reverse phenotyping is an approach that allows an in-depth evaluation of unusual phenotypes in which genetic heterogeneity makes it difficult to draw conclusions about genotype without a substantial number of participants, and it avoids the problems generated by inconsistent data collection. It allows more specific and consistent phenotypic evaluations in a set of individuals with a variant. Such genotype-disease correlation expands the clinical spectrum of a known association and enables ex vivo analysis of a trait or disease, serving as a model for predictive genomic medicine (37).
Consequently, genomic determination can provide a more comprehensive picture of the pleiotropy of a genetic variant compared to phenotypic determination, without necessarily excluding one from the other. When a variant is predicted to be pathological through in silico models or when a genotype-disease association is postulated based on phenotypic determination research, reverse phenotyping can increase information about the pathogenicity of the variant in phenotypically unselected populations (37).
Interactions between proteins and small molecules are an integral part of biological processes in living organisms. Information on these interactions is dispersed over many databases, texts and prediction methods, which makes it difficult to get a comprehensive overview of the available evidence.
The protein encoded by the NSD1 gene (Nuclear receptor SET domain-containing protein 1), UniProt entry A0A3G1LEI2 (38), contains 541 amino acids (Fig. 3). Its family and domain database annotations are: Gene3D: 3.30.40.10, Zinc/RING finger domain, C3HC4 (zinc finger), 1 hit; InterPro: IPR041306 (C5HCH) and IPR013083 (Znf_RING/FYVE/PHD); PANTHER: PTHR22884:SF312 (HISTONE-LYSINE N-METHYLTRANSFERASE, H3 LYSINE-36 SPECIFIC), 1 hit, and PTHR22884 (SET DOMAIN PROTEINS), 1 hit.
The concept of genetic interaction is simple, but the physiological repercussions can be profound. Genes don't usually act individually. They are part of a complex system, the genome, where genetic interactions occur, leading to the effects of one gene or genetic variant being modified by the action of another genetic element or influenced by a third (39).
Identifying interactions between genes is an essential step in understanding how cells and tissues function and how many human diseases arise. This knowledge could also help determine why variants that should trigger a hereditary disease do not always result in pathology, or why the same variant can manifest differently in two individuals. Furthermore, since genome analysis is increasingly used in medicine, both in diagnosis and in treatment decision-making, identifying gene interactions is especially important for providing the best patient care (39).
The basic principles emerged, allowing researchers to predict a gene’s function and its relative importance for the cell’s health based on its position in the network. Studies also revealed the identity of so-called “modifier genes” which can suppress the effect of damaging mutations and how genetic background influences trait inheritance.
STITCH ('Search Tool for Interacting Chemicals') integrates these disparate data sources for 430,000 chemicals into a single, easy-to-use resource. According to STITCH, this gene-protein has various molecular interactions. The STITCH v5 network view lets the user see the binding affinities of chemicals in the interaction network, which gives a quick overview of the potential effects of a chemical on its interaction partners. For each organism, STITCH provides a global network; however, not all proteins have the same pattern of spatial expression (Fig. 4).
In Fig. 4, stronger associations are represented by thicker lines. Protein-protein interactions are shown in grey, chemical-protein interactions in green, and interactions between chemicals in red. Network nodes represent proteins: splice isoforms or post-translational modifications are collapsed, i.e., each node represents all the proteins produced by a single, protein-coding gene locus. Small nodes indicate proteins of unknown 3D structure, while large nodes indicate that some 3D structure is known or predicted. Edges represent protein-protein associations: associations are meant to be specific and meaningful, i.e., proteins jointly contribute to a shared function; this does not necessarily mean they are physically binding each other. Edge confidence: low (0.150); medium (0.400); high (0.700); highest (0.900).
The predicted functional partners of this gene-protein are:
- S-adenosylmeth: S-adenosylmethionine; Physiologic methyl radical donor involved in enzymatic transmethylation reactions and present in all living organisms. It possesses anti-inflammatory activity and has been used in the treatment of chronic liver disease. Score: 0.981
- ZNF496: zinc finger protein 496; DNA-binding transcription factor that can act as both an activator and a repressor (587 aa). Score: 0.951
- PLOD1: procollagen-lysine, 2-oxoglutarate 5-dioxygenase 1; Forms hydroxylysine residues in -Xaa-Lys-Gly- sequences in collagens, essential for the stability of intermolecular collagen cross-links (727 aa). Score: 0.947
- PLOD2: procollagen-lysine, 2-oxoglutarate 5-dioxygenase 2; Forms hydroxylysine residues in -Xaa-Lys-Gly- sequences in collagens, essential for the stability of intermolecular collagen cross-links (758 aa). Score: 0.902
- AdoHcy: 5'-S-(3-Amino-3-carboxypropyl)-5'-thioadenosine. Formed from S-adenosylmethionine after transmethylation reactions. Score: 0.900
- hydrogen: In chemistry, a hydron is the general name for a cationic form of atomic hydrogen, represented with the symbol H+. The term "proton" refers to the cation of protium, the most common isotope of hydrogen. The term "hydron" includes cations of hydrogen regardless of their isotopic composition: thus it refers collectively to protons (1H+) for the protium isotope, deuterons (2H+ or D+) for the deuterium isotope, and tritons (3H+ or T+) for the tritium isotope. Unlike other ions, the hydron consists only of a bare atomic nucleus. Score: 0.900
- PLOD3: procollagen-lysine, 2-oxoglutarate 5-dioxygenase 3; Forms hydroxylysine residues in -Xaa-Lys-Gly- sequences in collagens, essential for the stability of intermolecular collagen cross-links (738 aa). Score: 0.900
- CDK9: cyclin-dependent kinase 9; Protein kinase involved in the regulation of transcription. Member of the cyclin-dependent kinase pair (CDK9/cyclin-T) complex, also called positive transcription elongation factor b (P-TEFb), which facilitates the transition from abortive to productive elongation by phosphorylating the C-terminal domain of RNA polymerase II. Score: 0.898
- AR: androgen receptor; Steroid hormone receptors are ligand-activated transcription factors that regulate eukaryotic gene expression and affect cellular proliferation and differentiation in target tissues. Transcription factor activity is modulated by bound coactivator and corepressor proteins. Transcription activation is down-regulated by NR0B2. Activated, but not phosphorylated, by HIPK3 and ZIPK/DAPK3. Score: 0.874
- BMP4: bone morphogenetic protein 4; Induces cartilage and bone formation. Also acts in mesoderm induction, tooth development, limb formation, and fracture repair. Acts in concert with PTHLH/PTHRP to stimulate ductal outgrowth during embryonic mammary development and to inhibit hair follicle induction. Score: 0.859
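To relate the partner scores listed above to the edge-confidence cut-offs quoted for the STITCH network, the scores can be grouped into confidence bands. The following is a minimal sketch rather than part of STITCH itself: the helper name is illustrative, and treating each cut-off as an inclusive lower bound is an assumption.

```python
# Edge-confidence cut-offs as quoted in the text for the STITCH network.
# Assumption: each cut-off is an inclusive lower bound for its band.
BANDS = [(0.900, "highest"), (0.700, "high"), (0.400, "medium"), (0.150, "low")]


def confidence_band(score):
    """Return the label of the first band whose cut-off the score reaches."""
    for cutoff, label in BANDS:
        if score >= cutoff:
            return label
    return "below low-confidence cut-off"


# Predicted functional partners and scores, copied from the list in the text.
partners = {
    "S-adenosylmethionine": 0.981,
    "ZNF496": 0.951,
    "PLOD1": 0.947,
    "PLOD2": 0.902,
    "AdoHcy": 0.900,
    "hydron": 0.900,
    "PLOD3": 0.900,
    "CDK9": 0.898,
    "AR": 0.874,
    "BMP4": 0.859,
}

# Group partner names by confidence band, preserving the listed order.
by_band = {}
for name, score in partners.items():
    by_band.setdefault(confidence_band(score), []).append(name)

for label, names in by_band.items():
    print(f"{label}: {', '.join(names)}")
```

Under this reading, seven of the ten partners fall in the highest-confidence band, and CDK9, AR, and BMP4 fall in the high-confidence band.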
Understanding gene interactions holds the key to personalized medicine.
GeneMANIA is a program that identifies the most related genes to a query gene set using a guilt-by-association approach. The plugin uses a large database of functional interaction networks from multiple organisms, and each related gene is traceable to the source network used to make the prediction.
GeneMANIA was used to search for interactions among the 6 actionable genes. It reported Networks: Co-expression: 100%, showing 20 related genes, with 26 total genes, 0 attributes, and 33 total links (Fig. 5).
Taking the above into account, it is important to evaluate the patient's clinical condition, conduct reverse phenotyping, and search for other elements of multimodal diagnosis in pursuit of personalized, preventive, predictive, and precision medicine.