Precursor-independent MS/MS-based N-glycome profiling of 20 mouse tissues
To generate a consistent N-glycome dataset, we collected tissues, as well as serum, in duplicates from age-matched C57BL/6J mice, and processed all samples using identical protein extraction, enzymatic N-glycan release (i.e. PNGase F), chemical reduction and glycan clean-up protocols. All samples were analyzed using porous graphitic carbon liquid-chromatography (PGC-LC) coupled to a high-resolution tandem (MS/MS) mass-spectrometer (i.e. Orbitrap Exploris 480).
To survey this data, we first assessed the MS/MS data for the occurrence of diagnostic, N-glycan derived fragment ions, independent of intact glycan precursor mass information. For this, we automatically extracted all MS/MS spectra and retained only those that contained N-glycan specific fragment ions (i.e. 224.1 amu, diagnostic for reduced N-acetylhexosamine). This first breakdown of our dataset confirmed that, overall, more than 40 percent of all MS/MS spectra generated in this study (i.e. 220,506 out of 509,283 MS/MS spectra) contained information on chemically reduced glycan precursor ions. Importantly, however, we also found considerable differences in the relative proportion of N-glycan-derived MS/MS spectra between tissues, ranging from approx. 20% for ileum to more than 80% for seminal vesicle (Supplementary Fig. 1.A). This tissue-dependent variability in the dynamic range and structural complexity prompted further investigation into tissue-specific N-glycome features such as sialylation or fucosylation, independent of intact glycan precursor information.
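The precursor-independent filtering step described above can be sketched as follows. This is a minimal illustration, not the code used in the study: the spectrum layout (a dict with a `peaks` list of m/z and intensity pairs), the function names, and the matching tolerance of 0.05 are assumptions; only the diagnostic ion mass (224.1 amu) and the spectrum counts are taken from the text.

```python
def contains_fragment(peaks, target_mz, tol=0.05):
    """Return True if any (m/z, intensity) peak lies within tol of target_mz."""
    return any(abs(mz - target_mz) <= tol for mz, _ in peaks)


def filter_glycan_spectra(spectra, target_mz=224.1, tol=0.05):
    """Keep only spectra that contain the reduced-HexNAc diagnostic ion.

    The tolerance is a hypothetical value chosen for illustration.
    """
    return [s for s in spectra if contains_fragment(s["peaks"], target_mz, tol)]


# Toy spectra (hypothetical data): only scan_1 carries the 224.1 ion.
spectra = [
    {"id": "scan_1", "peaks": [(204.09, 1.0e4), (224.11, 5.0e4), (366.14, 2.0e4)]},
    {"id": "scan_2", "peaks": [(175.12, 8.0e3), (301.20, 1.2e4)]},
]
kept = filter_glycan_spectra(spectra)
fraction = len(kept) / len(spectra)

# Fraction of glycan-derived spectra reported in the text.
reported_fraction = 220506 / 509283
```

Applied to the counts given above, `reported_fraction` evaluates to roughly 0.43, consistent with the statement that more than 40 percent of all MS/MS spectra were glycan-derived.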
Sialylation is essential to mammalian development31 and results from the attachment of either N-acetyl-neuraminic acid (Neu5Ac) or N-glycolyl-neuraminic acid (Neu5Gc). To dissect this important compositional heterogeneity of N-glycans, we extended our MS/MS data-filtering criteria to sialylation-specific diagnostic fragment ions (i.e. Neu5Ac fragment ion mass 292.1 amu, Neu5Gc fragment ion mass 308.1 amu). This revealed substantial differences in the number of Neu5Ac- and/or Neu5Gc-containing spectra across tissues, ranging from approximately 80% in serum, lung, and heart to only about 18% in the seminal vesicle (Fig. 1., Supplementary data). Most tissues exhibited comparable levels of both sialic acid variants (Supplementary Fig. 1.B) (mean ratio Neu5Gc:Neu5Ac = 1.5), except for serum (Neu5Gc:Neu5Ac = 11.2), colon (Neu5Gc:Neu5Ac = 0.3), and brain (Neu5Gc:Neu5Ac = 0.1), which showed notably different ratios. These data corroborated previous reports that the murine brain N-sialome is vastly dominated by Neu5Ac and contains only trace amounts of Neu5Gc28,32,33. Overall, heart, kidney, liver, lung, mammary gland, serum, skin, spleen, testis, and thymus showed higher numbers of sialic acid-containing spectra compared to the mean value across all tissues, while bladder, brain, colon, duodenum, ileum, jejunum, lymph node, seminal vesicle, pancreas, and white adipose tissue exhibited lower numbers (Supplementary Fig. 2.A, 2.B, and 2.C).
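As an illustration of how such a Neu5Gc:Neu5Ac ratio could be derived from spectrum counts, consider the sketch below. It assumes that the ratio is computed from the number of spectra containing each diagnostic ion, which the text suggests but does not state explicitly; the function names, the tolerance, and the toy spectra are hypothetical.

```python
NEU5AC_MZ = 292.1  # Neu5Ac diagnostic fragment ion (from the text)
NEU5GC_MZ = 308.1  # Neu5Gc diagnostic fragment ion (from the text)


def has_ion(peaks, target_mz, tol=0.05):
    """True if a peak within tol of target_mz is present (tol is an assumption)."""
    return any(abs(mz - target_mz) <= tol for mz, _ in peaks)


def sialic_acid_summary(spectra, tol=0.05):
    """Count Neu5Ac- and Neu5Gc-containing spectra and derive their ratio."""
    n_ac = sum(has_ion(p, NEU5AC_MZ, tol) for p in spectra)
    n_gc = sum(has_ion(p, NEU5GC_MZ, tol) for p in spectra)
    n_any = sum(has_ion(p, NEU5AC_MZ, tol) or has_ion(p, NEU5GC_MZ, tol)
                for p in spectra)
    return {
        "neu5ac": n_ac,
        "neu5gc": n_gc,
        "fraction_sialylated": n_any / len(spectra) if spectra else 0.0,
        "gc_to_ac": n_gc / n_ac if n_ac else float("inf"),
    }


# Toy peak lists (hypothetical): one Neu5Ac, two Neu5Gc, one non-sialylated.
spectra = [
    [(224.11, 1e4), (292.10, 3e4)],
    [(224.11, 1e4), (308.10, 2e4)],
    [(224.11, 2e4), (308.10, 5e4)],
    [(224.11, 4e4), (366.14, 1e4)],
]
summary = sialic_acid_summary(spectra)
```

In this toy case the Neu5Gc:Neu5Ac ratio is 2.0 and three of four spectra count as sialylated.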
Similar to sialylation, fucosylation is a vital, developmentally controlled modification of N-glycans, which has been implicated in numerous cell-cell interactions34 and massively expands the structural heterogeneity of the N-glycome. Fucose can be linked to the most proximal core-GlcNAc residue, originally connected to the protein (i.e. core-fucose). Additionally, fucose residues have been found linked to either galactoses or N-acetylhexosamines in multiple positions of the distal parts of N-glycan antennae (i.e. distal fucosylation). Interestingly, distal fucosylation also encodes a series of essential, immune-reactive glycan-epitope isomers, comprising one (e.g. Lewis X, sialyl Lewis X, Lewis A, sialyl Lewis A, blood-group H type 1 and type 2) or more fucose residues (e.g. Lewis Y, Lewis B), in humans. The gene encoding fucosyltransferase 3 (FUT3), the enzyme that catalyzes the formation of the Lewis A and Lewis B epitopes in humans, was found to be a pseudogene in mice. The murine system is thus believed to lack these two important fuco-epitopes35,36. Unfortunately, most of these fucosylated N-glycan isomers cannot be resolved by MS/MS alone37–39. As a consequence, our precursor-independent MS/MS profiling approach, which is based on fragment ions that are indicative of fucosylated N-glycans37–39 (i.e. one fucose linked to one HexNAc and to one hexose; fragment ion mass = 512.2 amu), merely reflects the combined expression patterns of core fucose, Lewis X, sialyl Lewis X, Lewis Y, blood-group H (bgH) type 1 and type 2, but not Lewis A or Lewis B, in murine tissues. Our analysis showed that fucosylated N-glycans were indeed present in all tissues, with exceptionally high levels in seminal vesicle (85%)40, kidney (65%), and brain (50%) (Fig. 1., Supplementary Fig. 2.E).
This suggested generally high expression levels of fucosyltransferases in these tissues, such as Fut9, which is highly expressed in kidney and brain41, Fut2 and Fut4, which are highly expressed in colon epithelial cells, or Fut8, which is globally expressed in mouse42.
Another distinctive structural feature of N-glycans is bisecting N-acetylglucosamine (GlcNAc). Bisecting GlcNAc residues are β1,4-linked to the core β-mannose residue by MGAT3 and have been reported to be vital to fetal development43, immunity, and cell adhesion44. Expression of this critical structural modification of N-glycans was profiled across all tissues, based on the diagnostic fragment ion of mass 792.3 amu (i.e. one reduced GlcNAc-residue, two GlcNAc-residues, and one hexose residue). As expected, we found bisecting GlcNAc expression levels to be highly tissue specific, with the highest expression levels in brain (4%), kidney (2.5%), and colon (2%) (Fig. 1., Supplementary Fig. 2.D), all in good agreement with mouse Mgat3 gene expression data45.
Moreover, an extensive screening of all tissues was conducted to identify fragment ions indicative of the Sda antigen. This antigen is characterized by a Neu5Ac residue α2,3-linked and a GalNAc residue β1,4-linked to the galactose within a LacNAc motif (860.3 amu)46. Intriguingly, this specific structural modification was predominantly expressed in the colon (9%), followed by the jejunum (3.5%) and duodenum (1%) (Supplementary Fig. 1.C). Notably, the ileum exhibited only minimal expression, accounting for less than 0.25% of all N-glycan associated MS/MS spectra. In previous studies, the Sda antigen has been identified in the colon of healthy humans, on the Tamm-Horsfall glycoprotein of Sda+ individuals, and in the serum of patients with gastric cancer46. In humans, the biosynthesis of this antigen is driven by the B4GALNT2 gene, and the expression of its mouse ortholog (B4Galnt2) indeed appears to be restricted to the intestinal organs42.
Automated MS/MS-data-driven reconstruction of the mouse N-glycome.
To determine the N-glycan precursors contributing to the observed N-glycome signatures, we implemented a data-aggregation workflow to reconstruct N-glycosylation patterns at the precursor level. This workflow efficiently reduced PGC-LC-MS data complexity, leveraged MS/MS information for precursor identification, and extracted quantitative precursor information. The extraction of precursor mass information from LC-MS/MS data is often complicated by the imperfection of mass-spectrometric data, due to e.g. incorrect mono-isotopic peak-picking or charge-state assignments by the mass-spectrometer, unintended precursor ion co-isolation, in-source fragmentation, or excessive adduct-ion formation. To address these challenges, we conducted post-analysis charge-deconvolution and deisotoping of all MS spectra (using DeCon247, Supplemental Information), assigned precursor mass and intensity values to mass bins (range of 1000–4000 Da; bin-width = ± 0.05 amu), and summed the values across the entire chromatographic time-range using custom code. This transformation of time-resolved LC-MS data into two-dimensional precursor mass-to-intensity arrays not only improved the robustness of mono-isotopic precursor mass assignments, but also efficiently reduced the dimensionality of our dataset. Additionally, these data arrays facilitated the construction of tissue-specific quantitative histograms (Fig. 2.A, Supplementary data), resembling MALDI-TOF MS spectra frequently communicated in the field of glycomics, providing a convenient data visualization format that integrates seamlessly with current glycome data repositories (e.g. CFG, www.functionalglycomics.org; Fig. 2.A).
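The binning and summation step can be sketched in a few lines. This is an illustrative reconstruction under stated assumptions, not the custom code referred to above: the input is assumed to be a list of deconvoluted, deisotoped features given as (mass, intensity, retention time) tuples, and the bin-width of ± 0.05 amu is interpreted as bins 0.1 wide centered on multiples of 0.1.

```python
from collections import defaultdict


def bin_center(mass, width=0.1):
    """Center of the mass bin (width 0.1, i.e. +/- 0.05) that contains mass."""
    return round(round(mass / width) * width, 4)


def aggregate_precursors(features, lo=1000.0, hi=4000.0, width=0.1):
    """Collapse time-resolved features into a mass-to-intensity histogram.

    features: iterable of (mass, intensity, retention_time) tuples
    (hypothetical layout). Intensities are summed over the whole run,
    so the retention time is deliberately ignored here.
    """
    histogram = defaultdict(float)
    for mass, intensity, _rt in features:
        if lo <= mass <= hi:
            histogram[bin_center(mass, width)] += intensity
    return dict(histogram)


# Toy features (hypothetical): two scans of the same precursor, one other
# glycan mass, and one signal below the 1000 Da lower limit.
features = [
    (1789.67, 1.0e6, 20.1),
    (1789.69, 2.0e6, 20.4),
    (2371.86, 5.0e5, 31.0),
    (850.30, 9.0e9, 5.0),
]
histogram = aggregate_precursors(features)
```

The two 1789.7 features fall into the same bin and are summed, while the 850.3 signal is discarded because it lies outside the 1000–4000 Da range.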
In a next step, raw MS/MS data were refined (i.e. mono-isotopic peak picking and charge state assignment re-evaluation) and converted into the generic .mgf file-format using the proteomics software PEAKS48. From this, to identify MS/MS spectra that derived from N-glycan precursors and to stringently control for unintended precursor ion co-isolation events, we calculated spectrum-specific Score-values (i.e. SNOG-score) from the intensity of an N- (and O-) Glycan-specific fragment ion (i.e. oxonium ion of the reduced-end monosaccharide GlcNAc; 224.1 amu), using custom code (Supplemental data). Empirically, MS/MS spectra with SNOG-scores greater than 0.03 were determined to derive from actual N-glycan precursors (Supplementary Fig. 3., Supplementary Fig. 4., and Supplementary Fig. 5.). Subsequently, precursor mass information of SNOG-scored MS/MS spectra was aligned with the initial two-dimensional MS precursor mass-to-intensity arrays, and only MS signals of true N-glycan precursors above a cumulative intensity threshold of 5E+6 were retained.
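A possible form of this scoring and filtering logic is sketched below. The exact SNOG formula is not given in this section, so the sketch assumes the score is the intensity of the 224.1 ion divided by the summed intensity of all fragment ions in the spectrum; that definition, the tolerance, and all function names are assumptions. The cut-off of 0.03 and the intensity threshold of 5E+6 are taken from the text.

```python
def snog_score(peaks, target_mz=224.1, tol=0.05):
    """Relative intensity of the diagnostic ion (assumed score definition)."""
    total = sum(intensity for _, intensity in peaks)
    if total <= 0:
        return 0.0
    hit = sum(i for mz, i in peaks if abs(mz - target_mz) <= tol)
    return hit / total


def retain_glycan_bins(histogram, scored_precursors, score_cutoff=0.03,
                       min_intensity=5e6, width=0.1):
    """Keep histogram bins backed by a high-scoring spectrum and enough signal.

    histogram: {bin_center: summed MS intensity}
    scored_precursors: iterable of (precursor_mass, snog_score) pairs
    """
    glycan_bins = {
        round(round(mass / width) * width, 4)
        for mass, score in scored_precursors
        if score > score_cutoff
    }
    return {
        b: i for b, i in histogram.items()
        if b in glycan_bins and i > min_intensity
    }


# Toy example (hypothetical values).
glycan_like = [(224.11, 50.0), (366.14, 150.0), (528.19, 800.0)]
background = [(224.11, 10.0), (500.00, 990.0)]

histogram = {1789.7: 8.0e6, 2371.9: 2.0e6, 1500.5: 9.0e6}
scored = [(1789.67, 0.05), (2371.86, 0.10), (1500.52, 0.01)]
retained = retain_glycan_bins(histogram, scored)
```

Here only the 1789.7 bin survives: the 2371.9 bin is below the intensity threshold and the 1500.5 bin has no spectrum scoring above 0.03.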
The number of automatically identified glycan precursor masses greatly varied between tissues. For example, while SNOG-filtering reduced the total number of glycan-derived precursor mass-bins by approx. 30% in brain, it removed approx. 75% of all precursor mass-bins in liver (Supplementary Fig. 4.B). This, again, highlighted important tissue-dependent differences in the dynamic range and the structural complexity of N-glycomes, and suggested sample-specific background signals at the precursor level. Manual inspection of MS/MS spectra of low-scoring mass-bins (i.e. rejected) confirmed, for example, exceptionally high levels of non-N-glycan signals that derived from hexose-oligomers (e.g. dextrans or maltodextrins degradation products of glycogen49) in liver, which were efficiently removed by our approach (Supplementary Fig. 5.).
The correlation and cluster analysis of our SNOG-filtered dataset (Fig. 3.) validated robust reproducibility among sample replicates and revealed distinctive tissue-specific clustering patterns. Notably, we found clustering even among distantly related mouse tissues, such as the exocrine organs, i.e. seminal vesicles and pancreas, clustering with brain, or the central organs of the immune system, spleen and thymus, coalescing with mammary glands, for which we observed convergent N-glycan signatures. Intriguingly, while spleen and thymus share a cluster, lymph nodes exhibit a glycosylation pattern more akin to white adipose tissue, potentially influenced by spatial/histological proximity.
A distinctive glycosylation pattern set apart the tissues of the small intestine, particularly the jejunum and duodenum, constituting a distinct cluster separate from the ileum and colon. Conversely, the ileum and colon formed a cluster with the bladder and skin, suggesting a potential association rooted in shared epithelial glycosylation patterns. Additionally, our observations extend to the highly blood-perfused organs, where the liver and lung share a cluster with serum, while heart, kidney, and testis form another distinct cluster. This clustering pattern provides insights into the glycosylation variations within these organ groups, potentially reflecting their functional relationships or physiological roles.
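To make the idea of comparing tissue histograms concrete, the following sketch computes a pairwise correlation between two precursor mass-to-intensity histograms. The correlation measure used for Fig. 3 is not specified in this section; Pearson correlation is used here purely as an assumption, and the function names and toy histograms are hypothetical.

```python
import math


def aligned_vectors(hist_a, hist_b):
    """Align two {mass_bin: intensity} histograms on the union of their bins."""
    bins = sorted(set(hist_a) | set(hist_b))
    return ([hist_a.get(b, 0.0) for b in bins],
            [hist_b.get(b, 0.0) for b in bins])


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # correlation is undefined for a constant vector
    return cov / (sd_x * sd_y)


# Toy histograms (hypothetical): two replicates and one unrelated sample.
replicate_1 = {1789.7: 1.0, 2371.9: 4.0}
replicate_2 = {1789.7: 2.0, 2371.9: 8.0}
other_sample = {1789.7: 1.0, 2257.8: 9.0}

r_replicates = pearson(*aligned_vectors(replicate_1, replicate_2))
r_other = pearson(*aligned_vectors(replicate_1, other_sample))
```

A matrix of such pairwise coefficients could then serve as input to a hierarchical clustering routine; the replicates in this toy case correlate almost perfectly, whereas the unrelated sample does not.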
Dissecting the mouse N-glycome based on sub-structural determinants.
To systematically query and stratify N-glycan precursors based on fragment ion data, in a next step we extended our initial SNOG-scoring scheme by additional N-glycan specific fragment ions (i.e. eSNOG). MS/MS spectrum-specific eSNOG-scores were thus calculated from the relative intensities of sub-structure specific diagnostic fragment ions and empirically determined, sub-structure specific eSNOG cut-off values were used for down-stream data-filtering (Supplementary Fig. 6.). Efficiently limiting the impact of potential gas-phase rearrangements50, this versatile data-filtering approach allowed for automated stratification of our reconstructed N-glycome data based on sub-structure specific features.
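The extension from a single diagnostic ion to several sub-structure specific ions can be illustrated as follows. The sketch assumes that each eSNOG score is the relative intensity of one diagnostic ion within the spectrum and that a spectrum receives every label whose score exceeds its cut-off. The ion masses are those quoted in this study for fucosylation and the two sialic acids; the cut-off values are placeholders, since the empirically determined values are reported in Supplementary Fig. 6 and are not reproduced here.

```python
# Diagnostic fragment ions quoted in the text (amu).
DIAGNOSTIC_IONS = {
    "fucosylated": 512.2,
    "Neu5Ac": 292.1,
    "Neu5Gc": 308.1,
}

# Hypothetical cut-offs for illustration only.
CUTOFFS = {"fucosylated": 0.02, "Neu5Ac": 0.02, "Neu5Gc": 0.02}


def esnog_scores(peaks, tol=0.05):
    """Relative intensity of each diagnostic ion (assumed score definition)."""
    total = sum(intensity for _, intensity in peaks)
    scores = {}
    for name, target in DIAGNOSTIC_IONS.items():
        hit = sum(i for mz, i in peaks if abs(mz - target) <= tol)
        scores[name] = hit / total if total > 0 else 0.0
    return scores


def classify(peaks):
    """Return all sub-structure labels passing their cut-off, else 'other'."""
    scores = esnog_scores(peaks)
    labels = [name for name, score in scores.items() if score > CUTOFFS[name]]
    return labels or ["other"]


# Toy spectra (hypothetical intensities).
fuc_sia = [(224.11, 100.0), (292.10, 300.0), (512.20, 100.0), (657.24, 500.0)]
plain = [(224.11, 100.0), (366.14, 900.0)]
```

Because a spectrum can carry several labels at once, the categories in such a scheme are not mutually exclusive.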
At first, we used eSNOG values to automatically stratify the N-glycome of mouse kidney (Fig. 4.) based on signals that contained: (I) distal-fucosylated, (II) Neu5Gc-sialylated, (III) Neu5Ac-sialylated, (IV) alpha-galactosylated, (V) oligomannosidic N-glycans, and (VI) compositions which do not fall into any of these categories (i.e. predominantly undecorated or unusually decorated structures). In total, our eSNOG approach automatically uncovered 227 fucose-, 180 Neu5Gc-, 207 Neu5Ac-, 31 alpha-Gal-containing N-glycan precursor mass-bins, as well as 82 N-glycan precursors that do not fall into any of the other five categories. When accumulating the MS signal intensities of all N-glycan precursors that fell into these six different categories, distal fucosylated N-glycans made up ~ 34% of the total ion current (TIC), Neu5Gc-sialylated mass-bins ~ 30%, Neu5Ac-sialylated mass-bins ~ 23%, alpha-Gal containing mass-bins ~ 4%, and oligomannose-associated mass-bins ~ 8%. Glycan compositions that did not conform to any of these categories (i.e. undecorated structures) represented ~ 16% of the TIC in kidney.
Manual inspection of the stratified kidney N-glycome dataset (Fig. 4.) confirmed that our approach accurately classified N-glycan precursor mass-bins. For example, based on their precursor masses, biantennary, core-fucosylated and di-sialylated N-glycans were correctly assigned to either contain only Neu5Ac (2371.9 amu, Hex5HexNAc4Fuc1Neu5Ac2), only Neu5Gc (2403.9 amu, Hex5HexNAc4Fuc1Neu5Gc2), or one Neu5Ac as well as one Neu5Gc-residue (2387.9 amu, Hex5HexNAc4Fuc1Neu5Ac1Neu5Gc1) (Fig. 4.A). Similarly, the mass-bins of 2257.8 amu and 2113.8 amu were correctly assigned to represent Neu5Gc sialylated (Hex5HexNAc4Neu5Gc2) and alpha-galactosylated (Hex7HexNAc4Fuc1) N-glycan precursor compositions, respectively. Furthermore, the mass-bin of 2039.7 amu was classified to contain Neu5Ac, which allowed us to accurately define its composition as Hex6HexNAc3Fuc1Neu5Ac1, corresponding to a Neu5Ac-sialylated hybrid-type N-glycan. The Neu5Gc-capped analogue of the same glycan at 2055.7 amu (i.e. Hex6HexNAc3Fuc1Neu5Gc1) was correctly classified as Neu5Gc-containing N-glycan. The most abundant fucosylated N-glycan precursor in mouse kidney corresponded to a presumably bisected, bi-antennary N-glycan with three fucose residues (2284.9 amu) (Fig. 4.A). Other fucosylated precursor masses corresponded to tri- or tetra-antennary extensions of this N-glycan composition, with or without bisecting GlcNAc and with 3 to 5 fucose residues (e.g. Hex6HexNAc5Fuc3/4, Hex6HexNAc6Fuc3/4, Hex7HexNAc6Fuc3/4/5, Hex7HexNAc7Fuc3/4/5), confirming previous studies51,52. Abundant N-glycan precursor masses that were classified as “undecorated”, included the mass of 1789.7 amu, which corresponded to the composition Hex5HexNAc4Fuc1, as well as smaller, truncated N-glycans, including the compositions Hex3HexNAc2Fuc1 and Hex3HexNAc3Fuc1 (Fig. 4.A), which are indeed not modified by distal fucose or sialic acid residues.
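The precursor masses quoted above can be checked against their proposed compositions with a short calculation. The sketch assumes that the reported values are singly protonated monoisotopic masses of reduced N-glycans (sum of residue masses plus water, two hydrogens from reduction, and one proton); this interpretation is not stated explicitly in the text but reproduces the quoted numbers. The residue masses are standard monoisotopic values; the function name is hypothetical.

```python
# Monoisotopic residue masses (Da) of common N-glycan building blocks.
RESIDUE_MASS = {
    "Hex": 162.052824,
    "HexNAc": 203.079373,
    "Fuc": 146.057909,
    "Neu5Ac": 291.095417,
    "Neu5Gc": 307.090331,
}
WATER = 18.010565
REDUCTION = 2.015650  # two hydrogens added on reduction to the alditol
PROTON = 1.007276


def reduced_glycan_mh(composition):
    """Singly protonated mass of a reduced glycan (assumed convention)."""
    residues = sum(RESIDUE_MASS[name] * count
                   for name, count in composition.items())
    return residues + WATER + REDUCTION + PROTON


examples = {
    "Hex5HexNAc4Fuc1Neu5Ac2": {"Hex": 5, "HexNAc": 4, "Fuc": 1, "Neu5Ac": 2},
    "Hex5HexNAc4Fuc1Neu5Gc2": {"Hex": 5, "HexNAc": 4, "Fuc": 1, "Neu5Gc": 2},
    "Hex5HexNAc4Neu5Gc2": {"Hex": 5, "HexNAc": 4, "Neu5Gc": 2},
    "Hex7HexNAc4Fuc1": {"Hex": 7, "HexNAc": 4, "Fuc": 1},
    "Hex5HexNAc4Fuc1": {"Hex": 5, "HexNAc": 4, "Fuc": 1},
}
for label, comp in examples.items():
    print(f"{label}: {reduced_glycan_mh(comp):.1f}")
```

Under this convention the five compositions evaluate to values within 0.05 of the 2371.9, 2403.9, 2257.8, 2113.8, and 1789.7 amu given in the text.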
Capitalizing on our automated data-analysis workflows, we next stratified all other N-glycome datasets into the same six N-glycan categories and compared their relative abundances across the 20 murine tissues analyzed (Fig. 4.B). This comparative analysis allowed us to further dissect the structural complexity of the mouse N-glycome at the precursor level and revealed a remarkable diversity in glycosylation patterns across tissues.
From our initial precursor-independent profiling analyses we found that serum was largely dominated by Neu5Gc-bearing N-glycans, representing up to 96% of the TIC53. Additionally, our N-glycan precursor-informed data now revealed that this important trait of the murine serum N-glycome was essentially derived from only two biantennary, non-fucosylated N-glycans (Hex5HexNAc4Neu5Gc1 and Hex5HexNAc4Neu5Gc2), totaling ~ 57% of all its Neu5Gc-containing structures. The sialylated fraction of the brain N-glycome, on the other hand, consisted almost entirely of Neu5Ac-decorated N-glycan species, in good agreement with previous studies25,54. Based on our eSNOG-stratified precursor information, the N-sialome of the brain presented as highly diverse, comprising truncated (e.g. Hex4HexNAc3Fuc1Neu5Ac1), hybrid-type (e.g. Hex6HexNAc3Fuc1Neu5Ac1), bisecting GlcNAc-containing (e.g. Hex5HexNAc5Fuc1Neu5Ac2) and highly complex tetra-antennary N-glycans (e.g. Hex7HexNAc6Fuc3Neu5Ac2). Additionally, we found approx. 50% of the sialylated N-glycan species in brain to also carry distal fucoses. From this data we calculated that almost 50% of the brain N-glycome is sialylated. Of note, the degree of sialylation reported for the brain varies vastly between studies, with previous estimates ranging from only ~ 3%25, over 20%29, or up to ~ 40% of sialylation55, depending on methodology used.
Next to brain, kidney and seminal vesicle stood out by their comparably high abundance of distal fucosylated N-glycans. Different from brain and kidney, however, seminal vesicles showed extremely low levels of sialylation, with only ~ 3% of the total N-glycome being either Neu5Ac- or Neu5Gc-sialylated. Instead, the seminal vesicle N-glycome was dominated by distal fucosylated (~ 75%) N-glycans that were at the same time alpha-galactosylated (~ 45%). The most abundant precursor compositions of this type corresponded to a series of tri-antennary, core-fucosylated N-glycans with permutational additions of alpha-galactose and/or distal fucose (i.e. Hex6HexNAc5Fuc4 – Hex7HexNAc5Fuc3 – Hex8HexNAc5Fuc2), suggesting that fucosylation and alpha-galactosylation on the same antenna are mutually exclusive. Similarly high levels of alpha-galactosylated N-glycans were only found in the pancreatic N-glycome (~ 47%), which also presented with rather low levels of sialylation (approx. 10%) and distal fucosylation (approx. 15%). In pancreas, the most abundant alpha-galactosylated N-glycans ranged from biantennary, with or without core-fucose, with one or two alpha-galactosylated or fucosylated antennae (e.g. Hex6HexNAc4Fuc1, Hex7HexNAc4, Hex6HexNAc4Fuc2, Hex7HexNAc4Fuc1), to tri- and tetra-antennary N-glycans with alpha-galactosylated and/or fucosylated antennae (e.g. Hex7HexNAc5Fuc2, Hex8HexNAc5Fuc1, Hex7HexNAc5Fuc3, Hex8HexNAc5Fuc2, Hex9HexNAc5Fuc1, Hex10HexNAc6Fuc2, Hex11HexNAc6Fuc1). The exceptionally high levels of alpha-Gal in the two exocrine organs, seminal vesicle and pancreas, were not observed in previous mouse studies.
To differentiate the N-glycomes of seminal vesicle and pancreas, we next screened our data for fragment ions that are diagnostic for doubly fucosylated antennae (i.e. Hex1HexNAc1Fuc2, 658.3 amu), hence the Lewis Y-epitope. Our analysis revealed that this N-glycan motif was most specific for seminal vesicle (~ 1.2%) (Supplementary Fig. 7.A) and barely detected in the pancreas of mice. Lewis Y-containing precursors included a series of partially core-fucosylated, bi- (e.g. Hex5HexNAc4Fuc4, Hex5HexNAc4Fuc5) and multi-antennary N-glycans (e.g. Hex6HexNAc5Fuc5, and Hex8HexNAc7Fuc6). This unique glycosylation landscape of the seminal vesicle stands out among all tissues, and it is, to our knowledge, the first glycobiological description of this organ in mammals. Furthermore, the high expression levels of distally fucosylated N-glycans, including those with the Lewis Y epitope, correlated with gene expression data of the respective fucosyltransferases (i.e. Fut2, Fut4), as well as previous reports on the glycome of human seminal plasma40,56. The functional implications of these exceptionally high levels of distal fucose and alpha-galactose in exocrine organs remain to be explored.
Deep mining the mouse N-glycome uncovers unusual structural modifications.
To comprehensively annotate the results of our non-targeted data-analysis approach we used an in silico constructed mouse N-glycome database, holding an extensive list of canonical mouse N-glycans (i.e. glycoDB; 1429 unique N-glycan compositions, 960 precursor ion mass bins; Supplementary). Surprisingly, the automated annotation of our N-glycome histograms with this glycoDB highlighted several precursor masses, which, despite their consistent and reproducible detection, could not be explained by known, conventionally computed or anticipated N-glycan compositions. We hypothesized that this remarkably large population of unknown N-glycan precursor masses (i.e. approx. 15% of TIC across all samples; ranging from approx. 6% TIC in ileum, to up to approx. 37% in liver; Supplementary Fig. 4.C and 4.D) would either result from experimental noise in the MS raw-data or represent unusual N-glycan compositions. Manual inspection of underlying MS/MS spectra uncovered and confirmed a series of unusual diagnostic fragment ions that were indicative of rare or non-canonical N-glycan structures, such as those bearing the HNK-1 epitope, sulfated HexNAc residues, doubly sialylated antennae, fucosylated LacdiNAc structures, or acetylated sialic acids (i.e. Ac-Neu5Ac and Ac-Neu5Gc), not covered by our in-silico N-glycome model database. Systematic mining of our dataset across all tissues using our eSNOG approach revealed striking tissue-specificity for most of these non-canonical N-glycan modifications.
HNK-1 is a unique trisaccharide epitope that consists of glucuronic acid linked to galactose, which is further linked to a GlcNAc residue. Additionally, HNK-1 has been reported to exist in a sulfated and a non-sulfated form (i.e. HSO3-3GlcA-beta1,3-Gal-beta-1,4-GlcNAc or GlcA-beta-1,3-Gal-beta-1,4-GlcNAc)57,58. This usually unregarded modification was previously identified in brain28,57 and kidney58, mainly in biantennary complex type N-glycan structures. The attachment of glucuronic acid is known to be catalyzed by two, highly tissue-specific, homologous enzymes, namely GlcAT-P (B3gat1) in brain and GlcAT-S (B3gat2) in kidney. Intriguingly, while GlcAT-P activity is suppressed by bisecting GlcNAc residues, the activity of GlcAT-S is not affected59. Mining the entire mouse N-glycome for diagnostic fragments that relate to HNK-1, either sulfated or non-sulfated (i.e. SO4-HexA-Hex-HexNAc; HexA-Hex-HexNAc), confirmed that HNK-1 was indeed only expressed in kidney (~ 1.5%)58 and brain (~ 0.5%) (Fig. 5.A). The sulfated form of HNK-1, however, was exclusively found in the brain. In kidney, the most abundant HNK-1 N-glycan precursor corresponded to a triply fucosylated, tri-antennary structure, carrying a bisecting GlcNAc, with the composition Hex6HexNAc6Fuc3HexA1. Other HNK-1 containing N-glycans of this tissue corresponded to tri- and tetra-antennary structures with 1–3 antennary fucose-residues (e.g. Hex6HexNAc6Fuc4HexA1, Hex7HexNAc7Fuc2HexA1, Hex7HexNAc7Fuc3HexA1, Hex7HexNAc7Fuc4HexA1). It is noteworthy that, based on its composition, and corroborated by the underlying MS/MS spectra, we found that the Hex6HexNAc6Fuc4HexA1 N-glycan carried a fucosylated HNK-1 antenna (Supplementary Fig. 8.). To the best of our knowledge, this fucosylated form of HNK-1 has never been found in a natural source and was only artificially synthesized in a previous study60.
Sulfation is considered the most diverse glycan modification with 35 sulfotransferases involved in the process of glycan sulfation61. Although many of these sulfotransferases are thought to serve in decorating O-linked glycosaminoglycan chains, sulfated N-glycans have also been reported for porcine and human pancreas, with the highest levels found within the islets of Langerhans62. Systematic MS/MS-based screening of the different tissues for the relevant fragment ion (HexNAc-SO4–284 amu) revealed that almost all tissues exhibited at least low levels of N-glycan sulfation. In line with previous reports32,62, also in our data the highest amount of HexNAc sulfation was found in pancreas (~ 5%), bladder (~ 2.5%), skin (~ 2%), and brain (~ 1%) (Fig. 5.F). Remarkably, however, the population of sulfated N-glycans in the pancreas was comprised exclusively of bi-antennary glycans, with and without core-fucose (e.g. Hex5HexNAc4Fuc1SO41). Other sulfated structures in the pancreas were found to also carry one or two additional alpha-Gal residues (e.g. Hex6HexNAc4Fuc1SO41), which aligned well with the high expression levels of alpha-Gal in the pancreas found by our study.
The expression of the fucosylated LacdiNAc N-glycan epitope was previously associated with different forms of cancer63, and previous data have shown that LacdiNAc precursors can be fucosylated by FUT964, making it a candidate gene for regulating the tissue-specific biosynthesis of fucosylated LacdiNAc. Recently, fucosylated LacdiNAc containing structures have been identified in the N-glycome of the human brain27 and of the HEK-293 cell-line65,66. Systematic screening of all tissues for glycan precursors that generated the respective diagnostic fragment ion (HexNAc2Fuc1 – 553.2 amu) revealed that this epitope is expressed in a highly tissue-specific manner, with the highest levels found in kidney (~ 3%), colon (~ 2%), skin (~ 0.5%), and brain (~ 0.1%) (Fig. 5.E). These observations were further corroborated by previous reports of Fut9 expression in the proximal tubule of kidney, in intestinal epithelial cells and in neurons42.
Neu5Ac and Neu5Gc can both be installed in either α2,3-, α2,6- or α2,8-linkage, to either galactose or -very rarely in N-glycans- to N-acetyl-hexosamines, such as N-acetyl-glucosamine (i.e. 6-sialyl Lewis C; as reported previously in small amounts in bovine fetuin67) or N-acetyl-galactosamine (e.g. sialyl LacDiNAc). We thus compared the relative abundances of the two sialic acid variants linked to hexoses, the canonical acceptor sites on N-glycans, or to N-acetylhexosamines (i.e. N-acetylglucosamine or N-acetylgalactosamine; 6-sialyl Lewis C, sialyl LacdiNAc, respectively), across all tissues. Again, we eSNOG-filtered all N-glycan derived MS/MS spectra for those that contained fragment ions diagnostic for sialic acids (i.e. fragment ion mass 292.1 amu and 308.1 amu) and for sialylated HexNAc (i.e. fragment ion mass 495.2 amu and 511.2 amu, respectively). As expected, the relative proportions of precursor signals of canonically sialylated N-glycans greatly exceeded those that were generated from the unusual sialyl-HexNAc structures, in all tissues. Also, the relative incorporation rates of Neu5Ac and Neu5Gc were essentially independent of the acceptor monosaccharide across all tissues. Remarkably, in brain Neu5Ac-HexNAc was found on approximately 10% of the sialylated N-glycan fraction.
We then systematically screened all tissues for fragment ions that are diagnostic for the doubly Neu5Ac-capped di-sialyl Lewis C epitope (i.e. 948.3 amu), which has previously been reported for brain32,55, and its compositional analog, the Neu5Gc-decorated di-sialyl Lewis C epitope (i.e. 980.3 amu). Biosynthesis of di-sialyl Lewis C has been associated with the expression of the sialyltransferases ST6GALNAC3, ST6GALNAC5 and ST6GALNAC6, which are all able to catalyze the addition of α2,6 sialic acid to the GlcNAc residue within LacNAc motifs68,69. Interestingly, high expression level of ST6GALNAC5, which in healthy mouse and human is only found in the brain70,71, has been strongly linked to brain-tropism of breast cancer metastasis70. This suggests that di-sialyl Lewis C may be involved in brain microenvironmental interactions facilitating metastatic niche establishment. Our cross-tissue analysis revealed that the expression of Neu5Ac-containing di-sialyl Lewis C is strictly limited to a small number of tissues, with the most prominent being brain (~ 3% of the TIC) and heart (~ 0.7%) (Fig. 5.B). Only minor levels (< 0.2%) of di-sialyl Lewis C N-glycans were found in kidney, testis, and lung. The associated precursor masses in the brain correspond to compositions ranging from potential hybrid-type (e.g. Hex5HexNAc3Fuc1Neu5Ac2, Hex6HexNAc3Fuc1Neu5Ac2) over biantennary complex-type N-glycans with up to four Neu5Ac-residues (e.g. Hex5HexNAc4Fuc1Neu5Ac4), to triantennary N-glycans with four Neu5Ac-residues (e.g. Hex6HexNAc6Fuc1Neu5Ac4). Neu5Gc-decorated variants of the di-sialyl Lewis C-epitope were detected, albeit at low levels, in multiple tissues. The highest expression levels were found in liver (ranging from ~ 2% to ~ 0.5%), heart (~ 1%), testis (ranging from ~ 2% to ~ 0.8%), lung (~ 0.8%), kidney (~ 0.4%), spleen (~ 0.2%), thymus (~ 0.4%), and serum (ranging from ~ 1% to ~ 0.1%) (Fig. 5.D). 
Interestingly, the Neu5Gc-carrying di-sialyl Lewis C N-glycans exhibited a diminished structural complexity when compared to the Neu5Ac-capped variants found in the brain (Fig. 5.). In contrast to their Neu5Ac-capped homologs, N-glycan compositions containing Neu5Gc-based di-sialyl Lewis C were almost identical across all relevant tissues, comprising essentially two biantennary N-glycans (i.e. Hex5HexNAc4Neu5Gc3 and Hex5HexNAc4Fuc1Neu5Gc3).
Further dissecting the structural complexity of sialylated N-glycans, we also investigated the relative abundance of O-acetylated neuraminic acids across tissues. O-Acetylation of sialic acids is correlated with the circulatory half-life of glycoproteins in the human serum and can be crucial for their biological activities72. Most importantly, O-acetylated sialic acids are critical entry receptors to many respiratory viruses, including Influenza C virus, human coronavirus OC43 and the murine coronavirus73. Murine coronaviruses often spread to the liver, an organ tropism that has been suggested to be partially explained by the expression pattern of Ac-Neu5Ac and Ac-Neu5Gc in both lung and liver74. Quantifying the signals of all N-glycan precursors with O-acetylated neuraminic acids (i.e. Ac-Neu5Ac, fragment ion mass = 334.1 amu and/or Ac-Neu5Gc, fragment ion mass = 350.1 amu) allowed us to confirm expression of these important modifications almost exclusively in five tissues, namely lung (~ 4.5%), heart (~ 2%), kidney (~ 1.5%), and, at very low levels, spleen (< 0.1%) (Fig. 5.C). The fifth tissue, liver, gave ambiguous results, as liver 1 showed very high, and liver 2 lower expression of acetylated sialic acids (~ 3.5% and ~ 0.2%, respectively). The respective compositions in the lung are all partially core-fucosylated, bi-antennary N-glycans with one, two, or three Neu5Gc- or Neu5Ac-residues, of which one sialic acid residue was acetylated (e.g. Hex5HexNAc4Fuc1Neu5Gc1Ac1). So far, CASD1 is the only mammalian enzyme known to catalyze the acetylation of sialic acid resulting in the formation of Neu5,9Ac275. In mouse, like human, CASD1 appears to be widely expressed among most organs42. Our findings suggest additional unknown regulatory mechanisms that restrict expression of O-acetylated neuraminic acids to specific organs. Furthermore, we predominantly observed O-acetylation on Neu5Gc and only to a low degree on Neu5Ac residues (Supplementary Fig. 7.H and Supplementary Fig. 7.I).
This suggests important differences in the biosynthesis, stability, or incorporation of the two O-acetylated sialic acid variants into N-glycans. As it is unclear whether CASD1 also catalyzes the O-acetylation of Neu5Gc, this warrants further investigations into the substrate specificities of CASD1 and the identification of additional O-acetyltransferases76.
Profiling the isomeric structural complexity of the murine N-glycome.
In addition to compositional variation, N-glycans exhibit a unique level of micro-heterogeneity that derives from structural, positional, and anomeric isomers. The number of unique N-glycan structures that can be constructed from a given monosaccharide composition (hence of identical molecular mass, i.e. isobaric) adds a critical layer of complexity to the N-glycome9. Importantly, the co-existence of multiple isobaric N-glycan isomers within a single tissue cannot be captured by MS (or even MS/MS) alone and can only be resolved by either chromatographic or ion-mobility-based separation techniques. To generate isomer-sensitive information, all samples in this study were analyzed using a highly isomer-selective stationary phase (PGC-LC). Previously established N-glycan retention libraries combined with MS/MS data were used to identify the exact structures of the respective isomers9,26,27,77.
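The following Python sketch illustrates why isomers are isobaric: the mass of a reduced N-glycan is fully determined by its monosaccharide composition, so linkage or branching information never enters the calculation. The residue masses are standard monoisotopic values rather than numbers taken from this study, and the function names and the composition parser are assumptions made for illustration.

```python
import re
from typing import Dict

# Standard monoisotopic residue masses in Da (not taken from the paper).
RESIDUE_MASS: Dict[str, float] = {
    "Hex": 162.05282,
    "HexNAc": 203.07937,
    "Fuc": 146.05791,
    "Neu5Ac": 291.09542,
    "Neu5Gc": 307.09033,
    "Ac": 42.01056,  # O-acetyl substituent
}
WATER = 18.01056
REDUCTION = 2.01565  # H2 added when the reducing end is reduced

# Longer names come first so that "HexNAc" is not read as "Hex".
TOKEN = re.compile(r"(HexNAc|Hex|Fuc|Neu5Ac|Neu5Gc|Ac)(\d+)")


def parse_composition(text: str) -> Dict[str, int]:
    """Parse a string such as 'Hex5HexNAc4Fuc1Neu5Ac2' into residue counts."""
    counts: Dict[str, int] = {}
    pos = 0
    for match in TOKEN.finditer(text):
        if match.start() != pos:  # unparsed characters between tokens
            raise ValueError(f"cannot parse composition: {text!r}")
        counts[match.group(1)] = counts.get(match.group(1), 0) + int(match.group(2))
        pos = match.end()
    if pos != len(text) or not counts:
        raise ValueError(f"cannot parse composition: {text!r}")
    return counts


def reduced_mass(text: str) -> float:
    """Neutral monoisotopic mass of the reduced glycan with this composition."""
    counts = parse_composition(text)
    residues = sum(RESIDUE_MASS[name] * n for name, n in counts.items())
    return residues + WATER + REDUCTION
```

Because only counts are used, every structural isomer of, for example, Hex5HexNAc4Fuc1Neu5Ac2 yields the same value from `reduced_mass`, which is why a separation dimension such as PGC-LC is needed to tell them apart.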
Expanding our N-glycome analyses by the integration of chromatographic information revealed the staggering structural complexity of the mouse N-glycome and provided insight into vital, organ-intrinsic regulation of glycobiological pathways. To showcase the granularity of our N-glycan isomer-sensitive dataset, we first compared, across tissues, the retention times of a single precursor mass that comprises all doubly Neu5Ac-sialylated, core-fucosylated, biantennary, complex-type N-glycan structures of composition Hex5HexNAc4Fuc1Neu5Ac2 (Fig. 6.A). As sialic acids are usually found in either α2,3- or α2,6-linkage to terminal galactose residues of N-glycans, up to four different structural isomers with unique retention times are expected for this single glycan composition: both antennae carrying α2,3-linked Neu5Ac, both antennae carrying α2,6-linked Neu5Ac, and the two mixed arrangements in which one antenna carries α2,6-linked and the other α2,3-linked Neu5Ac. To compensate for experimental chromatographic elution-time shifts between samples, all retention times within a given analytical run were normalized to that of the consistently detected Man5 N-glycan structure. Elution profiles of N-glycan structures of composition Hex5HexNAc4Fuc1Neu5Ac2 showed that almost all tissues were dominated by α2,3-linked Neu5Ac (Fig. 6.A, Supplementary Fig. 9.D). Interestingly, however, brain, lung, and testis presented a balanced ratio of α2,3- and α2,6-linked Neu5Ac isomers. Moreover, the brain displayed a notable presence of distinct N-glycan structural isomers due to the occurrence of branching Neu5Ac and/or antennary fucose. Remarkably, these specific glycan structures were exclusive to brain tissue and were not identified in any other mouse organ.
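Two steps of this paragraph lend themselves to a short Python sketch: counting the expected sialic acid linkage isomers of a disialylated biantennary glycan, and normalizing retention times to Man5 within a run. The text does not state whether normalization was done as a ratio or as a difference; the ratio used below, like the function and key names, is an assumption made for illustration.

```python
from itertools import product
from typing import Dict, List

ARMS = ("α1,3-arm", "α1,6-arm")
LINKAGES = ("α2,3", "α2,6")


def sialic_linkage_isomers() -> List[Dict[str, str]]:
    """Enumerate every assignment of a sialic acid linkage to each antenna."""
    return [dict(zip(ARMS, combo)) for combo in product(LINKAGES, repeat=len(ARMS))]


def normalize_to_reference(
    retention_times: Dict[str, float], reference: str = "Man5"
) -> Dict[str, float]:
    """Express each retention time relative to the reference peak of the same run.

    Assumption: normalization is a simple ratio (RT / RT_reference).
    """
    ref_rt = retention_times[reference]
    if ref_rt <= 0:
        raise ValueError("reference retention time must be positive")
    return {name: rt / ref_rt for name, rt in retention_times.items()}
```

`sialic_linkage_isomers()` returns four entries because the two mixed arrangements differ in which arm carries the α2,6-linked residue. After normalization, Man5 sits at 1.0 in every run, so peaks can be compared across tissues despite run-to-run drift.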
Next, we compared the elution profile of the corresponding Neu5Gc-sialylated N-glycan structures (i.e. Hex5HexNAc4Fuc1Neu5Gc2) across all tissues. In stark contrast to Neu5Ac-sialylated structures, Neu5Gc-sialylated structures were predominantly found in α2,6-linkage across tissues, except for skin (i.e. ratio of α2,3- to α2,6-linked Neu5Gc approx. 50%) (Fig. 6.A, Supplementary Fig. 9.E). This entirely different construction of the Neu5Gc N-sialome compared to its Neu5Ac-terminated counterpart raised the question of the origin of these differently sialylated structures.
The elution profile of the mixed Neu5Ac/Neu5Gc composition (Hex5HexNAc4Fuc1Neu5Ac1Neu5Gc1) closely resembled the elution profile of the Neu5Ac/Neu5Ac-sialylated structures in all tissues (Fig. 6.A), suggesting a shared origin for these structures. By contrast, the structural profile of entirely Neu5Gc-sialylated structures (i.e. Hex5HexNAc4Fuc1Neu5Gc2) deviated markedly from the Neu5Ac/Neu5Gc and Neu5Ac/Neu5Ac patterns. Notably, the elution profile of Neu5Gc/Neu5Gc structures exhibited minimal variation across tissues and closely resembled the serum elution profile, suggesting that these structures were actually derived from (contaminant) serum glycoproteins.
PGC-LC also allowed us to discriminate pivotal positional differences in distal fucosylation (i.e. α1,2-linked to galactose or α1,3-linked to GlcNAc residues) that give rise to the important, glycan-associated immune determinants bgH and Lewis X. While Lewis X determinants are mainly synthesized by FUT4 or FUT9, bgH-epitopes are synthesized by FUT1 or FUT278. To discern these critical fucose structures, we compared the retention time profiles of a multiply fucosylated precursor mass of the composition Hex5HexNAc4Fuc3 across all relevant samples (Fig. 6.B). The N-glycan structures that may be deduced from this single composition comprise asialo, core-fucosylated, biantennary, complex-type N-glycan structures with two distal fucoses, either in bgH- or Lewis X-related linkage. Based on our normalized elution profile data, we found unexpected differences in antennary fucose between brain, kidney, seminal vesicle, and pancreas (Fig. 6.B). While brain and kidney were found to contain almost exclusively the Lewis X-containing isomer79, pancreas exhibited only the bgH-epitope-containing isomer, which correlates with the high expression of FUT1 in the pancreas80. Seminal vesicle, however, exhibited signals corresponding to both the Lewis X- and the bgH-epitope-containing isomers. This finding aligns well with the occurrence of the Lewis Y-epitope, as described above, and the high expression levels of FUT1, FUT2, and FUT4 in seminal vesicle40. Interestingly, a major carrier of Lewis X and Lewis Y in human seminal plasma has been linked to glycodelin isoform S (GdS)81. As Lewis X and Y are known ligands of the immune receptor DC-SIGN82 and glycodelins are potent immunosuppressors, they have been suggested to be important for feto-embryonic defense83 against adverse immune reactions in the early stages of gestation.
Chromatography also provided stratification of arm/branch isomers of tri-antennary, complex-type structures. Complex-type N-glycans are characterized by the substitution of both (i.e. α1,3- and α1,6-) terminal core-mannoses with GlcNAc residues by Mgat1 and Mgat2, respectively. Additional branching of such complex-type structures results from secondary GlcNAc substitutions on the (i.e. α1,3- and α1,6-) terminal core-mannoses by Mgat4 and Mgat5, respectively, or from the installation of a “bisecting GlcNAc” on the core-central mannose by Mgat3. Our isomer-sensitive analysis thus allowed us to distinguish the catalytic activities of the five different glycosyltransferases involved across all tissues. To explore the occurrence of biologically distinct N-glycan structures of identical composition, we compared across all tissues the retention times of the precursor composition Hex3HexNAc5Fuc1, which could represent either multiple tri-antennary structural isomers or a single biantennary, bisected complex-type structure (Fig. 6.C, Supplementary Fig. 9.C). The respective elution profiles indicated three dominant peaks. Intriguingly, brain, kidney, colon, ileum, duodenum, jejunum, and serum were strongly dominated by biantennary, bisected structure isomers. While this confirmed the results of our MS/MS-based profiling, it also suggests a high occurrence of bisecting GlcNAc in brain, kidney, and colon, corroborated by Mgat3 expression data84.
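The branching logic described here can be sketched in Python by enumerating which of the optional GlcNAc-transferases could account for a given HexNAc count on an agalactosylated complex-type core. This is a purely combinatorial illustration using hypothetical names: it ignores any biosynthetic ordering or mutual inhibition between the enzymes and is not how structures were assigned in this study.

```python
from itertools import combinations
from typing import Dict, List

# Biantennary complex-type base: 2 core GlcNAc + one GlcNAc each from Mgat1 and Mgat2.
BASE_HEXNAC = 4

# Optional transferases named in the text; each adds exactly one GlcNAc.
OPTIONAL_TRANSFERASES: Dict[str, str] = {
    "Mgat3": "bisecting GlcNAc on the core-central mannose",
    "Mgat4": "second GlcNAc on the α1,3-linked core-mannose",
    "Mgat5": "second GlcNAc on the α1,6-linked core-mannose",
}


def candidate_structures(hexnac: int) -> List[Dict[str, object]]:
    """List enzyme combinations that yield the given HexNAc count."""
    n_extra = hexnac - BASE_HEXNAC
    if n_extra < 0 or n_extra > len(OPTIONAL_TRANSFERASES):
        return []
    results: List[Dict[str, object]] = []
    for enzymes in combinations(sorted(OPTIONAL_TRANSFERASES), n_extra):
        # Only Mgat4 and Mgat5 create an additional antenna; Mgat3 does not.
        antennae = 2 + ("Mgat4" in enzymes) + ("Mgat5" in enzymes)
        results.append(
            {
                "enzymes": enzymes,
                "antennae": antennae,
                "bisected": "Mgat3" in enzymes,
            }
        )
    return results
```

For the HexNAc5 composition discussed above, `candidate_structures(5)` returns three candidates: one bisected biantennary structure (Mgat3) and two tri-antennary arm isomers (Mgat4 or Mgat5), consistent with the alternatives listed for Hex3HexNAc5Fuc1.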
Apart from oligomannose-type, N-glycans are often classified into hybrid-type or complex-type structures. Hybrid-type N-glycans result from the incomplete action of alpha-mannosidase II, giving rise to unsubstituted mannose residues on the α1,6-arm and a potentially substituted GlcNAc residue linked to the α1,3-mannose residue1. Interestingly, many N-glycome studies infer N-glycan classification from composition alone. We found that this approach greatly oversimplifies the experimental data presented here. For example, the composition Hex5HexNAc3Fuc1, which is often considered an “archetypical” hybrid-type N-glycan structure, separated into several chromatographic peaks across tissues (Supplementary Fig. 9.A). Although most tissues exhibited highly similar elution profiles, pancreas, seminal vesicle, thymus, and spleen showed distinct dominant structures, which eluted much later. Manual inspection of the associated MS/MS spectra indicated a core-fucosylated, truncated N-glycan structure with an alpha-galactosylated antenna. This was further corroborated by the high levels of alpha-galactose in pancreas, seminal vesicle, thymus, and spleen observed in our initial MS/MS-based N-glycome profiling, and implied elevated levels of hexosaminidase in these tissues.
The classification of N-glycan structures based solely on composition is further complicated by the incorporation of bisecting GlcNAc into hybrid-type N-glycans. Many hybrid-type structures with bisecting GlcNAc have counterparts in the class of complex-type N-glycans, adding a further layer of complexity to their delineation based on composition (e.g. Hex5HexNAc4Fuc1, Hex6HexNAc4Fuc2, Hex5HexNAc4, Hex6HexNAc4). While the classification of the composition Hex5HexNAc4Fuc1 as complex-type was true for most tissues we analyzed, we found that brain exhibited at least two hybrid-type structures within this composition9 (Fig. 6.D, Supplementary Fig. 9.A). Similarly, while the composition Hex6HexNAc4Fuc1 consisted of two hybrid-type structures containing bisecting GlcNAc in the brain, it represented alpha-galactosylated complex-type structures when present in other tissues (Supplementary Fig. 10.A). The same was found for the composition Hex6HexNAc4Fuc2, which consisted of hybrid-type structures with a bisecting GlcNAc in the brain, yet of complex-type structures when present in other tissues (Supplementary Fig. 10.B). These findings suggest that hybrid-type structures with bisecting GlcNAc are yet another highly distinctive feature of the brain N-glycome, and they underscore the merits of isomer-sensitive N-glycome analyses in uncovering unexpected structural signatures.
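The ambiguity of composition-based classification can be made concrete with a small Python sketch that derives compositions from simplified structural descriptions and groups them. The four example structures are illustrative topologies chosen to match the compositions discussed above; they are assumptions, not the exact isomers assigned in this study, and the class and field names are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class Structure:
    name: str
    glycan_class: str      # "complex" or "hybrid"
    extra_mannose: int     # mannoses beyond the trimannosyl core
    antennary_glcnac: int  # GlcNAc residues that start an antenna
    bisecting: bool        # bisecting GlcNAc present
    galactose: int         # beta- plus alpha-linked galactose
    core_fucose: bool

    def composition(self) -> str:
        """Collapse the structure into a composition string."""
        hexose = 3 + self.extra_mannose + self.galactose
        hexnac = 2 + self.antennary_glcnac + int(self.bisecting)
        fucose = "Fuc1" if self.core_fucose else ""
        return f"Hex{hexose}HexNAc{hexnac}{fucose}"


# Illustrative structures only (assumed topologies).
EXAMPLES: List[Structure] = [
    Structure("biantennary, digalactosylated", "complex", 0, 2, False, 2, True),
    Structure("bisected hybrid, one extra Man, one Gal", "hybrid", 1, 1, True, 1, True),
    Structure("biantennary with one alpha-Gal", "complex", 0, 2, False, 3, True),
    Structure("bisected hybrid, two extra Man, one Gal", "hybrid", 2, 1, True, 1, True),
]


def classes_by_composition(structures: List[Structure]) -> Dict[str, Set[str]]:
    """Map each composition to the set of glycan classes that share it."""
    grouped: Dict[str, Set[str]] = defaultdict(set)
    for s in structures:
        grouped[s.composition()].add(s.glycan_class)
    return dict(grouped)
```

With these examples, both Hex5HexNAc4Fuc1 and Hex6HexNAc4Fuc1 map to the set containing "complex" and "hybrid", so a composition alone cannot decide the class; chromatographic separation and MS/MS evidence are needed for that.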