Differentially regulated proteins in Long COVID patients
A principal component analysis (PCA) of all quantitative protein data from all samples detected sufficient differences to segregate all LC patients from the HC group along principal component 3 (Fig. 1A). Using the average of all peptide intensities of each protein for quantification, 162 of the 3131 proteins in the spectral library were identified as differentially regulated in the LC samples, based on a minimum 1.5-fold change (log2(FC) ≤ −0.58 or ≥ 0.58) and a significance threshold of p ≤ 0.05 (−log10(p-value) ≥ 1.3). There were 79 down-regulated proteins (green) and 83 up-regulated proteins (red), as illustrated in the volcano plot (Fig. 1B). All significantly differentially regulated proteins are listed in Supplementary Table S1.
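The selection criteria above can be illustrated with a minimal Python sketch. It assumes each protein has already been summarised as a fold change (LC/HC) and a p-value; the function name `classify` and the constants are illustrative and are not taken from the study's analysis pipeline. The example values are the GLS and TST rows of Table 3.

```python
import math

FC_CUTOFF = 1.5   # minimum fold change, i.e. |log2(FC)| >= ~0.58
P_CUTOFF = 0.05   # significance threshold, i.e. -log10(p) >= ~1.3


def classify(fold_change, p_value):
    """Return 'up', 'down' or 'unchanged' for one protein (illustrative helper)."""
    if p_value > P_CUTOFF:
        return "unchanged"
    log2_fc = math.log2(fold_change)
    limit = math.log2(FC_CUTOFF)
    if log2_fc >= limit:
        return "up"
    if log2_fc <= -limit:
        return "down"
    return "unchanged"


print(classify(1.53, 0.028))  # GLS in Table 3 -> up
print(classify(0.64, 0.010))  # TST in Table 3 -> down
```

On a volcano plot these two tests correspond to the vertical cut-offs at log2(FC) = ±0.58 and the horizontal cut-off at −log10(p) = 1.3.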
A STRING (v12) functional network analysis 38 of the 162 differentially regulated proteins highlighted enrichment of proteins associated with a number of Gene Ontology (GO) terms (see Table 1 and Supplementary Table S2). Markov Cluster Algorithm (MCL) analysis 39 with an inflation parameter of 1.5 identified three major clusters containing 10 or more proteins within the full network (Fig. 2A and Supplementary Table S3). One cluster (Fig. 2B), containing 36 proteins, is broadly associated with immune response. The other two are associated with gene expression: one with 15 proteins mostly linked to RNA splicing (spliceosome, Fig. 2C) and one with 13 proteins mostly linked to RNA Polymerase II transcription (Fig. 2D). All clusters are listed in Supplementary Table S3.
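For readers unfamiliar with MCL, the following is a minimal pure-Python sketch of the expansion and inflation steps, using the inflation parameter of 1.5 mentioned above. It is not STRING's implementation: it assumes a small, unweighted, symmetric adjacency matrix, and the function names, iteration limit and tolerances are illustrative choices.

```python
def normalize_columns(m):
    """Scale every column of a square matrix so that it sums to 1 (in place)."""
    n = len(m)
    for j in range(n):
        total = sum(m[i][j] for i in range(n))
        for i in range(n):
            m[i][j] /= total
    return m


def mcl(adjacency, inflation=1.5, max_iter=100, tol=1e-9):
    """Cluster an undirected graph given as a 0/1 adjacency matrix."""
    n = len(adjacency)
    # Add self-loops, then convert to a column-stochastic matrix.
    m = [[float(adjacency[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    normalize_columns(m)
    for _ in range(max_iter):
        # Expansion: one matrix squaring (two-step random walks).
        expanded = [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
                    for i in range(n)]
        # Inflation: element-wise power, then re-normalise the columns.
        inflated = normalize_columns([[v ** inflation for v in row]
                                      for row in expanded])
        delta = max(abs(inflated[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = inflated
        if delta < tol:
            break
    # Each row that still carries flow lists the members of one cluster.
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


# Toy graph: two separate triangles (nodes 0-2 and 3-5).
toy = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
print(mcl(toy))  # [[0, 1, 2], [3, 4, 5]]
```

A higher inflation value sharpens the flow more aggressively and tends to produce more, smaller clusters; 1.5 is towards the coarse end of the commonly used range.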
The full network of the 162 differentially regulated proteins (Fig. 2A) comprised 16 protein nodes with first level connections to ten or more differentially regulated proteins. These 16 protein nodes, with their numbers of first level interactions shown in brackets, are B2M (16), BCL2 (20), CD4 (22), GFM1 (14), HLA-DRB1 (10), HNRNPM (10), LCK (12), NPM1 (16), NRAS (12), PLCG2 (10), PSMB9 (12), SNRPB (10), SNRPD1 (12), SYK (14), TLR2 (10) and UBQLN2 (11) (see Supplementary Table S1 for protein names). This demonstrates a high level of functional association and intra-connectivity (betweenness centrality) among the regulated proteins. The enrichment analysis in Table 1 highlights a selection of the Reactome and KEGG pathways that suggest the involvement of subsets of these proteins in immune system functions. Among these, 'DAP12 interactions' refers to DAP12, a DNAX-activating protein of 12 kDa that acts as a signaling adapter protein expressed in Natural Killer (NK) cells and myeloid cells participating in innate immune responses 40. Terms related to responses to SARS-CoV-2, Epstein-Barr virus and HIV infections were prominent.
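Counting first level connections amounts to computing each node's degree in the interaction network. The sketch below shows one way to do this from an edge list. The edge list and node names are toy placeholders, not STRING output, and `hub_nodes` is an illustrative helper.

```python
from collections import defaultdict


def hub_nodes(edges, min_degree=10):
    """Return {node: degree} for nodes with at least `min_degree` distinct neighbours."""
    neighbours = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        neighbours[a].add(b)
        neighbours[b].add(a)
    return {node: len(adj) for node, adj in neighbours.items()
            if len(adj) >= min_degree}


# Toy network: one hub linked to 11 partners, plus one partner-partner edge.
toy_edges = [("HUB", f"P{i}") for i in range(11)] + [("P0", "P1")]
print(hub_nodes(toy_edges))  # {'HUB': 11}
```

Using sets of neighbours means that duplicate edges, or the same edge listed in both directions, are counted only once.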
Table 1
Reactome and KEGG pathways most significantly represented by protein nodes with a high level of connectivity (10 or more first level edges to other nodes). Abbreviations: OGC - observed gene count, BGC - total background gene count in that category, FDR - false discovery rate. Strength is the enrichment effect size reported by STRING, log10(observed/expected), where 'expected' is the number of proteins annotated with the term in a random network of the same size.
Reactome ID | Term description | OGC | BGC | strength | FDR |
HSA-168256 | Immune System | 37 | 1979 | 0.36 | 4.50E-04 |
HSA-168249 | Innate Immune System | 27 | 1041 | 0.50 | 6.81E-05 |
HSA-1280215 | Cytokine Signaling in Immune system | 19 | 706 | 0.51 | 1.60E-03 |
HSA-2172127 | DAP12 interactions | 7 | 39 | 1.34 | 6.37E-05 |
HSA-5663205 | Infectious disease | 31 | 917 | 0.61 | 6.86E-08 |
HSA-9679506 | SARS-CoV Infections | 17 | 411 | 0.70 | 6.37E-05 |
KEGG ID | | | | | |
hsa04650 | Natural killer cell mediated cytotoxicity | 8 | 120 | 0.91 | 1.70E-03 |
hsa04664 | Fc epsilon RI signaling pathway | 6 | 65 | 1.05 | 2.80E-03 |
hsa05169 | Epstein-Barr virus infection | 9 | 192 | 0.76 | 3.50E-03 |
hsa05170 | Human immunodeficiency virus 1 infection | 12 | 203 | 0.86 | 6.93E-05 |
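As a rough check on the strength column, STRING describes enrichment strength as log10(observed/expected). The sketch below recomputes it for the 'DAP12 interactions' row. Two inputs are assumptions that the text does not state: that all 162 proteins were mapped into the network, and that the background is about 19,600 human proteins. The function name is illustrative.

```python
import math


def enrichment_strength(observed, network_size, background_count, genome_size):
    """log10 of observed over expected annotated proteins for one term."""
    # Expected count if the network were a random sample of the genome.
    expected = network_size * background_count / genome_size
    return math.log10(observed / expected)


# 'DAP12 interactions': OGC = 7, BGC = 39 (Table 1).
# network_size=162 and genome_size=19600 are assumptions, see above.
print(round(enrichment_strength(7, 162, 39, 19600), 2))  # 1.34
```

Other rows agree to within about 0.01 under the same assumptions. The exact values depend on the background size that STRING actually used.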
Table 1 indicates that changes in immune system-related proteins feature significantly in the data from LC patients. The broad Reactome category 'Immune System' (HSA-168256) was significantly enriched, with 37 (23%) of the differentially regulated proteins (strength 0.36, FDR 0.00045). The more specific term 'Cytokine Signaling in Immune system' (Reactome HSA-1280215) included 19 proteins (strength 0.51, FDR 0.0016). Nine of these, CD4, BCL2, B2M, HLA-DRB1, LCK, NRAS, PSMB9, PLCG2 and SYK, are first level interactors for ≥ 10 other differentially regulated proteins, while TIMP1 (6), KPNA3 (4), NUP93 (4), PSMF1 (1), UBA3 (3), HLA-E (9), TRIM22 (2), MAPK9 (5), HLA-B (5) and IRF5 (4) also interact with multiple proteins (numbers of interacting proteins shown in brackets), giving weight to the association of immune system dysregulation with LC.
The 37 differentially regulated proteins in the 'Immune System' category are: AP2S1, ATP6V1G1, B2M, BCL2, CD300LF, CD4, DOK3, DYNC1I2, FGR, GMFG, HLA-B, HLA-DRB1, HLA-E, IRF5, KPNA3, LCK, MAPK9, MNDA, NDUFC2, NRAS, NUP93, PAFAH1B2, PLCG2, POLR2H, PSMB9, PSMF1, PTGES2, RAB6A, RAF1, RIPK1, SERPINB10, SYK, TIMP1, TLR2, TRIM22, TRIP12 and UBA3.
Interestingly, 17 proteins (out of a possible 411 in the category) were enriched in the Reactome category 'SARS-CoV Infections' (HSA-9679506; strength 0.70, FDR 6.37E-05; Table 1): RIPK1, TLR2, MAN2A1, CSNK1A1, NPM1, SNRPD1, NUP93, SYK, HLA-E, RBBP7, HLA-B, SNRPB, CHMP2A, AP2S1, PTGES3, PLCG2 and B2M. The altered regulation of some of the 'Immune System' proteins discussed above may have originated as a response to the initial infection, in this case with SARS-CoV-2, and then persisted with the onset of LC.
To identify specific pathways potentially involved in LC, a search was made for any GO terms in which at least 25% of the proteins in the total background gene count (BGC) were differentially regulated in the LC dataset. More highly specialised GO terms describe specific molecular functions that involve fewer proteins, i.e. these terms have a lower BGC. All GO terms for which the observed gene count (OGC) was ≥ 25% of the BGC are included in Table 2, along with the relevant protein symbols.
Table 2
Functions associated with differentially regulated proteins related to specific immune cell functions. After GO term enrichment analysis, terms where the OGC was ≥ 25% of the BGC were identified. Abbreviations: OGC - observed gene count, BGC - background gene count.
Category | Term ID | Term description | OGC | BGC | Protein symbols |
GO Process | GO:0002477 | Antigen processing and presentation of exogenous peptide antigen via MHC class Ib | 2 | 4 | HLA-E,B2M |
InterPro | IPR010579 | MHC class I, alpha chain, C-terminal | 2 | 4 | HLA-E,HLA-B |
COMPARTMENTS | GOCC:0043384 | pre-T cell receptor complex | 2 | 5 | CD4,LCK |
GO Process | GO:0042270 | Protection from NKC mediated cytotoxicity | 2 | 6 | HLA-E,HLA-B |
Reactome | HSA-9706374 | FLT3 signaling through SRC family kinases | 2 | 6 | SYK,LCK |
DISEASES | DOID:12894 | Sjogrens syndrome | 2 | 7 | HLA-DRB1,IRF5 |
Reactome | HSA-9637628 | Modulation by Mtb of host immune system | 2 | 7 | TLR2,B2M |
COMPARTMENTS | GOCC:0042612 | MHC class I protein complex | 3 | 8 | HLA-E,HLA-B,B2M |
DISEASES | DOID:3275 | Thymoma | 2 | 8 | CD4,CD5 |
GO Function | GO:0042609 | CD4 receptor binding | 2 | 8 | HLA-DRB1,LCK |
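The 25% coverage filter used to build Table 2 can be sketched as follows. The rows are taken from Tables 1 and 2. The dictionary layout and the function name `highly_covered_terms` are illustrative assumptions about how an enrichment export might be represented.

```python
def highly_covered_terms(terms, min_fraction=0.25):
    """Keep the IDs of terms whose OGC is at least `min_fraction` of the BGC."""
    return [t["id"] for t in terms if t["ogc"] / t["bgc"] >= min_fraction]


rows = [
    {"id": "HSA-168256", "ogc": 37, "bgc": 1979},  # Immune System (Table 1)
    {"id": "HSA-2172127", "ogc": 7, "bgc": 39},    # DAP12 interactions (Table 1)
    {"id": "GOCC:0042612", "ogc": 3, "bgc": 8},    # MHC class I protein complex
    {"id": "GO:0042609", "ogc": 2, "bgc": 8},      # CD4 receptor binding
]
print(highly_covered_terms(rows))  # ['GOCC:0042612', 'GO:0042609']
```

Because the filter is a ratio, it favours small, specialised terms: a broad category such as 'Immune System' is strongly enriched but covers under 2% of its background.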
Comparison with an earlier study of ME/CFS patients
We reported a similar proteome study in 2020 comparing a well characterized group of ME/CFS patients with age/sex matched healthy controls 29. That patient cohort had been affected by ME/CFS for 16 years on average, whereas each member of the LC cohort had been affected for only 1 year. In the ME/CFS study there were 346 differentially regulated proteins, compared with the 162 proteins identified in the current LC study using the same selection criteria (minimum 1.5-fold change and p-value ≤ 0.05) (Supplementary Table S4).
An MCL cluster analysis of the regulated proteins detected in the previous study on ME/CFS patients, using the most recently updated STRING functional network analysis (version 12) for consistency, revealed five clusters with 12 or more proteins (Fig. 3). Two clusters were broadly associated with the immune system: antigen presentation and cytokine signalling (63 proteins, Fig. 3B), and immune system process, covering platelet activation, signalling and aggregation (20 proteins, Fig. 3C). One cluster was associated with gene expression and metabolism: translation, RNA metabolism, protein metabolism and cellular response to stress (119 proteins, Fig. 3D). Two smaller clusters were associated with the mitochondria, specifically oxidative phosphorylation (13 proteins, Fig. 3E), and with vesicle-mediated transport (12 proteins, Fig. 3F).
Annotating each protein with functional information (gene ontology terms) allowed enrichment analysis to identify both common and distinct functional categories or pathways enriched in the two datasets. A comparison of the STRING functional network analysis outputs for the LC data against the ME/CFS dataset (FC ≥ 1.5 with p ≤ 0.05) revealed that 22 Reactome pathways and five KEGG pathways were common to both the LC and ME/CFS enrichment analyses (Supplementary Table S4). Figure 4 shows the Reactome pathways and KEGG pathways that were common to the two datasets, with the categories listed in the box in each case.
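Finding the pathways shared by the two enrichment outputs is a set intersection per annotation source. The sketch below illustrates the idea. The term IDs are placeholders rather than real results, and `shared_pathways` is an illustrative helper.

```python
def shared_pathways(lc, mecfs):
    """Return, per category, the sorted term IDs enriched in both datasets."""
    return {category: sorted(ids & mecfs.get(category, set()))
            for category, ids in lc.items()}


# Placeholder IDs for illustration only.
lc_terms = {"Reactome": {"R-1", "R-2", "R-3"}, "KEGG": {"K-1", "K-2"}}
mecfs_terms = {"Reactome": {"R-2", "R-3", "R-9"}, "KEGG": {"K-2"}}

common = shared_pathways(lc_terms, mecfs_terms)
print(common)  # {'Reactome': ['R-2', 'R-3'], 'KEGG': ['K-2']}
print({category: len(ids) for category, ids in common.items()})
```

Applied to the real exports, the per-category counts would correspond to the numbers of common Reactome and KEGG pathways reported above.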
An important outcome of the original ME/CFS study was the number of differentially regulated mitochondrial proteins involved in general functions, metabolism, electron transport complexes, and the reactive oxygen species stress response. The MCL cluster analysis of the LC differentially regulated proteins in the current study identified a six-protein cluster associated with the mitochondria (Cluster 7, Supplementary Table S3), which was small in contrast to the ME/CFS analysis. However, when the differentially regulated proteins from the LC data were searched against the Human MitoCarta3.0 41 database of human mitochondria-associated proteins, 21 were identified (Table 3).
Table 3
Differentially regulated mitochondrial proteins in the LC data. The human MitoCarta3.0 database was accessed to search for proteins associated with the mitochondria within the LC data set. Abbreviation: FC - fold change.
Peak Name | Symbol | Protein name | FC | p-value | Activity (MitoPathways) |
Mitochondrial Metabolism |
gi|373251164 | GLS | Glutaminase | 1.53 | 0.028 | Glutamate metabolism |
gi|937827788 | LDHB | L-lactate dehydrogenase B chain | 1.57 | 0.011 | Glyoxylate metabolism |
gi|23618867 | SFXN1 | sideroflexin-1 | 1.70 | 0.002 | Serine and Vitamin metabolism |
gi|578829057 | PDPR | pyruvate dehydrogenase (PDH) phosphatase regulatory subunit | 2.85 | 0.013 | Pyruvate metabolism |
gi|767969704 | DLAT | dihydrolipoyllysine-residue acetyltransferase (PDH complex) | 1.64 | 0.042 | Pyruvate metabolism |
gi|32189392 | PRDX2 | peroxiredoxin-2 | 1.70 | 0.024 | ROS and GSH metabolism |
gi|8923001 | ABHD10 | mycophenolic acid acyl-glucuronide esterase | 1.60 | 0.035 | Xenobiotic metabolism |
gi|395394071 | TST | thiosulfate sulfurtransferase | 0.64 | 0.010 | Sulfur metabolism |
gi|13376617 | PTGES2 | prostaglandin E synthase 2 | 1.75 | 0.027 | Eicosanoid metabolism |
gi|15277342 | HSD17B8 | estradiol 17-beta-dehydrogenase 8 | 2.43 | 0.018 | Type II fatty acid synthesis; Cholesterol, bile acid, steroid synthesis |
gi|37594464 | NUDT5 | ADP-sugar pyrophosphatase | 0.64 | 0.028 | Nucleotide synthesis and processing |
Mitochondrial Translation |
gi|38683855 | PTCD3 | pentatricopeptide repeat domain-containing protein 3, mitochondrial precursor | 1.84 | 0.029 | Mitochondrial ribosome |
gi|8923421 | SARS2 | seryl-tRNA synthetase 2, mitochondrial | 2.07 | 0.008 | mt-tRNA synthetases |
gi|46852147 | IARS2 | isoleucyl-tRNA synthetase 2, mitochondrial precursor | 1.51 | 0.040 | mt-tRNA synthetases |
gi|815890954 | GFM1 | elongation factor G, mitochondrial | 0.45 | 0.044 | Translation factors |
Mitochondrial dynamics and surveillance |
gi|767999127 | BCL2 | apoptosis regulator Bcl-2 | 2.20 | 0.044 | Apoptosis |
gi|151108473 | FIS1 | mitochondrial fission 1 protein | 1.54 | 0.025 | Fission |
Oxidative Phosphorylation - Complex I |
gi|7661786 | NDUFAF4 | NADH dehydrogenase [ubiquinone] 1 alpha subcomplex assembly factor 4 | 1.97 | 0.008 | OXPHOS assembly factors |
gi|4758784 | NDUFC2 | NADH dehydrogenase [ubiquinone] 1 subunit C2 | 0.48 | 0.015 | OXPHOS subunits |
gi|7706351 | PTRH2 | peptidyl-tRNA hydrolase 2, mitochondrial | 0.46 | 0.001 | none |
gi|767902514 | CRYZ | quinone oxidoreductase | 1.56 | 0.010 | none |
This indicates that the disturbed mitochondrial functions prominent in the original ME/CFS study were also present in LC patients, though in a less well established state.