3.1. SWATH-MS and principal component analysis
The strategy of our SWATH analysis is shown in Fig. 1. Briefly, proteins were extracted from isolated PBMC and digested with trypsin. A third from each sample was added to a pooled sample, which was then subjected to peptide pre-fractionation using OG-IEF and analyzed by data-dependent acquisition MS to build a spectral library of 3990 proteins identified at a FDR ≤ 1% and a peptide confidence of ≥ 95%. The remainder of each individual sample was analyzed by SWATH-MS. The fragment ion intensities measured in the SWATH-MS analysis were aligned to the spectral library for identification and quantification. A total of 2970 proteins were quantified from triplicate sample measurements.
At the outset, a t-test comparison of the patient group vs the control group was carried out and proteins with significantly different relative abundances of P < 0.01 and Log10(Fold-Change) > 0.2 and < -0.2 identified (Tables 2 and 3). A total of 60 proteins were identified with different relative abundances between the eleven ME/CFS subjects and nine matched controls, with 38 showing increased relative abundance and 22 decreased relative abundance (Table 2 (increased) and Table 3 (decreased)).
Table 2
Proteins with increased relative abundance in ME/CFS compared to healthy controls (P < 0.01, Log10(Fold-Change) > 0.2)
GI Accession | Protein Name | Gene Name | P- value | Log10(Fold -Change) | Fold -Change |
148727341 | serine-threonine kinase receptor-associated protein | STRAP | 0.00019 | 0.21 | 1.61 |
22538467 | proteasome subunit beta type-4 | PSMB4 | 0.00021 | 0.25 | 1.77 |
24371248 | FUN14 domain-containing protein 2 | FUNDC2 | 0.00048 | 0.25 | 1.80 |
41281564 | WD repeat-containing protein 37 | WDR37 | 0.00049 | 0.23 | 1.69 |
145275202 | isoaspartyl peptidase/L-asparaginase | ASRGL1 | 0.00050 | 0.21 | 1.61 |
148539872 | acetyl-CoA acetyltransferase, cytosolic | ACAT2 | 0.00072 | 0.21 | 1.62 |
7705558 | inositol-3-phosphate synthase 1 | ISYNA1 | 0.0011 | 0.24 | 1.75 |
7019419 | nucleolar GTP-binding protein 2 | GNL2 | 0.0013 | 0.38 | 2.38 |
4503165 | cullin-3 | CUL3 | 0.0014 | 0.45 | 2.80 |
44680136 | D-beta-hydroxybutyrate dehydrogenase, mitochondrial | BDH1 | 0.0014 | 0.41 | 2.60 |
4502205 | ADP-ribosylation factor 4 | ARF4 | 0.0019 | 0.27 | 1.86 |
259155315 | mitochondrial 2-oxoglutarate/malate carrier protein | SLC25A11 | 0.0022 | 0.24 | 1.73 |
13129110 | methylosome protein 50 | WDR77 | 0.0023 | 0.24 | 1.75 |
4506923 | SH2 domain-containing protein 1A | SH2D1A | 0.0023 | 0.28 | 1.90 |
6912388 | grancalcin | GCA | 0.0024 | 0.25 | 1.79 |
115270970 | chloride channel CLIC-like protein 1 | CLCC1 | 0.0024 | 0.24 | 1.74 |
5031981 | 26S proteasome non-ATPase regulatory subunit 14 | PSMD14 | 0.0029 | 0.32 | 2.09 |
188219591 | nucleolysin TIA-1 isoform p40 | TIA1 | 0.0030 | 0.27 | 1.85 |
22035672 | thioredoxin reductase 2, mitochondrial | TXNRD2 | 0.0031 | 0.34 | 2.18 |
5454166 | vesicle transport through interaction with t-SNAREs homolog 1B | VTI1B | 0.0033 | 0.29 | 1.93 |
7657116 | glyceraldehyde-3-phosphate dehydrogenase, testis-specific | GAPDHS | 0.0033 | 0.35 | 2.23 |
225543288 | SUMO-activating enzyme subunit | SAE1 | 0.0035 | 0.25 | 1.80 |
47132595 | phosphate carrier protein, mitochondrial | SLC25A3 | 0.0036 | 0.32 | 2.09 |
28373194 | proteasomal ubiquitin receptor ADRM1 | ADRM1 | 0.0040 | 0.43 | 2.69 |
153251272 | calcineurin-like phosphoesterase domain -containing protein 1 | CPPED1 | 0.0041 | 0.21 | 1.61 |
4505023 | proteasome assembly chaperone 1 | PSMG1 | 0.0050 | 0.49 | 3.09 |
300360515 | actin-related protein 2/3 complex subunit 1A | ARPC1A | 0.0051 | 0.23 | 1.71 |
183396804 | regulation of nuclear pre-mRNA domain -containing protein 2 | RPRD2 | 0.0056 | 0.35 | 2.26 |
4885375 | histone H1.2 | H1-2 | 0.0058 | 0.25 | 1.79 |
48762926 | periodic tryptophan protein 2 homolog | PWP2 | 0.0064 | 0.36 | 2.30 |
4885373 | histone H1.1 | H1-1 | 0.0065 | 0.29 | 1.95 |
4885379 | histone H1.4 | H1-4 | 0.0068 | 0.24 | 1.74 |
5032087 | splicing factor 3A subunit 1 isoform 1 | SF3A1 | 0.0069 | 0.20 | 1.59 |
15487670 | nuclear RNA export factor 1 isoform 1 | NXF1 | 0.0073 | 0.21 | 1.61 |
4885377 | histone H1.3 | H1-3 | 0.0073 | 0.26 | 1.83 |
330340389 | up-regulated during skeletal muscle growth protein 5 | USMG5 | 0.0088 | 0.23 | 1.70 |
4506741 | 40S ribosomal protein S7 | RPS7 | 0.0090 | 0.23 | 1.69 |
4885381 | histone H1.5 | H1-5 | 0.0095 | 0.29 | 1.94 |
Table 3
Proteins with decreased relative abundance in ME/CFS compared to healthy controls (P < 0.01, Log10(Fold Change) < -0.2).
GI Accession | Protein Name | Gene Name | P- value | Log10(Fold Change) | Fold Change |
7706495 | dnaJ homolog subfamily B member 11 | DNAJB11 | 0.00032 | -0.24 | 0.57 |
4557367 | bleomycin hydrolase | BLMH | 0.00045 | -0.36 | 0.43 |
34101286 | zinc finger RNA-binding protein | ZFR | 0.00047 | -0.28 | 0.52 |
14149916 | src-like-adapter 2 | SLA2 | 0.00054 | -0.23 | 0.58 |
66933005 | Calnexin | CANX | 0.0012 | -0.24 | 0.58 |
6715607 | hemoglobin subunit gamma-2 | HBG2 | 0.0014 | -0.51 | 0.31 |
5031977 | nicotinamide phosphoribosyltransferase | NAMPT | 0.0014 | -0.31 | 0.49 |
8923541 | UPF0587 protein C1orf123 | C1orf123 | 0.0017 | -0.46 | 0.35 |
10863927 | peptidyl-prolyl cis-trans isomerase A | PPIA | 0.0026 | -0.26 | 0.54 |
143770741 | platelet glycoprotein VI | GP6 | 0.0028 | -0.21 | 0.61 |
19913385 | protein G6b | C6orf25 | 0.0038 | -0.42 | 0.38 |
4506085 | mitogen-activated protein kinase 13 | MAPK13 | 0.0039 | -0.27 | 0.54 |
4504073 | platelet glycoprotein Ib beta chain | GP1BB | 0.0045 | -0.26 | 0.55 |
12545406 | ras GTPase-activating protein 1 | RASA1 | 0.0053 | -0.20 | 0.63 |
4757774 | ADP-ribosylation factor-like protein 3 | ARL3 | 0.0067 | -0.24 | 0.58 |
410173533 | PREDICTED: uncharacterized protein LOC100996504 | ENSG000 263264 | 0.0070 | -0.27 | 0.54 |
29826335 | eukaryotic translation initiation factor 2 subunit 2 | EIF2S2 | 0.0080 | -0.25 | 0.56 |
30181236 | copine-2 | CPNE2 | 0.0087 | -0.31 | 0.49 |
190014603 | TBC1 domain family member 13 | TBC1D13 | 0.0087 | -0.26 | 0.55 |
4504077 | platelet glycoprotein IX | GP9 | 0.0089 | -0.31 | 0.49 |
157738645 | plexin-A4 | PLXNA4 | 0.0092 | -0.38 | 0.42 |
4504351 | hemoglobin subunit delta | HBD | 0.0093 | -0.33 | 0.47 |
The identified proteins were investigated by web-based applications STRING (http://string-db.org, version 11) and the Database for Annotation, Visualization and Integrated Discovery (DAVID) (https://david.ncifcrf.gov/, version 6.8). The increased abundance proteins have 32 functional associations (CI > 0.4) compared to an expected number of 13 random interactions, and a PPI enrichment P = 3.2 × 10− 6 (see supplementary table S1). The decreased relative abundance proteins have 6 functional associations (CI > 0.4) compared to an expected number of 1 random interaction, and a PPI-enrichment P = 0.0016 (see supplementary table S1). Supplementary tables S2 and S3 show the biological processes, molecular functions and cellular components (GO annotations) involving the proteins identified as changed in the ME/CFS group. KEGG and Reactome pathways, UniProt keywords and PFAM, INTERPRO and SMART protein domains implicated by the differently abundant proteins. In summary, the increased abundance proteins in ME/CFS function in histone methylation and DNA Damage/Telomere Stress Induced Senescence, proteasome assembly, NAD, NAD(P)-binding and mitochondrial substrate/solute transport, while decreased abundance proteins were primarily linked to wound healing, platelet activation and adhesion, blood coagulation and oxygen transport roles.
As the analysis was exploratory and, as has been shown in numerous studies of ME/CFS patient cohorts, it was likely there would be subgroups within the cohort, a principal component analysis (PCA) was used to further investigate the proteomes of the patient and control groups and to inform groupings. The identified principal components accounted for as much of the variability of the data as possible (PC1 38.5% of the data and PC2 15.9%), while having an orthogonal relationship. The PCA showed two possible subgroupings of study samples with similar proteome expression patterns, based primarily on their PC2 scores; one subgroup included nine ME/CFS patients (P1, P2, P3, P4, P5, P6, P7, P8, and P9), now called the ‘ME/CFS’ subgroup, while the second subgroup included all nine control participants and two ME/CFS subjects (P10 and P11) (Fig. 2).
The two different ME/CFS subgroups and control PCA clusters observed could not be explained by age, gender or BMI differences. Patients in our study might be separating by the differing severity of their symptoms, resulting in subtly different disease biology [9, 37]. Until our understanding of ME/CFS is more complete and a molecular diagnostic test is available, it is difficult to resolve better the concept of subgroups of ME/CFS.
To assess the ability of the principal components to predict patient or control, we fitted a logistic regression to the first three principal components. We then applied leave-one-out cross-validation. We found the model predicted the disease well, with 3/20 prediction errors (two false positives and one false negative). Figure 2B shows the predicted probabilities for all 20 subjects. We see that only P10, C3 and C5 were misclassified, on the expectation from the results of the PCA (Fig. 2A). A permutation test shows that the probability of achieving a prediction error of 3/20, or more extreme, is 0.04, so it is unlikely that the observed predictive capacity of the principal components has been caused by chance.
The nine ME/CFS patients that had separated from the controls were then separately compared with the nine controls. A student’s t-test between the two groups found now an expanded number of 99 proteins with significantly different abundance levels between this ‘ME/CFS’ group and ‘control’ group (P < 0.01, Log10 (Fold-Change) > 0.2 and < -0.2, increased proteins n = 47; decreased proteins n = 52). A number of identified proteins were involved in mitochondrial functioning, the TCA cycle, oxidative phosphorylation and redox signaling (see below in Table 4).
STRING analysis of the proteins increased (Fig. 3A) and decreased (Fig. 3B in the ME/CFS subgroup complemented the findings from the initial full ME/CFS set. Overall, the functional associations of the differently abundant proteins in the subgroup of ME/CFS patients increased significantly, and the majority of the functional pathways identified when comparing all eleven ME/CFS participants with the control group were further supported and augmented by comparison with this ME/CFS PCA subgroup. Of these, the majority were involved in stress-induced senescence and oxidative stress, mitochondrial functional pathways, immune and inflammatory pathways, and proteasome activation. Additional proteasome-related proteins and mitochondrial proteins involved in the electron transport chain and ATP synthesis emerged as being further enriched. The decreased protein data set was expanded to include MHC class I immune proteins and ribosomal proteins. The 47 increased proteins had 50 functional interactions (CI > 0.4) compared to the expected 23 random interactions, with a PPI enrichment P = 4.28 × 10− 7. Similarly, the 52 decreased proteins (with one protein excluded as the sequence mapped to no known protein/gene) had 22 functional interactions (CI > 0.4) compared to 11 expected random interactions, with a PPI enrichment P = 0.0015.
The identified mitochondrial proteins listed in Table 4 includes both increased and decreased relative abundance mitochondrial proteins with a lower P < 0.05 and Fold-Changes > 1.3 and < 0.75 (shaded). With these relatively less stringent P-value and Fold-Change cut-offs, there were a greater number of citric acid (TCA) cycle proteins and regulation of reactive oxygen species (ROS) proteins identified in the ME/CFS PCA group. We also tested the hypothesis that all identified mitochondrial proteins present among the 2970 proteins under investigation could alone specifically predict the disease with the above regression model (see Fig. 2B). Simulation showed that these 147 mitochondrion proteins were no better on average than a random sample of 147 other proteins.
Table 4
Differently abundant mitochondria-related proteins in the ‘ME/CFS’ PCA group compared to the ‘control’ group (P < 0.01, Log10(Fold-Change) > 0.2) but also including proteins (shaded) with P < 0.05 and Fold-Change > 1.3 and < 0.75.
Increased relative abundance proteins | P-value | Fold-Change |
FUN14 domain-containing protein 2 | 0.00023 | 1.91 |
NADH dehydrogenase [ubiquinone] 1 beta subcomplex subunit 3 | 0.00034 | 1.60 |
D-beta-hydroxybutyrate dehydrogenase | 0.00090 | 2.89 |
inositol-3-phosphate synthase 1 | 0.0025 | 1.82 |
mitochondrial 2-oxoglutarate/malate carrier protein | 0.0042 | 1.78 |
phosphate carrier protein | 0.0060 | 2.22 |
ATP synthase subunit epsilon | 0.0079 | 1.70 |
glyceraldehyde-3-phosphate dehydrogenase, testis-specific | 0.0081 | 2.31 |
thioredoxin domain-containing protein 17 | 0.0065 | 1.58 |
prohibitin | 0.0079 | 1.40 |
selenoprotein H | 0.013 | 1.62 |
endophilin-B1 | 0.014 | 1.46 |
thioredoxin reductase 2 | 0.014 | 2.18 |
up-regulated during skeletal muscle growth protein 5 | 0.014 | 1.79 |
glutamine-dependent NAD(+) synthetase | 0.014 | 1.53 |
poly [ADP-ribose] polymerase 4 | 0.015 | 1.65 |
NADH dehydrogenase [ubiquinone] 1 alpha subcomplex subunit 3 | 0.015 | 1.58 |
NADH dehydrogenase [ubiquinone] 1 beta subcomplex subunit 11 | 0.016 | 1.64 |
ATP synthase subunit delta | 0.017 | 1.78 |
poly [ADP-ribose] polymerase 14 | 0.018 | 1.78 |
single-stranded DNA-binding protein | 0.019 | 1.51 |
ubiquinone biosynthesis protein COQ9 | 0.020 | 1.77 |
NADH dehydrogenase [ubiquinone] 1 alpha subcomplex subunit 12 | 0.020 | 1.76 |
short-chain specific acyl-CoA dehydrogenase | 0.020 | 1.49 |
glycerol-3-phosphate dehydrogenase | 0.021 | 1.40 |
2-oxoglutarate dehydrogenase | 0.025 | 1.35 |
aconitate hydratase | 0.025 | 1.36 |
39S ribosomal protein L43 | 0.027 | 1.55 |
cytochrome c oxidase assembly factor 6 homolog | 0.028 | 1.47 |
biogenesis of lysosome-related organelles complex 1 subunit 1 | 0.030 | 2.04 |
pyruvate dehydrogenase E1 component subunit beta | 0.032 | 1.31 |
BRI3-binding protein | 0.033 | 2.27 |
trifunctional enzyme subunit beta | 0.034 | 1.40 |
ATP synthase subunit b | 0.035 | 1.49 |
putative transferase CAF17 | 0.036 | 1.30 |
glycine–tRNA ligase | 0.037 | 1.47 |
39S ribosomal protein L12, mitochondrial | 0.038 | 1.63 |
peptidyl-prolyl cis-trans isomerase NIMA-interacting 4 isoform 1 | 0.040 | 1.46 |
isocitrate dehydrogenase [NAD] subunit beta, mitochondrial isoform a precursor | 0.042 | 2.17 |
mitochondrial fission 1 protein | 0.047 | 1.48 |
Decreased relative abundance proteins | P-value | Fold-Change |
isocitrate dehydrogenase [NADP], mitochondrial precursor | 0.0001 | 0.59 |
peptidyl-prolyl cis-trans isomerase F, mitochondrial precursor | 0.0005 | 0.38 |
peroxiredoxin-6 | 0.0081 | 0.66 |
thiosulfate sulfurtransferase | 0.0092 | 0.65 |
choline transporter-like protein 1 | 0.011 | 0.36 |
39S ribosomal protein L23, mitochondrial | 0.012 | 0.45 |
mitochondrial import inner membrane translocase subunit Tim8 A isoform 1 | 0.012 | 0.47 |
cytochrome c | 0.013 | 0.56 |
ATP synthase mitochondrial F1 complex assembly factor 1 isoform 2 precursor | 0.013 | 0.60 |
ATP-binding cassette sub-family B member 7, mitochondrial isoform 1 | 0.018 | 0.55 |
dynamin-1-like protein isoform 1 | 0.019 | 0.68 |
stimulator of interferon genes protein | 0.036 | 0.61 |
oxidation resistance protein 1 isoform 4 | 0.037 | 0.69 |
NADH dehydrogenase [ubiquinone] 1 alpha subcomplex subunit 5 | 0.040 | 0.60 |
glycerol kinase isoform b | 0.047 | 0.70 |
CCA tRNA nucleotidyltransferase 1, mitochondrial | 0.048 | 0.57 |
Protein data sets were submitted to DAVID for functional annotation clustering. Analysis included (i) the significantly increased and decreased (P < 0.01, Log10(Fold-Change) > 0.2 and < -0.2) proteins identified from comparison of all eleven ME/CFS participants and the nine matched controls, and (ii) the second protein dataset generated from the PCA selected subgroup of nine ME/CFS subjects compared with controls, all with medium confidence (CI > 0.4 combined score from String functional association analysis) or higher interactions. Even with the small number of proteins identified with different relative abundance in the full set of ME/CFS subjects in comparison to controls (increased n = 38 and decreased n = 22, see Tables 2 and 3 respectively), interesting biological pathways were either enriched (6 functional clusters) or depleted (2 functional clusters) in the ME/CFS group (see supplementary Figures S1 and S2 - increased relative abundance proteins, and S3 -decreased relative abundance proteins). Supporting the STRING analysis, histone methylation and regulation of gene expression was indicated to be an enriched biological implication of the increased protein data set. Also noted, WD repeat containing proteins were enriched, along with NAD and NAD(P)-binding Rossman-like fold proteins, proteins involved in proteasome poly-ubiquination of proteins and the MAPK cascade, inner membrane proteins of the mitochondrion, and proteins of the Golgi apparatus. Proteins decreased in the ME/CFS group were involved in blood coagulation and platelet activation and protein folding.
The network of proteins with medium or higher confidence interactions (CI > 0.4 combined score, Supplementary Table 1) of differentially abundant proteins (n = 99, P < 0.01, Log10(Fold-Change) > 0.2 and < -0.2) after t-test comparison of the ME/CFS and control PCA groups were also analysed by DAVID functional association clustering. The 47 increased proteins clustered into several key categories, specifically; histone DNA binding and nucleosome assembly (enrichment score = 3.25); proteasome function, polyubiquination and the TNF-mediated signalling pathway (enrichment score = 2.16), WD repeat domains (enrichment score = 1.59) the mitochondrion and inner mitochondrion membrane (enrichment score = 1.24), and NAD and NAD(P)-binding domains (enrichment score = 1.22). The 52 decreased proteins also clustered into functional groups including antigen processing and presentation (enrichment score = 2.37); tubulin and the cytoskeleton (enrichment score = 2.25); endoplasmic reticulum (enrichment score = 2.03); and cell-cell adhesion (enrichment score = 0.73).