There is considerable interest in developing disease-related biomarkers and applying them to improve patient care. Despite major investments in this area by granting agencies and commercial organizations, the yield has been disappointingly poor. The handful of cancer biomarkers that are used in the clinic today were discovered more than 40 years ago (9).
Previously, we and others commented on several reasons that contribute to the failure of newly discovered biomarkers and identified pre-analytical, analytical, and post-analytical shortcomings that may affect a biomarker’s performance (10–17). In short, most failures are due to unrecognized biases or differences between the diseased and control clinical samples, the groups and numbers of patients and their clinical information, the analytical method used, and, more recently, the way the data are interpreted (black box approach) (17, 18).
Recently, considerable efforts have been made to find ways to better reproduce published and seemingly promising biomarkers. It has been realized that a large number of manuscripts, even those published in top-rated journals, describe false discoveries (14). The number of retractions of manuscripts published in highly reputable journals is at an all-time high (19, 20).
One way to decrease false discovery is to reproduce the findings, preferably independently of the original investigators. This is not an easy task, since specific reagents and techniques may not be available to the validators. The process is also time-consuming and expensive. We proposed a simpler way to tackle the irreproducibility problem, which we termed “the 5-year reflection” (21). Reproducing at least a fraction of high-impact studies may reveal weaknesses, which can lead to improved outcomes in the future.
In the paper under discussion, Shen et al. tried to identify new biomarkers for glioma by collecting blood samples from the peripheral circulation and from glioma arteries and veins in the vicinity of the tumor. Tissue samples were also collected. To validate their discovery data, they collected samples that were not used in the discovery phase.
In our previous work, we described the identification of glioma biomarkers by a different proteomic technique, the PEA (6, 8). Using these results, we examined which of the proteins that Shen et al. identified as promising biomarkers were included in the PEA panel. Surprisingly, we found no overlap between the candidates in the two studies. Even the best putative biomarker selected by Shen et al. (SERPINA6) showed no difference between gliomas and meningiomas in our patients. Additional analyses showed that SERPINA6 concentration did not differ between tumoral and peritumoral tissue. To explain some of these discrepancies, we suggest that stricter quantitative technologies are likely needed to identify differences between glioma and non-glioma patients in the discovery phase, with emphasis on clinical utility.
Regarding the diagnostic power of SERPINA6, the sensitivity and specificity of the test were both about 88%. This led the authors to conclude that it may be suitable for clinical use. However, even these seemingly high sensitivities and specificities are likely not enough for clinical use, as we exemplified elsewhere (2–4). Moreover, for a biomarker to be promising, it needs to have clinical utility that complements statistical significance. A clinically useful biomarker needs to have either very high specificity with good sensitivity or very high sensitivity with good specificity. Simply put, a significant difference in median or mean value between groups does not confer clinical utility.
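The point about sensitivity and specificity can be illustrated with a short calculation. The following is a minimal sketch, not an analysis from either study: it applies Bayes' rule to a test with 88% sensitivity and 88% specificity and reports the positive and negative predictive values. The function name `predictive_values` and the three prevalence values are assumptions chosen purely for illustration; they are not taken from Shen et al. or from our data.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a binary test, derived from Bayes' rule.

    All three arguments are proportions between 0 and 1.
    """
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# 0.88 is the approximate sensitivity/specificity reported for SERPINA6.
# The prevalence values below are hypothetical, for illustration only.
for prevalence in (0.5, 0.01, 0.001):
    ppv, npv = predictive_values(0.88, 0.88, prevalence)
    print(f"prevalence={prevalence:.3f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

Under these assumed numbers, the positive predictive value equals 0.88 only in a balanced case-control setting (prevalence 0.5). At an assumed prevalence of 1%, it drops to roughly 7%, meaning that more than nine in ten positive results would be false positives.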
We conclude that the biomarkers identified by Shen et al. (1) were not successfully validated with our own data sets from glioma and meningioma patients, and that the characteristics of the test, as published, are not sufficient to warrant any clinical application at present.