Calculation of CIB
The basic principle of the analysis model is to consider consistency in both the case-control study and the cohort study when determining the efficacy of a biomarker. The efficacy of a biomarker is normally described by the Youden index (Yen), which is the sum of the positive rate of the biomarker in the disease group and the negative rate of the biomarker in the control group, minus 1:
Yen = Pd + (1 - Pc) - 1 = Pd - Pc
where Pd and Pc represent the observed frequencies of a biomarker in the disease group and the control group, respectively, from the case-control study.
The consistency in the cohort study (Crc) is the sum of the incidence in the exposed group (the biomarker-positive group) and the healthy rate in the non-exposed group (the biomarker-negative group), minus 1:
Crc = Pe + (1 - Pn) - 1 = Pe - Pn
where Pe and Pn represent the incidence in the exposed group and non-exposed group, respectively, from the cohort study.
We define the comprehensive index of a biomarker (CIB) as the geometric mean of Yen and Crc:
CIB = √(Yen × Crc)
The geometric mean is used because it is pulled toward the smaller of the two values. CIB ranges from 0 to 1, and a larger CIB implies a stronger biomarker.
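As a minimal sketch of the definitions above, the following Python snippet computes Yen, Crc, and their geometric mean from the four observed rates. The function name `cib` and the example rates are hypothetical. Returning 0 when either index is not positive is an assumption made here so that the square root stays defined; the text does not state how such cases are handled.

```python
import math

def cib(pd, pc, pe, pn):
    """Geometric mean of Yen (case-control) and Crc (cohort)."""
    yen = pd - pc  # Yen = Pd - Pc
    crc = pe - pn  # Crc = Pe - Pn
    if yen <= 0 or crc <= 0:
        # Assumption: a non-positive index is treated as no diagnostic power.
        return 0.0
    return math.sqrt(yen * crc)

# Hypothetical rates, for illustration only.
value = cib(pd=0.80, pc=0.20, pe=0.30, pn=0.05)
print(round(value, 3))
```

Because the result is a geometric mean, a biomarker scores well only when both the case-control index and the cohort index are high.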
Evaluation of ROC analysis
Receiver operating characteristic (ROC) analysis is a common method for evaluating the effectiveness of a diagnosis made using a biomarker [1,2,12]. In the present study, CIB was used to assess whether ROC analysis remains applicable.
A simulation model comprising four datasets was established. Four sets of normally distributed random numbers (100 ± 20, n = 5000; 115 ± 20, n = 5000; 125 ± 20, n = 5000; 140 ± 20, n = 5000) were generated using the SPSS statistical software (IBM Corp., Armonk, NY, USA). Model A consisted of the 100 ± 20 and 115 ± 20 datasets, Model B of the 100 ± 20 and 125 ± 20 datasets, and Model C of the 100 ± 20 and 140 ± 20 datasets. ROC analysis was performed as shown in Figure 1.
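The simulation described above can be sketched in Python as a stand-in for the SPSS workflow. The sketch assumes that "100 ± 20" denotes mean ± standard deviation, uses a fixed random seed for repeatability, and computes the area under the ROC curve with the Mann-Whitney rank statistic. The helper names `auc` and `draw` are hypothetical, and none of this is taken from the original analysis.

```python
import bisect
import random
from statistics import NormalDist

def auc(controls, cases):
    """Area under the ROC curve via the Mann-Whitney rank statistic."""
    ordered = sorted(controls)
    total = 0.0
    for x in cases:
        below = bisect.bisect_left(ordered, x)
        ties = bisect.bisect_right(ordered, x) - below
        total += below + 0.5 * ties  # ties count as half
    return total / (len(cases) * len(controls))

rng = random.Random(0)  # fixed seed: an assumption, for repeatability
n, sd = 5000, 20        # assumes "± 20" is the standard deviation

def draw(mean):
    """One simulated dataset of n normally distributed values."""
    return [rng.gauss(mean, sd) for _ in range(n)]

control = draw(100)
models = {"A": draw(115), "B": draw(125), "C": draw(140)}
aucs = {name: auc(control, cases) for name, cases in models.items()}

for name, mean in (("A", 115), ("B", 125), ("C", 140)):
    # Theoretical AUC for two equal-variance normal distributions.
    expected = NormalDist().cdf((mean - 100) / (sd * 2 ** 0.5))
    print(name, round(aucs[name], 3), round(expected, 3))
```

The printed pairs compare the simulated area with its theoretical value, which should increase from Model A to Model C as the two distributions separate.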
Evaluation of sensitivity and specificity
Most studies that attempt to identify biomarkers use a case-control design rather than a cohort design. In case-control studies, the potential relationship between a biomarker and the disease is examined by comparing the frequencies of the biomarker in the diseased and non-diseased (control) groups. With the case-control approach, biomarkers are assessed in already diseased individuals, and the power of a biomarker is typically expressed as the positive rate of the biomarker in the disease group (sensitivity, Sen) and the negative rate of the biomarker in the control group (specificity, Spe) [4]. However, biomarkers with the same Youden index may differ in diagnostic power, and it is unclear whether Sen or Spe is more strongly related to CIB for such biomarkers. If the cardinal number (the value in the control group) is relatively small, and Spe is therefore higher, CIB may differ even though the biomarkers have the same Yen. Sen and Spe in the case-control study were evaluated on the basis of CIB using the values shown in Table 1.
Table 1
Evaluation of sensitivity (Sen) and specificity (Spe) in a case-control study based on the comprehensive index of biomarker (CIB)
| Higher Sen with lower Spe | | | | Higher Spe with lower Sen | | | |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Sen | 1-Spe | Yen | CIB | Sen | 1-Spe | Yen | CIB |
| 0.999 | 0.500 | 0.499 | 0.020 | 0.500 | 0.001 | 0.499 | 0.643 |
| 0.999 | 0.400 | 0.599 | 0.025 | 0.600 | 0.001 | 0.599 | 0.715 |
| 0.999 | 0.300 | 0.699 | 0.033 | 0.700 | 0.001 | 0.699 | 0.781 |
| 0.999 | 0.200 | 0.799 | 0.048 | 0.800 | 0.001 | 0.799 | 0.842 |
| 0.999 | 0.100 | 0.899 | 0.092 | 0.900 | 0.001 | 0.899 | 0.899 |
The incidence in the total population is considered as 1% for calculating CIB.
Combination of two biomarkers based on CIB
Under ideal conditions, a combination of two biomarkers should have greater power than a single biomarker. However, it is unclear whether combining biomarkers with the same CIB yields a similar combined power (CIB). We therefore used simulated data to address this question.
We assumed genetic markers with frequencies as expected under the hypothesis of panmixia (Hardy-Weinberg equation) and generated simulated data (1 and 0 denoting positive and negative, respectively) from random numbers on the SPSS platform. The frequencies in each group were set by design; each group included 5000 cases (n = 5000) and two items (genetic markers). Within a group, the allele frequency of each item was the same, and the positive results of the two items were distributed independently.
Two simulated groups were selected as the disease group and the control group according to the design, and CIB was calculated (m = 1%). The joint action of multiple indices was evaluated with binary logistic regression [4], and a new CIB was then calculated.
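A rough Python sketch of this kind of simulation is given below. It is not the procedure used in the study: the positive rates (0.60 and 0.20) are hypothetical, the Hardy-Weinberg step is omitted, and the combination rule "at least one marker positive" is a simple stand-in for binary logistic regression. The sketch also assumes that m is the incidence in the total population and that Pe and Pn can be obtained from Sen, 1-Spe, and m by Bayes' rule. All function names are hypothetical.

```python
import math
import random

def cib_from_rates(sen, fpr, m=0.01):
    """CIB from Sen, 1-Spe (fpr) and incidence m.

    Assumption: Pe and Pn are derived by Bayes' rule from Sen, fpr and m.
    """
    yen = sen - fpr
    pe = sen * m / (sen * m + fpr * (1 - m))
    pn = (1 - sen) * m / ((1 - sen) * m + (1 - fpr) * (1 - m))
    crc = pe - pn
    return math.sqrt(yen * crc) if yen > 0 and crc > 0 else 0.0

rng = random.Random(1)  # fixed seed: an assumption, for repeatability
n = 5000

def simulate(p):
    """n subjects, two independent 0/1 markers, each positive with rate p."""
    return [(int(rng.random() < p), int(rng.random() < p)) for _ in range(n)]

disease = simulate(0.60)  # hypothetical positive rate in the disease group
control = simulate(0.20)  # hypothetical positive rate in the control group

def rate(group, rule):
    """Fraction of subjects classified positive by the given rule."""
    return sum(rule(a, b) for a, b in group) / len(group)

single = lambda a, b: a       # first marker alone
either = lambda a, b: a or b  # stand-in rule, not logistic regression

cib_single = cib_from_rates(rate(disease, single), rate(control, single))
cib_either = cib_from_rates(rate(disease, either), rate(control, either))
print(round(cib_single, 3), round(cib_either, 3))
```

Comparing the two printed values shows how the CIB of a combined rule can be set against that of a single marker under the same incidence.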
Relationship between Yen and CIB
Yen is a common index for evaluating the effectiveness of a biomarker or of a diagnosis made using a biomarker. It is therefore necessary to understand the relationship between Yen and CIB. Different Yen values with a moderate cardinal number were generated using simulated data, as shown in Table 2. A scatter diagram was plotted with Yen on the X-axis and CIB on the Y-axis.
Table 2
Observation of relationship between Youden index (Yen) and comprehensive index of biomarker (CIB)
| Sen | 1-Spe | Yen | CIB | Sen | 1-Spe | Yen | CIB |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 0.55 | 0.45 | 0.10 | 0.020 | 0.80 | 0.20 | 0.60 | 0.148 |
| 0.60 | 0.40 | 0.20 | 0.041 | 0.85 | 0.15 | 0.70 | 0.191 |
| 0.65 | 0.35 | 0.30 | 0.062 | 0.90 | 0.10 | 0.80 | 0.256 |
| 0.70 | 0.30 | 0.40 | 0.087 | 0.95 | 0.05 | 0.90 | 0.380 |
| 0.75 | 0.25 | 0.50 | 0.114 | 1.00 | 0.00 | 1.00 | 1.000 |
Sen: sensitivity; Spe: specificity; the incidence in the total population is considered as 1% for calculating CIB.
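The relationship in Table 2 can be explored with a short Python sketch. It assumes that Pe and Pn are obtained from Sen, Spe, and the 1% incidence by Bayes' rule, which the text does not state explicitly. Under that assumption the sketch gives the CIB values of Table 2 to three decimal places. The function name `cib` and the variable `pairs` are hypothetical.

```python
import math

def cib(sen, spe, m=0.01):
    """CIB from Sen, Spe and incidence m.

    Assumption: Pe and Pn follow from Bayes' rule applied to Sen, Spe and m.
    """
    fpr = 1 - spe
    yen = sen - fpr                                     # case-control index
    pe = sen * m / (sen * m + fpr * (1 - m))            # incidence if positive
    pn = (1 - sen) * m / ((1 - sen) * m + spe * (1 - m))  # incidence if negative
    crc = pe - pn                                       # cohort index
    return math.sqrt(yen * crc) if yen > 0 and crc > 0 else 0.0

# Symmetric settings as in Table 2 (Sen equal to Spe).
settings = (0.55, 0.60, 0.65, 0.70, 0.75, 0.80, 0.85, 0.90, 0.95, 1.00)
pairs = [(round(2 * s - 1, 2), cib(s, s)) for s in settings]

for yen, value in pairs:
    print(f"Yen = {yen:.2f}  CIB = {value:.3f}")
```

Plotting the printed pairs with Yen on the X-axis and CIB on the Y-axis gives the scatter diagram described in the text; at 1% incidence CIB stays well below Yen until Yen approaches 1.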