Demographic and clinical characteristics of the patients
After strict pathological diagnosis and application of the exclusion criteria, 168 patients with CHB, 173 patients with liver cirrhosis (LC), 148 patients with HCC at BCLC-0 stage, and 165 patients with HCC at BCLC-A stage were prospectively enrolled at Beijing You’An Hospital and included in this study (Fig. 1A). The 654 patients were randomly assigned to the training set (n = 442) or the test set (n = 212). Demographic and clinical characteristics of the patients are shown in Table 1.
Table 1. Clinical characteristics of the enrolled participants in this study (n = 654)

| Characteristic | Training set (n = 442): HBVLD (n = 230) | Training set (n = 442): early HCC (n = 212) | Test set (n = 212): HBVLD (n = 111) | Test set (n = 212): early HCC (n = 101) |
|---|---|---|---|---|
| Age (years, mean ± SD) | 45.37 ± 11.9 | 46.22 ± 11.7 | 45.38 ± 12.08 | 52.24 ± 10.36 |
| Sex (M/F) | 170/60 | 174/38 | 80/31 | 90/11 |
| ALT (U/L), < 40 | 108 (47.0%) | 128 (60.4%) | 51 (45.9%) | 60 (59.4%) |
| ALT (U/L), ≥ 40 | 121 (52.6%) | 83 (39.2%) | 60 (54.1%) | 41 (40.6%) |
| AST (U/L), < 40 | 125 (54.3%) | 140 (66.0%) | 59 (53.2%) | 55 (54.5%) |
| AST (U/L), ≥ 40 | 104 (45.2%) | 71 (30.9%) | 52 (46.8%) | 46 (44.5%) |
| Total bilirubin (µmol/L), < 21 | 144 (62.6%) | 147 (69.3%) | 79 (71.1%) | 63 (62.4%) |
| Total bilirubin (µmol/L), ≥ 21 | 85 (37.0%) | 64 (30.2%) | 32 (18.9%) | 38 (37.6%) |
| Direct bilirubin (µmol/L), < 7 | 169 (73.5%) | 175 (82.5%) | 88 (79.3%) | 72 (71.3%) |
| Direct bilirubin (µmol/L), ≥ 7 | 60 (26.1%) | 36 (17.0%) | 23 (20.7%) | 29 (28.7%) |
| Total protein (g/L), < 65 | 168 (73.4%) | 135 (59.0%) | 85 (76.6%) | 65 (64.4%) |
| Total protein (g/L), ≥ 65 | 61 (26.5%) | 76 (35.8%) | 26 (23.4%) | 36 (35.6%) |
| Albumin (g/L), < 40 | 136 (59.1%) | 105 (49.5%) | 72 (64.9%) | 45 (44.6%) |
| Albumin (g/L), ≥ 40 | 93 (40.4%) | 106 (50.0%) | 39 (35.1%) | 56 (55.4%) |
| γ-GT (U/L), < 45 | 121 (52.6%) | 95 (44.8%) | 52 (46.8%) | 40 (39.6%) |
| γ-GT (U/L), ≥ 45 | 96 (41.7%) | 95 (44.8%) | 52 (46.8%) | 61 (60.4%) |
| Alkaline phosphatase (U/L), ≤ 100 | 172 (74.8%) | 142 (67.0%) | 83 (74.8%) | 48 (47.5%) |
| Alkaline phosphatase (U/L), > 100 | 46 (20.0%) | 48 (22.6%) | 22 (19.8%) | 37 (36.6%) |
| WBC count (× 10⁹/L), < 3.5 | 174 (75.7%) | 157 (74.1%) | 79 (71.2%) | 74 (73.3%) |
| WBC count (× 10⁹/L), 3.5–9.5 | 46 (20.0%) | 42 (19.8%) | 24 (21.6%) | 14 (13.9%) |
| WBC count (× 10⁹/L), > 9.5 | 10 (4.3%) | 12 (5.7%) | 8 (7.2%) | 13 (12.9%) |
| Hemoglobin (g/L), < 130 | 143 (62.2%) | 138 (65.1%) | 77 (69.4%) | 60 (59.4%) |
| Hemoglobin (g/L), ≥ 130 | 87 (37.8%) | 73 (34.4%) | 34 (30.6%) | 41 (40.6%) |
| Platelet count (× 10⁹/L), < 125 | 129 (56.1%) | 108 (50.9%) | 70 (63.1%) | 58 (57.4%) |
| Platelet count (× 10⁹/L), ≥ 125 | 101 (43.9%) | 104 (49.1%) | 41 (36.9%) | 43 (42.6%) |
| Lymphocyte count (× 10⁹/L), < 1.1 | 172 (74.8%) | 149 (70.3%) | 85 (76.6%) | 58 (57.4%) |
| Lymphocyte count (× 10⁹/L), ≥ 1.1 | 58 (25.2%) | 62 (29.2%) | 26 (23.4%) | 43 (42.6%) |
| Monocyte count (× 10⁹/L), < 0.6 | 214 (93.1%) | 182 (85.8%) | 107 (96.4%) | 81 (80.2%) |
| Monocyte count (× 10⁹/L), ≥ 0.6 | 16 (6.9%) | 29 (13.7%) | 4 (3.6%) | 20 (19.8%) |
| Neutrophil count (× 10⁹/L), < 1.8 | 166 (72.2%) | 167 (78.8%) | 80 (72.1%) | 71 (70.3%) |
| Neutrophil count (× 10⁹/L), 1.8–6.3 | 53 (23.0%) | 17 (8.0%) | 26 (23.4%) | 12 (11.9%) |
| Neutrophil count (× 10⁹/L), > 6.3 | 11 (4.8%) | 17 (8.0%) | 5 (4.5%) | 18 (17.8%) |
| Alpha-fetoprotein (ng/mL), < 20 | 178 (77.4%) | 110 (51.9%) | 86 (77.5%) | 41 (40.6%) |
| Alpha-fetoprotein (ng/mL), ≥ 20 | 50 (21.7%) | 96 (45.3%) | 25 (22.5%) | 58 (57.4%) |

Abbreviations: HBVLD, HBV-related liver disease; CHB, chronic hepatitis B; LC, HBV-related liver cirrhosis; HCC, hepatocellular carcinoma; BCLC, Barcelona Clinic Liver Cancer staging system; γ-GT, γ-glutamyltranspeptidase; WBC, white blood cell; ALT, alanine aminotransferase; AST, aspartate aminotransferase.
CpG selection and panel signature building for early HCC diagnosis
Building on our previous studies [9], we identified differentially methylated CpGs between CHB and each HCC stage using the limma package with Bonferroni correction. The numbers of differentially methylated CpGs between CHB and each HCC stage were 2285 for BCLC-0, 2233 for BCLC-A, 3345 for BCLC-B, and 23596 for BCLC-C. A total of 326 differentially methylated CpGs could specifically distinguish early HCC (BCLC-0 and BCLC-A stages) from CHB (Fig. 2A). From the 33 CpGs with |delta beta| ≥ 0.2, 20 robust differentially methylated sites were selected for this study. In addition, 36 differentially methylated CpGs (|delta beta| ≥ 0.2) could specifically distinguish early HCC from LC using paired t tests with Bonferroni correction, and 17 of these CpGs were selected for this study (unpublished work). In all, the methylation ratios of 34 CpGs (three CpGs overlapped between the two sets) were investigated using MBS in the HBVLD and early HCC samples (Supplementary Table 2 and Supplementary Table 3).
The thirty-four CpGs were reduced to six potential predictors using the LASSO regression model. The cross-validated error plot and the coefficient profile plot of the LASSO regression model are shown in Fig. 2B and 2C. Logistic regression was then used to build the six-CpG-scorer: risk score = −0.87 − 3.73 × cg14171514 + 2.58 × cg07721852 + 6.91 × cg05166871 − 9.85 × cg18087306 + 4.50 × cg05213896 + 4.39 × cg18772205. We applied this formula to calculate the early HCC risk score of each patient from the methylation ratios of their six CpGs. The risk score was significantly higher in the early HCC samples than in the HBVLD samples (p < 2.22 × 10⁻¹⁶) (Fig. 2D). The six individual CpGs and their combination, the six-CpG-scorer, also showed diagnostic accuracy (Fig. 2E, Supplementary Table 4). Based on the maximum Youden index, 0 served as the optimal cutoff point of the risk score. We therefore classified patients with a risk score < 0 as the low-risk group and those with a risk score ≥ 0 as the high-risk group. The distribution of demographic and clinical characteristics did not vary significantly between the high-risk and low-risk groups (Supplementary Table 5).
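The scoring rule described above can be sketched in Python as follows. This is a minimal illustration rather than the authors' code: the coefficients and the cutoff of 0 are taken from the text, while the function names and the example methylation ratios are hypothetical.

```python
# Coefficients of the six-CpG-scorer as reported in the text.
INTERCEPT = -0.87
COEFFICIENTS = {
    "cg14171514": -3.73,
    "cg07721852": 2.58,
    "cg05166871": 6.91,
    "cg18087306": -9.85,
    "cg05213896": 4.50,
    "cg18772205": 4.39,
}
CUTOFF = 0.0  # cutoff reported in the text (maximum Youden index)


def six_cpg_risk_score(methylation):
    """Return the risk score from a dict mapping CpG id -> methylation ratio."""
    missing = set(COEFFICIENTS) - set(methylation)
    if missing:
        raise ValueError(f"missing CpG sites: {sorted(missing)}")
    return INTERCEPT + sum(
        coef * methylation[cpg] for cpg, coef in COEFFICIENTS.items()
    )


def risk_group(score):
    """Classify a score as 'high-risk' (>= 0) or 'low-risk' (< 0)."""
    return "high-risk" if score >= CUTOFF else "low-risk"


# Hypothetical patient: these methylation ratios are made up for illustration.
example_patient = {
    "cg14171514": 0.20,
    "cg07721852": 0.50,
    "cg05166871": 0.30,
    "cg18087306": 0.10,
    "cg05213896": 0.40,
    "cg18772205": 0.30,
}
example_score = six_cpg_risk_score(example_patient)
print(round(example_score, 3), risk_group(example_score))
```

Keeping the coefficients in a dictionary keyed by CpG identifier makes it harder to pair a coefficient with the wrong site than passing six positional arguments would.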
Six-CpG-scorer signature was independent of clinical factors
Univariable logistic regression analysis was performed in 10,000 bootstrap datasets. Six of the 18 candidate variables were associated with early HCC risk: older age (≥ 43 years), male sex, higher AFP, higher six-CpG-scorer, lower total protein (TP; < 65 g/L), and lower total bilirubin (TBil) (Table 2).
Four variables were selected by the backward feature selection procedure in the 10,000 bootstrap datasets (Table 2): age, sex, AFP, and the six-CpG-scorer signature were independent risk factors for early HCC. The six-CpG-scorer also showed significantly higher predictive accuracy than the demographic and clinical risk factors.
Table 2. All 18 variables included in the backward feature selection analysis.

| Variable | Univariable OR (95% CI) | Univariable P value | Multivariable OR (95% CI) | Multivariable P value |
|---|---|---|---|---|
| Age (years), ≥ 43 vs < 43 | 5.07 (3.07–8.56) | 1.75e-07 *** | 5.23 (3.24–8.63) | 2.56e-08 *** |
| Sex (Female vs Male) | 0.32 (0.17–0.56) | 0.0011 ** | 0.32 (0.19–0.53) | 0.000248 *** |
| Alpha-fetoprotein (AFP) (ng/mL), log10 (a) | 1.50 (1.35–1.69) | 1.85e-09 *** | 1.48 (1.34–1.64) | 3.05e-10 *** |
| Six-CpG-scorer | 2.37 (1.83–3.12) | 1.10e-07 *** | 2.58 (2.01–3.36) | 1.08e-09 *** |
| Total protein (g/L), ≥ 65 vs < 65 | 0.38 (0.19–0.71) | 0.01209 * | 0.32 (0.20–0.51) | 0.56 |
| Total bilirubin (µmol/L), ≥ 21 vs < 21 | 0.32 (0.14–0.69) | 0.01588 * | 0.45 (0.32–0.64) | 0.78 |
| AST (U/L), log10 | 0.77 (0.45–1.28) | 0.40551 | | |
| ALT (U/L), log10 | 0.51 (0.43–1.06) | 0.15320 | | |
| Direct bilirubin (µmol/L), ≥ 7 vs < 7 | 0.55 (0.77–2.35) | 0.38832 | | |
| γ-GT (U/L), ≥ 45 vs < 45 | 1.15 (0.83–1.61) | 0.47649 | | |
| Albumin (g/L), ≥ 40 vs < 40 | 0.74 (0.41–1.36) | 0.42147 | | |
| Alkaline phosphatase (U/L), log10 | 0.77 (0.44–1.35) | 0.44623 | | |
| Hemoglobin (g/L), ≥ 115 vs < 115 | 1.68 (0.91–3.14) | 0.16827 | | |
| White blood cell count (× 10⁹/L), log10 | 0.84 (0.17–4.24) | 0.85430 | | |
| Monocyte count (× 10⁹/L), truncate_99 + log10 (b) | 2.29 (1.14–4.67) | 0.05410 . | | |
| Platelet count (× 10⁹/L), ≥ 125 vs < 125 | 0.84 (0.50–1.42) | 0.58796 | | |
| Lymphocyte count (× 10⁹/L), ≥ 1.1 vs < 1.1 | 0.82 (0.46–1.49) | 0.58026 | | |
| Neutrophil count (× 10⁹/L), log10 | 1.08 (0.37–2.86) | 0.89871 | | |

(a) Nonlinear transformation.
(b) The monocyte count variable was truncated at the 1st and 99th percentiles and then nonlinearly transformed.
Abbreviations: OR, odds ratio; γ-GT, γ-glutamyltranspeptidase; ALT, alanine aminotransferase; AST, aspartate aminotransferase. Signif. codes: ‘***’ 0.001, ‘**’ 0.01, ‘*’ 0.05.
Development of an individualized early HCC diagnosis nomogram
The early HCC diagnosis model incorporated the four risk predictors and was estimated on the 20 complete datasets, with the estimates combined according to Rubin’s rule. The prognostic index X (based on the logistic regression model coefficients) was: X = −1.0944708 − 0.7183741 × Sex (Male = 1, Female = 2) + 1.7286974 × Age [(< 43 years) = 1, (≥ 43 years) = 2] + 0.2761166 × log(AFP) + 0.7902764 × six-CpG-scorer. The predicted risk of early HCC among patients with HBVLD was then calculated from X as P = 1 / (1 + exp(−X)). The model is presented as the early HCC nomogram (eHCC nomogram) (Fig. 3A).
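As a rough sketch of how the prognostic index translates into a predicted risk, the calculation can be written in Python as below. The coefficients and the variable coding come from the text. Two points are assumptions rather than statements from the source: the logarithm of AFP is taken as base 10 (following the log10 transformation listed in Table 2), and the risk is obtained with the standard inverse-logit transform of a logistic regression model. The function names are hypothetical.

```python
import math

# Logistic regression coefficients as reported in the text.
INTERCEPT = -1.0944708
B_SEX = -0.7183741    # sex coded Male = 1, Female = 2
B_AGE = 1.7286974     # age coded (< 43 years) = 1, (>= 43 years) = 2
B_AFP = 0.2761166     # multiplies log(AFP)
B_SCORE = 0.7902764   # multiplies the six-CpG-scorer


def prognostic_index(sex, age_years, afp_ng_ml, six_cpg_score):
    """Return the prognostic index X for one patient."""
    sex_code = {"male": 1, "female": 2}[sex.lower()]
    age_code = 2 if age_years >= 43 else 1
    if afp_ng_ml <= 0:
        raise ValueError("AFP must be positive to take its logarithm")
    # Assumption: base-10 logarithm, following the log10 transform in Table 2.
    log_afp = math.log10(afp_ng_ml)
    return (
        INTERCEPT
        + B_SEX * sex_code
        + B_AGE * age_code
        + B_AFP * log_afp
        + B_SCORE * six_cpg_score
    )


def predicted_risk(x):
    """Assumption: standard inverse-logit link, P = 1 / (1 + exp(-X))."""
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical patient used only to show the call pattern.
x = prognostic_index(sex="male", age_years=50, afp_ng_ml=10.0, six_cpg_score=0.0)
print(f"X = {x:.4f}, predicted risk = {predicted_risk(x):.3f}")
```

A nomogram is a graphical form of the same calculation: each predictor contributes points proportional to its coefficient, and the total maps onto the predicted probability.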
Estimation of the C-statistic and Brier score by bootstrap validation
Internal validation was performed with the enhanced bootstrap method, using 500 resamples from each of the 20 complete datasets. The results showed negligible model optimism. The apparent C-statistic and apparent Brier score were 0.805 and 0.200, respectively. The optimism of the C-statistic and Brier score was −0.0042 and 0.00164, respectively. The adjusted C-statistic and Brier score were 0.809 and 0.199, respectively.
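The relationship between apparent, optimism, and adjusted values can be illustrated with a short Python sketch. It assumes the usual definition of the optimism correction (adjusted = apparent − optimism), where the optimism is the average, over bootstrap resamples, of the performance on the resample minus the performance of the same refitted model on the original data. The helper names and the toy data are hypothetical, and this is not the authors' implementation.

```python
def brier_score(outcomes, probabilities):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(outcomes, probabilities)) / len(outcomes)


def c_statistic(outcomes, probabilities):
    """Share of case/non-case pairs ranked correctly (ties count one half)."""
    cases = [p for y, p in zip(outcomes, probabilities) if y == 1]
    controls = [p for y, p in zip(outcomes, probabilities) if y == 0]
    concordant = 0.0
    for p_case in cases:
        for p_control in controls:
            if p_case > p_control:
                concordant += 1.0
            elif p_case == p_control:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


def optimism_adjusted(apparent, optimism):
    """Optimism-corrected performance: apparent estimate minus optimism."""
    return apparent - optimism


# Toy data (made up) to show the two performance measures.
toy_outcomes = [0, 0, 1, 1]
toy_probabilities = [0.1, 0.4, 0.35, 0.8]
print(c_statistic(toy_outcomes, toy_probabilities))
print(brier_score(toy_outcomes, toy_probabilities))

# Values reported in the text for the C-statistic.
adjusted_c = optimism_adjusted(apparent=0.805, optimism=-0.0042)
print(round(adjusted_c, 3))  # 0.809
```

A negative optimism, as reported here for the C-statistic, means the correction slightly raises the estimate instead of lowering it.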
Diagnostic performance and clinical usefulness of the eHCC nomogram
The AUROC of the model was 0.81 (95% CI, 0.77–0.85) in the training set. The sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) for differentiating early HCC from HBVLD were 70.0%, 77.8%, 74.4%, and 73.6%, respectively.
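For readers less familiar with these four measures, the following Python sketch shows how each is derived from a 2 × 2 confusion matrix. The counts in the example are hypothetical and are not the study's data. The function name is likewise an illustrative choice.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return sensitivity, specificity, PPV and NPV from confusion-matrix counts.

    tp: early HCC correctly flagged, fn: early HCC missed,
    tn: HBVLD correctly cleared,    fp: HBVLD incorrectly flagged.
    """
    return {
        "sensitivity": tp / (tp + fn),  # detected cases among all cases
        "specificity": tn / (tn + fp),  # cleared non-cases among all non-cases
        "ppv": tp / (tp + fp),          # true cases among positive calls
        "npv": tn / (tn + fn),          # true non-cases among negative calls
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
metrics = diagnostic_metrics(tp=70, fp=20, tn=80, fn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Sensitivity and specificity depend only on the classifier and the cutoff, whereas PPV and NPV also depend on the proportion of early HCC in the cohort, so they may not transfer to populations with a different case mix.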
The calibration curve of the eHCC nomogram for the probability of early HCC demonstrated good agreement between prediction and observation in the training set (Fig. 3B). The decision curve analysis (DCA) of the eHCC nomogram and of the model without the six-CpG-scorer is presented in Fig. 3C. The DCA showed that when the threshold probability of a patient or doctor was > 10%, using the eHCC nomogram to predict early HCC added more benefit than either the treat-all-patients scheme or the treat-none scheme (Fig. 3D).
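The quantity plotted in a decision curve is the net benefit at each threshold probability. A minimal Python sketch of the commonly used definition is given below; it assumes the standard formula, net benefit = TP/n − (FP/n) × pt/(1 − pt), and uses hypothetical counts and function names, not values or code from the study.

```python
def net_benefit(tp, fp, n, threshold):
    """Net benefit of a model at a given threshold probability (0 < pt < 1)."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    weight = threshold / (1.0 - threshold)  # harm of a false positive vs. a true positive
    return tp / n - (fp / n) * weight


def net_benefit_treat_all(prevalence, threshold):
    """Net benefit when every patient is treated as positive."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    weight = threshold / (1.0 - threshold)
    return prevalence - (1.0 - prevalence) * weight


def net_benefit_treat_none():
    """Net benefit when no patient is treated as positive (always zero)."""
    return 0.0


# Hypothetical cohort: 200 patients, 100 with early HCC, model flags 70 TP and 20 FP.
model_nb = net_benefit(tp=70, fp=20, n=200, threshold=0.5)
treat_all_nb = net_benefit_treat_all(prevalence=0.5, threshold=0.5)
print(model_nb, treat_all_nb, net_benefit_treat_none())
```

A model is considered clinically useful over the range of thresholds where its net benefit exceeds both reference strategies, which is the comparison the decision curve displays.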
Diagnostic performance of the eHCC nomogram in the test set
A further 212 patients, including 111 with HBVLD and 101 with early HCC, served as the test set for validating the diagnostic potential. The risk score was significantly higher in the early HCC group than in the HBVLD group in the test set (p = 2.7 × 10⁻⁷) (Fig. 4A). The eHCC nomogram achieved an AUROC of 0.84 (95% CI, 0.79–0.88) for distinguishing early HCC from HBVLD (Fig. 4B). The calibration curve demonstrated good agreement between prediction and observation for early HCC (Fig. 4C). The sensitivity, specificity, PPV, and NPV for differentiating early HCC from HBVLD were 68.9%, 82.9%, 80.0%, and 71.8%, respectively. The nomogram also showed good clinical benefit in DCA, which suggested clear diagnostic efficacy for distinguishing early HCC from HBVLD (Fig. 4D).