Baseline clinical characteristics
In the present study, 346 eligible patients were analyzed in the training cohort and 173 patients in the validation cohort. The median follow-up was 51.4 months (interquartile range [IQR]: 42.1-67.0 months) for the training cohort and 50.4 months (IQR: 41.9-66.0 months) for the validation cohort. In the training cohort, the 1-, 3-, and 5-year OS rates were 97.4%, 83.8%, and 48.3%, respectively. In the validation cohort, the 1-, 3-, and 5-year OS rates were 94.2%, 84.4%, and 42.8%, respectively.
The optimal cut-off value for each continuous variable was as follows: age (60 years), smoking index (20.0), BMI (26.33 kg/m²), WBC (4.3 × 10⁹/L), neutrophils (7.0 × 10⁹/L), lymphocyte (1.41 × 10⁹/L), monocyte (0.4 × 10⁹/L), platelet (293.0 × 10⁹/L), hemoglobin (130.0 g/L), neutrophil-to-lymphocyte ratio (3.91), derived neutrophil-to-lymphocyte ratio (2.46), lymphocyte-to-monocyte ratio (3.4), platelet-to-lymphocyte ratio (208.89), systemic immune-inflammation index (1141.96), total protein (77.2 g/L), albumin (42.4 g/L), globulin (33.1 g/L), albumin-to-globulin ratio (1.36), CRP (5.47 mg/L), CRP-to-albumin ratio (0.16), apo A (1.28 g/L), apo B (1.03 g/L), apo A-to-apo B ratio (0.96), LDH (167.5 U/L), HDL (1.16 U/L), cystatin C (0.94 mg/L), advanced lung cancer inflammation index (262.33), and prognostic nutritional index (47.35). The patients' clinical characteristics and blood biomarkers are listed in Table 1. There was no significant difference in the distribution of clinical characteristics or blood biomarkers between the training cohort and the validation cohort.
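The between-cohort comparisons in Table 1 can be checked by hand for any two-category variable. The sketch below assumes a Pearson χ² test without continuity correction (an assumption; the test variant is not stated here) and uses the gender counts from Table 1. The function name `chi_square_2x2` is illustrative only.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the 2x2 table
    [[a, b], [c, d]]; rows are cohorts, columns are categories."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    chi2 = numerator / denominator
    # For 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Gender counts from Table 1: training (264 male, 82 female),
# validation (121 male, 52 female).
chi2, p = chi_square_2x2(264, 82, 121, 52)
print(f"chi2 = {chi2:.3f}, P = {p:.3f}")
```

Under this assumption, the result agrees to three decimals with the gender row of Table 1 (χ² = 2.435, P = 0.119). Variables with more than two categories need the general r × c form and the matching degrees of freedom.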
Table 1
Demographics and clinical characteristics of patients in the training and validation cohorts

| Characteristic | Training cohort (n = 346), No. (%) | Validation cohort (n = 173), No. (%) | χ² value | P value |
|---|---|---|---|---|
| Gender | | | 2.435 | 0.119 |
| Male | 264 (76.3%) | 121 (69.9%) | | |
| Female | 82 (23.7%) | 52 (30.1%) | | |
| Age (years) | | | 0.956 | 0.328 |
| ≤ 60 | 310 (89.6%) | 150 (86.7%) | | |
| > 60 | 36 (10.4%) | 23 (13.3%) | | |
| Family history | | | 0.079 | 0.778 |
| Yes | 90 (26.0%) | 47 (27.2%) | | |
| No | 256 (74.0%) | 126 (72.8%) | | |
| Smoking indexᵃ | | | 1.661 | 0.198 |
| ≤ 20.0 | 226 (65.3%) | 103 (59.5%) | | |
| > 20.0 | 120 (34.7%) | 70 (40.5%) | | |
| BMI (kg/m²) | | | 1.250 | 0.264 |
| ≤ 26.33 | 298 (86.1%) | 155 (89.6%) | | |
| > 26.33 | 48 (13.9%) | 18 (10.4%) | | |
| TNM stageᵇ | | | 1.965 | 0.580 |
| I | 12 (3.5%) | 5 (2.9%) | | |
| II | 45 (13.0%) | 24 (13.9%) | | |
| III | 172 (49.7%) | 76 (43.9%) | | |
| IV | 117 (33.8%) | 68 (39.3%) | | |
| Treatment | | | 0.242 | 0.623 |
| Rad | 58 (16.8%) | 32 (18.5%) | | |
| Rad and Che | 288 (83.2%) | 141 (81.5%) | | |
| WBC (10⁹/L) | | | 0.007 | 0.933 |
| ≤ 4.3 | 57 (16.5%) | 29 (16.8%) | | |
| > 4.3 | 289 (83.5%) | 144 (83.2%) | | |
| Neutrophils (10⁹/L) | | | 0.879 | 0.348 |
| ≤ 7.0 | 306 (88.4%) | 148 (85.5%) | | |
| > 7.0 | 40 (11.6%) | 25 (14.5%) | | |
| Lymphocyte (10⁹/L) | | | 0.099 | 0.753 |
| ≤ 1.41 | 145 (41.9%) | 75 (43.4%) | | |
| > 1.41 | 201 (58.1%) | 98 (56.6%) | | |
| Monocyte (10⁹/L) | | | 0.466 | 0.495 |
| ≤ 0.4 | 175 (50.6%) | 82 (47.4%) | | |
| > 0.4 | 171 (49.4%) | 91 (52.6%) | | |
| Platelet (10⁹/L) | | | 0.857 | 0.355 |
| ≤ 293.0 | 298 (86.1%) | 154 (89.0%) | | |
| > 293.0 | 48 (13.9%) | 19 (11.0%) | | |
| HGB (g/L) | | | 1.130 | 0.288 |
| ≤ 130.0 | 106 (30.6%) | 61 (35.3%) | | |
| > 130.0 | 240 (69.4%) | 112 (64.7%) | | |
| NLR | | | 0.621 | 0.431 |
| ≤ 3.91 | 263 (76.0%) | 126 (72.8%) | | |
| > 3.91 | 83 (24.0%) | 47 (27.2%) | | |
| dNLR | | | 0.692 | 0.405 |
| ≤ 2.46 | 254 (73.4%) | 121 (69.9%) | | |
| > 2.46 | 92 (26.6%) | 52 (30.1%) | | |
| LMR | | | 0.479 | 0.489 |
| ≤ 3.4 | 141 (40.8%) | 76 (43.9%) | | |
| > 3.4 | 205 (59.2%) | 97 (56.1%) | | |
| PLR | | | 0.055 | 0.815 |
| ≤ 208.89 | 277 (80.1%) | 140 (80.9%) | | |
| > 208.89 | 69 (19.9%) | 33 (19.1%) | | |
| SII | | | 0.263 | 0.608 |
| ≤ 1141.96 | 294 (85.0%) | 144 (83.2%) | | |
| > 1141.96 | 52 (15.0%) | 29 (16.8%) | | |
| TP (g/L) | | | 1.585 | 0.208 |
| ≤ 77.2 | 273 (78.9%) | 128 (74.0%) | | |
| > 77.2 | 73 (21.1%) | 45 (26.0%) | | |
| ALB (g/L) | | | 0.148 | 0.701 |
| ≤ 42.4 | 132 (38.2%) | 63 (36.4%) | | |
| > 42.4 | 214 (61.8%) | 110 (63.6%) | | |
| GLOB (g/L) | | | 0.095 | 0.758 |
| ≤ 33.1 | 274 (79.2%) | 139 (80.3%) | | |
| > 33.1 | 72 (20.8%) | 34 (19.7%) | | |
| AGR | | | 1.406 | 0.236 |
| ≤ 1.36 | 108 (30.6%) | 45 (26.0%) | | |
| > 1.36 | 240 (69.4%) | 128 (74.0%) | | |
| CRP (mg/L) | | | 0.087 | 0.768 |
| ≤ 5.47 | 268 (77.5%) | 132 (76.3%) | | |
| > 5.47 | 78 (22.5%) | 41 (23.7%) | | |
| CAR | | | 0.101 | 0.751 |
| ≤ 0.16 | 282 (81.5%) | 139 (80.3%) | | |
| > 0.16 | 64 (18.5%) | 34 (19.7%) | | |
| APOA (g/L) | | | 0.097 | 0.756 |
| ≤ 1.28 | 167 (48.3%) | 81 (46.8%) | | |
| > 1.28 | 179 (51.7%) | 92 (53.2%) | | |
| APOB (g/L) | | | 0.262 | 0.609 |
| ≤ 1.03 | 218 (63.0%) | 105 (60.7%) | | |
| > 1.03 | 128 (37.0%) | 68 (39.3%) | | |
| ABR | | | 0.038 | 0.845 |
| ≤ 0.96 | 40 (11.6%) | 19 (11.0%) | | |
| > 0.96 | 306 (88.4%) | 154 (89.0%) | | |
| LDH (U/L) | | | 0.004 | 0.950 |
| ≤ 167.5 | 193 (55.8%) | 96 (55.5%) | | |
| > 167.5 | 153 (44.2%) | 77 (44.5%) | | |
| HDL (U/L) | | | 1.114 | 0.291 |
| ≤ 1.16 | 179 (51.7%) | 81 (46.8%) | | |
| > 1.16 | 167 (48.3%) | 92 (53.2%) | | |
| Cys-C (mg/L) | | | 1.640 | 0.200 |
| ≤ 0.94 | 222 (64.2%) | 101 (58.4%) | | |
| > 0.94 | 124 (35.8%) | 72 (41.6%) | | |
| EBV DNA, copy/mL | | | 4.369 | 0.358 |
| < 10³ | 169 (48.8%) | 70 (40.5%) | | |
| 10³–9,999 | 72 (20.8%) | 36 (20.8%) | | |
| 10⁴–99,999 | 58 (16.8%) | 39 (22.5%) | | |
| 10⁵–999,999 | 29 (8.4%) | 17 (9.8%) | | |
| ≥ 10⁶ | 18 (5.2%) | 11 (6.4%) | | |
| VCA-IgA | | | 0.081 | 0.960 |
| < 1:80 | 59 (17.1%) | 28 (16.2%) | | |
| 1:80–1:320 | 208 (60.1%) | 106 (61.3%) | | |
| ≥ 1:640 | 79 (22.8%) | 39 (22.5%) | | |
| EA-IgA | | | 1.338 | 0.512 |
| < 1:10 | 116 (32.7%) | 49 (28.3%) | | |
| 1:10–1:20 | 110 (31.8%) | 60 (34.7%) | | |
| ≥ 1:40 | 123 (35.5%) | 64 (37.0%) | | |
| ALI | | | 0.173 | 0.677 |
| ≤ 262.33 | 94 (27.2%) | 50 (28.9%) | | |
| > 262.33 | 252 (72.8%) | 123 (71.1%) | | |
| PNI | | | 0.058 | 0.810 |
| ≤ 47.35 | 63 (18.2%) | 33 (19.1%) | | |
| > 47.35 | 283 (81.8%) | 140 (80.9%) | | |
| PI | | | 0.644 | 0.725 |
| 0 | 275 (79.5%) | 141 (81.5%) | | |
| 1 | 64 (18.5%) | 30 (17.3%) | | |
| 2 | 7 (2.0%) | 2 (1.2%) | | |

a: Smoking index: the number of cigarettes smoked per day × the number of years of smoking.
b: TNM stage was classified according to the AJCC 8th TNM staging system.
Abbreviations: BMI: body mass index; TNM: Tumor Node Metastasis stage; Rad: radiotherapy; Che: chemotherapy; WBC: white blood cell; HGB: hemoglobin; NLR: neutrophil/lymphocyte ratio; dNLR: neutrophil/(WBC − neutrophil) ratio; LMR: lymphocyte/monocyte ratio; PLR: platelet/lymphocyte ratio; SII: systemic immune-inflammation index; TP: total protein; ALB: albumin; GLOB: globulin; AGR: ALB/GLOB ratio; CRP: C-reactive protein; CAR: C-reactive protein/albumin ratio; APOA: apolipoprotein AI; APOB: apolipoprotein B; ABR: APOA/APOB ratio; LDH: lactic dehydrogenase; HDL: high density lipoprotein; Cys-C: cystatin C; EBV: Epstein-Barr virus; VCA-IgA: EBV immunoglobulin A/viral capsid antigen; EA-IgA: EBV immunoglobulin A/early antigen; ALI: advanced lung cancer inflammation index; PNI: prognostic nutritional index; PI: prognostic index.
Construction of the novel prognostic model
To identify prognostic variables in the training cohort, we used a LASSO regression model. Figure 1A showed the coefficient trajectory of each candidate variable. We plotted the partial likelihood deviance versus log(λ) in Figure 1B, where λ was the tuning parameter. The value of λ (0.03987) was chosen by 10-fold cross-validation using the 1-SE criterion. At this value of λ, 13 variables had nonzero coefficients: age, BMI, hemoglobin (HGB), platelet (PLT), lymphocyte-to-monocyte ratio (LMR), CRP, CRP-to-albumin ratio (CAR), globulin (GLOB), albumin-to-globulin ratio (AGR), LDH, cystatin C (Cys-C), advanced lung cancer inflammation index (ALI), and prognostic nutritional index (PNI). The coefficient of each prognostic variable was presented in Figure 1C. The prognostic model risk score for each patient was then computed as the sum of the 13 variables, each multiplied by its coefficient from the LASSO regression: prognostic model risk score = -0.680 + (0.569 × age) - (0.280 × BMI) + (0.101 × HGB) - (0.554 × PLT) + (0.197 × LMR) - (0.199 × CRP) + (0.186 × CAR) + (1.248 × GLOB) - (0.137 × AGR) - (0.194 × LDH) + (1.248 × Cys-C) - (0.137 × ALI) - (0.194 × PNI). Each variable was coded as 0 or 1: a value of 0 was assigned when the variable was less than or equal to the corresponding cut-off value, and a value of 1 otherwise.
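The scoring rule above can be sketched in a few lines of Python. This is an illustration only, not the authors' code: the coefficients and cut-offs are transcribed as printed in the text, while the dictionary keys and the function name `risk_score` are hypothetical. Raw measurements are assumed to be in the same units as the reported cut-offs.

```python
# Cut-offs as reported in the text; a value at or below its cut-off is coded 0.
CUTOFFS = {
    "age": 60.0, "BMI": 26.33, "HGB": 130.0, "PLT": 293.0, "LMR": 3.4,
    "CRP": 5.47, "CAR": 0.16, "GLOB": 33.1, "AGR": 1.36, "LDH": 167.5,
    "CysC": 0.94, "ALI": 262.33, "PNI": 47.35,
}

# Coefficients transcribed from the risk score formula as printed.
COEFS = {
    "age": 0.569, "BMI": -0.280, "HGB": 0.101, "PLT": -0.554, "LMR": 0.197,
    "CRP": -0.199, "CAR": 0.186, "GLOB": 1.248, "AGR": -0.137, "LDH": -0.194,
    "CysC": 1.248, "ALI": -0.137, "PNI": -0.194,
}

INTERCEPT = -0.680


def risk_score(patient):
    """Return the risk score for a dict of raw measurements (hypothetical keys)."""
    score = INTERCEPT
    for name, coef in COEFS.items():
        # Dichotomize: 1 if strictly above the cut-off, otherwise 0.
        indicator = 1 if patient[name] > CUTOFFS[name] else 0
        score += coef * indicator
    return score


# Hypothetical patient sitting exactly at every cut-off except age.
example = dict(CUTOFFS, age=65.0)
print(round(risk_score(example), 3))
```

A patient whose values all sit at or below the cut-offs receives the intercept alone, so each coefficient can be read as the shift in score when that variable exceeds its cut-off.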
Predictive accuracy of the novel prognostic model, compared with TNM staging, clinical treatment, and EBV DNA copy number
As shown in Table 2, in the training cohort, the C-index of the prognostic model was 0.786 (95% confidence interval [CI]: 0.728-0.844), which was higher than the C-indices of TNM staging (0.740, 95% CI: 0.690-0.790), clinical treatment (0.554, 95% CI: 0.521-0.586), and EBV DNA copy number (0.691, 95% CI: 0.623-0.758). The C-index of the prognostic model was significantly higher than that of clinical treatment (P < 0.001) and that of EBV DNA copy number (P = 0.013). In the validation cohort, the C-index of the prognostic model was higher than those of TNM staging and clinical treatment, but slightly lower than that of EBV DNA copy number. Subsequently, we compared the area under the ROC curve (AUC) of the novel prognostic model, TNM staging, clinical treatment, and EBV DNA copy number using tdROC. In general, the AUC of our novel prognostic model was higher than the others, both in the training cohort (Figure 2A) and in the validation cohort (Figure 2B). Finally, DCA showed that the prognostic model had a better overall net benefit than TNM staging, clinical treatment, and EBV DNA copy number across a wide range of reasonable threshold probabilities in the training cohort (Figure 3A) and the validation cohort (Figure 3B). These results indicated that the novel prognostic model displayed better accuracy in predicting OS compared with TNM staging, clinical treatment, and EBV DNA copy number.
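For readers unfamiliar with the C-index, the following minimal sketch shows how Harrell's concordance index is computed for right-censored data. It is a plain-Python illustration on hypothetical toy data, not the routine used in the study (Table 2 cites `rcorrp.cens` from the Hmisc package), and the function name `harrell_c_index` is an assumption made for this example.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's C: the fraction of comparable pairs in which the patient
    with the shorter observed survival has the higher risk score.
    events[i] is 1 for death and 0 for censoring."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable only if patient i died strictly before
            # patient j's observed time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: follow-up in months, event flag, model risk score.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
scores = [0.9, 0.4, 0.5, 0.1]
print(harrell_c_index(times, events, scores))
```

In this toy example, five pairs are comparable and four are ordered correctly, giving a C-index of 0.8. A value of 0.5 corresponds to random ordering and 1.0 to perfect discrimination.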
Table 2
The C-index of the prognostic model, TNM staging, treatment, and EBV DNA for prediction of OS in the training and validation cohorts

| Factors | C-index (95% CI) | P |
|---|---|---|
| For training cohort | | |
| Prognostic model | 0.786 (0.728–0.844) | |
| TNM staging | 0.740 (0.690–0.790) | |
| Treatment | 0.554 (0.521–0.586) | |
| EBV DNA | 0.691 (0.623–0.758) | |
| Prognostic model vs TNM staging | | 0.067 |
| Prognostic model vs Treatment | | < 0.001 |
| Prognostic model vs EBV DNA | | 0.013 |
| For validation cohort | | |
| Prognostic model | 0.697 (0.612–0.734) | |
| TNM staging | 0.655 (0.575–0.734) | |
| Treatment | 0.529 (0.470–0.588) | |
| EBV DNA | 0.734 (0.659–0.813) | |
| Prognostic model vs TNM staging | | 0.310 |
| Prognostic model vs Treatment | | < 0.001 |
| Prognostic model vs EBV DNA | | 0.511 |

C-index = concordance index; CI = confidence interval; P values are calculated based on normal approximation using function rcorrp.cens in Hmisc package.
Building and validating a predictive nomogram
The prognostic model risk score, TNM staging, clinical treatment, and EBV DNA copy number were integrated into nomograms to predict the 1-, 3-, and 5-year OS in the training cohort (Figure 4). Each variable was assigned a point value based on its contribution to the model. The point values for all predictor variables were summed to locate the patient on the "total points" axis, and a vertical line drawn down from that axis gave the patient's predicted probability of OS at 1, 3, and 5 years. Finally, calibration plots were used to visualize the performance of the nomograms. The nomogram-predicted 1-, 3-, and 5-year OS was plotted on the x-axis and the observed OS on the y-axis. The 45° line represented perfect prediction, and the solid dark red line represented the performance of the nomograms. The calibration curves showed that the 1-, 3-, and 5-year OS predicted by the nomograms was consistent with actual observations (Figure 5), indicating that the nomograms performed well. The nomograms and calibration curves for the validation cohort were shown in Supplementary Figure 1 and Supplementary Figure 2, respectively.
Survival analyses of NPC patients according to prognostic model risk score
The optimal cut-off value of the prognostic model risk score for predicting survival was determined to be -1.423 using the R package "survminer" (Figure 6A). We classified patients into two subgroups based on this cut-off value: a low-risk group (risk score ≤ -1.423) and a high-risk group (risk score > -1.423). The distributions of the prognostic model risk score in the training and validation cohorts were shown in Figure 6B and Figure 6C, respectively.
In the training cohort, the median OS in the high-risk group was 44.4 months (IQR: 24.7–66.1), and the probabilities of OS at 1, 3, and 5 years were 95.4%, 63.2%, and 33.3%, respectively. In the low-risk group, the median OS was 61.2 months (IQR: 44.6–67.8), and the probabilities of OS at 1, 3, and 5 years were 98.1%, 90.7%, and 53.3%, respectively. In the validation cohort, the low-risk group also showed higher survival probabilities than the high-risk group at 1, 3, and 5 years (Table 3). Kaplan–Meier curves were compared to assess the differences in survival between the low-risk and high-risk groups. The low-risk group showed significantly longer OS than the high-risk group in both cohorts (P < 0.05; Figure 7).
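As a reminder of how Kaplan–Meier survival probabilities are obtained, the sketch below implements the product-limit estimator in plain Python. The data are hypothetical toy values rather than patient data from this study, and the function name `kaplan_meier` is illustrative only.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of survival.
    times: observed follow-up; events: 1 = death, 0 = censored.
    Returns a list of (time, survival probability) at each death time."""
    death_times = sorted({t for t, e in zip(times, events) if e == 1})
    survival = 1.0
    curve = []
    for t in death_times:
        # Patients still under observation just before time t.
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical toy data: follow-up in months and event indicator.
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
for t, s in kaplan_meier(times, events):
    print(f"t = {t}: S(t) = {s:.2f}")
```

Censored patients contribute to the at-risk count until their last follow-up but never trigger a drop in the curve. In practice, the estimator is applied separately to the low-risk and high-risk groups and the resulting curves are compared.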
Table 3
OS and OS rate in high-risk and low-risk groups according to the model risk score in the training and validation cohorts

| Parameter | Training cohort: High-risk group | Training cohort: Low-risk group | Training cohort: Total | Validation cohort: High-risk group | Validation cohort: Low-risk group | Validation cohort: Total |
|---|---|---|---|---|---|---|
| No. of patients | 87 | 259 | 346 | 49 | 124 | 173 |
| Median (IQR) | 44.4 (24.7–66.1) | 61.2 (44.6–67.8) | 51.4 (42.1–67.0) | 45.8 (26.1–64.1) | 53.5 (43.0–66.3) | 50.4 (41.9–66.0) |
| No. of OS | | | | | | |
| 1-Year | 83 (95.4%) | 254 (98.1%) | 337 (97.4%) | 44 (89.8%) | 119 (96.0%) | 163 (94.2%) |
| 3-Year | 55 (63.2%) | 235 (90.7%) | 290 (83.8%) | 36 (73.5%) | 110 (88.7%) | 146 (84.4%) |
| 5-Year | 29 (33.3%) | 138 (53.3%) | 167 (48.3%) | 17 (34.7%) | 57 (46.0%) | 74 (42.8%) |

Abbreviations: OS: overall survival; IQR: interquartile range.