Baseline Characteristics and Postoperative Pathological Upgrading
This study included 65,574 PCa patients from the SEER database, who were randomly divided into a training group (45,903 cases) and an internal validation group (19,671 cases) at a ratio of 7:3. Additionally, 130 patients with PCa who underwent RP at Zhongshan People's Hospital between 2018 and 2023 were screened; after excluding 32 who did not meet the inclusion criteria, 98 were included in the external validation group. The selection process is illustrated in Fig. 1, and Table 1 presents the patient characteristics of the training, internal validation, and external validation groups. Post-RP GGU occurred in 11,931 patients (25.9%) in the training group, 5,112 (25.9%) in the internal validation group, and 24 (24.4%) in the external validation group.
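The 7:3 split described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes the split was stratified by GGU outcome with the training share of each class rounded up, whereas the text only states that the split was random. The function name and the seed are hypothetical.

```python
import random

def stratified_split(labels, num=7, den=10, seed=0):
    """Return (train_idx, valid_idx), splitting each outcome class num:den-num.

    Assumption: stratification by outcome, with the training share of each
    class rounded up. The paper only reports a random 7:3 split.
    """
    rng = random.Random(seed)  # hypothetical seed; none is reported
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    train_idx, valid_idx = [], []
    for members in by_class.values():
        rng.shuffle(members)
        # Integer ceiling of len(members) * num / den.
        n_train = (len(members) * num + den - 1) // den
        train_idx.extend(members[:n_train])
        valid_idx.extend(members[n_train:])
    return train_idx, valid_idx

# Outcome counts taken from the text: 11,931 + 5,112 upgraded out of 65,574.
n_total = 65574
n_upgraded = 11931 + 5112
labels = [1] * n_upgraded + [0] * (n_total - n_upgraded)

train_idx, valid_idx = stratified_split(labels)
print(len(train_idx), len(valid_idx))
```

Under these assumptions the group sizes come out as 45,903 and 19,671, which agrees with the reported sizes. That agreement does not by itself confirm how the data were actually split.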
Table 1
Clinical characteristics of patients.
Characteristics | Training group (n = 45,903) | Internal validation group (n = 19,671) | External validation group (n = 98) |
Age (%) | | | |
<60 | 2,610 (5.7) | 1,129 (5.7) | 7 (7.1) |
60 ~ 69 | 2,973 (6.5) | 1,316 (6.7) | 36 (36.7) |
>=70 | 40,320 (87.8) | 17,226 (87.6) | 55 (56.1) |
Race (%) | | | |
White | 35,983 (78.4) | 15,462 (78.6) | - |
Black | 7,075 (15.4) | 2,999 (15.2) | - |
Asian or Pacific Islander | 2,669 (5.8) | 1,136 (5.8) | 98 (100.0) |
American Indian or Alaska Native | 176 (0.4) | 74 (0.4) | - |
Marital (%) | | | |
Married | 36,483 (79.5) | 15,730 (80.0) | 96 (98.0) |
Single | 4,967 (10.8) | 2,085 (10.6) | - |
Divorced or Separated | 3,462 (7.5) | 1,454 (7.4) | 2 (2.0) |
Other | 991 (2.2) | 402 (2.0) | - |
Gleason Patterns Clinical (%) | | | |
Group1 | 13,245 (28.9) | 5,679 (28.9) | 10 (10.2) |
Group2 | 16,208 (35.3) | 6,902 (35.1) | 18 (18.4) |
Group3 | 7,827 (17.1) | 3,389 (17.2) | 20 (20.4) |
Group4 | 5,268 (11.5) | 2,280 (11.6) | 30 (30.6) |
Group5 | 3,355 (7.3) | 1,421 (7.2) | 20 (20.4) |
Gleason Patterns Pathological (%) | | | |
Group1 | 8,194 (17.9) | 3,503 (17.8) | 5 (5.1) |
Group2 | 21,745 (47.4) | 9,359 (47.6) | 29 (29.6) |
Group3 | 9,489 (20.7) | 4,029 (20.5) | 24 (24.5) |
Group4 | 2,535 (5.5) | 1,139 (5.8) | 15 (15.3) |
Group5 | 3,940 (8.6) | 1,641 (8.3) | 25 (25.5) |
PSA (%) | | | |
<4 | 5,106 (11.1) | 2,189 (11.1) | 5 (5.1) |
4 ~ 10 | 32,416 (70.6) | 13,892 (70.6) | 31 (31.6) |
11 ~ 19 | 5,476 (11.9) | 2,318 (11.8) | 23 (23.5) |
>=20 | 2,905 (6.3) | 1,272 (6.5) | 39 (39.8) |
Positive (%) | | | |
<5 | 23,485 (51.2) | 10,103 (51.4) | 46 (46.9) |
5 ~ 9 | 17,404 (37.9) | 7,415 (37.7) | 32 (32.7) |
>=10 | 5,014 (10.9) | 2,153 (10.9) | 20 (20.4) |
Examined >=10 (%) | 41,310 (90.0) | 17,701 (90.0) | 98 (100.0) |
Percent >=33% (%) | 27,789 (60.5) | 11,819 (60.1) | 52 (53.1) |
Univariable and Multivariable Logistic Regression Analysis
Univariate logistic regression analysis revealed significant associations (P < 0.05) between GGU and age, race, marital status, preoperative PSA level, needle biopsy ISUP grade group, total number of biopsy cores, number of positive cores, and percentage of positive cores. Variables with P < 0.05 were then entered into a multivariate logistic regression analysis, which indicated that all of them except marital status were independently associated with GGU in patients with PCa (Table 2). Increasing age and preoperative PSA level were associated with a higher likelihood of pathological upgrading, most markedly for age ≥ 70 years (OR 1.85; P < 0.001) and preoperative PSA ≥ 20 ng/mL (OR 2.70; P < 0.001).
Table 2
Univariate Analysis and Multivariate Logistic Regression Analysis of Variables
Characteristics | Univariate | | | Multivariate | |
| OR (95% CI) | P-Value | | OR (95% CI) | P-Value |
Age | | | | | |
< 60 | Reference | | | Reference | |
60 ~ 69 | 1.20(1.06–1.35) | 0.004 | | 1.41(1.24–1.61) | < 0.001 |
>=70 | 1.12(1.02–1.23) | 0.016 | | 1.85(1.67–2.05) | < 0.001 |
Race | | | | | |
White | Reference | | | Reference | |
Black | 0.93(0.88–0.99) | 0.015 | | 0.88(0.83–0.94) | < 0.001 |
Asian or Pacific Islander | 1.06(0.96–1.16) | 0.237 | | 1.13(1.02–1.25) | 0.017 |
American Indian or Alaska Native | 1.05(0.74–1.50) | 0.774 | | 0.88(0.59–1.29) | 0.499 |
Marital | | | | | |
Married | Reference | | | Reference | |
Single | 0.97(0.91–1.04) | 0.397 | | 0.98(0.91–1.05) | 0.523 |
Divorced or Separated | 0.91(0.84–0.99) | 0.031 | | 0.93(0.85–1.01) | 0.097 |
Other | 0.91(0.79–1.06) | 0.229 | | 0.98(0.83–1.15) | 0.805 |
Gleason Patterns Clinical | | | | | |
group1 | Reference | | | Reference | |
group2 | 0.24(0.22–0.25) | < 0.001 | | 0.20(0.18–0.21) | < 0.001 |
group3 | 0.14(0.13–0.16) | < 0.001 | | 0.11(0.10–0.12) | < 0.001 |
group4 | 0.21(0.20–0.23) | < 0.001 | | 0.15(0.14–0.16) | < 0.001 |
PSA | | | | | |
< 4 | Reference | | | Reference | |
4 ~ 10 | 1.08(1.00–1.15) | 0.037 | | 1.30(1.20–1.40) | < 0.001 |
11 ~ 19 | 1.33(1.22–1.46) | < 0.001 | | 2.09(1.89–2.30) | < 0.001 |
>=20 | 1.50(1.34–1.67) | < 0.001 | | 2.70(2.40–3.04) | < 0.001 |
Positive | | | | | |
< 5 | Reference | | | Reference | |
5 ~ 9 | 0.85(0.81–0.89) | < 0.001 | | 1.06(0.99–1.14) | 0.113 |
>=10 | 0.93(0.86–1.00) | 0.042 | | 1.25(1.13–1.38) | < 0.001 |
Examined | | | | | |
< 10 | Reference | | | Reference | |
>=10 | 0.83(0.78–0.89) | < 0.001 | | 0.90(0.83–0.98) | 0.013 |
Percent | | | | | |
< 33% | Reference | | | Reference | |
>=33% | 0.92(0.88–0.96) | < 0.001 | | 1.25(1.16–1.34) | < 0.001 |
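Table 2 reports odds ratios with 95% confidence intervals. The sketch below shows how such values relate to the underlying logistic regression coefficients. It assumes Wald intervals, that is, OR = exp(β) and CI = exp(β ± 1.96·SE). The paper does not state how its intervals were computed, and the function names are illustrative only.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval

def odds_ratio_ci(beta, se, z=Z_95):
    """Turn a logistic coefficient and its standard error into (OR, lower, upper)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

def coef_from_or(odds_ratio, lower, upper, z=Z_95):
    """Recover an approximate (beta, se) from a reported OR and its 95% CI.

    Assumes a Wald interval that is symmetric on the log-odds scale.
    """
    beta = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se

# Multivariate row for PSA >= 20 in Table 2: OR 2.70 (2.40-3.04).
beta, se = coef_from_or(2.70, 2.40, 3.04)
or_, lo, hi = odds_ratio_ci(beta, se)
print(f"beta={beta:.3f}, se={se:.3f}, OR={or_:.2f} ({lo:.2f}-{hi:.2f})")
```

Because the table values are rounded to two decimals, the round trip reproduces the published interval only approximately.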
Comparison and Validation of Model Performance
Five supervised ML algorithms (LR, GBM, NNET, RF, and XGB) were used to build binary classifiers predicting post-RP pathological GGU status (upgraded vs. not upgraded). All five models performed consistently, with training-group AUC values above 0.7: 0.722, 0.726, 0.727, 0.703, and 0.728, respectively (Fig. 2A). Although the differences were small, the XGB model achieved the highest AUC in both the training and internal validation groups (tied with NNET in the latter). In the training group, its sensitivity, specificity, and AUC were 0.614, 0.760, and 0.728, respectively. In the internal validation group, the XGB model maintained consistent performance, with a sensitivity of 0.622, a specificity of 0.767, and an AUC of 0.741 (Table 3).
Table 3
Predictive performance of each model.
| Training group | | | Internal validation group | | |
| Sensitivity | Specificity | AUC | Sensitivity | Specificity | AUC |
LR | 0.594 | 0.774 | 0.722 | 0.603 | 0.781 | 0.736 |
GBM | 0.627 | 0.746 | 0.726 | 0.635 | 0.752 | 0.739 |
NNET | 0.617 | 0.755 | 0.727 | 0.626 | 0.763 | 0.741 |
RF | 0.573 | 0.790 | 0.703 | 0.577 | 0.791 | 0.712 |
XGB | 0.614 | 0.760 | 0.728 | 0.622 | 0.767 | 0.741 |
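For reference, the three metrics in Table 3 can be computed from predicted probabilities as sketched below. The toy labels and probabilities are made up for illustration, and the 0.5 classification threshold is an assumption, since the paper does not report the cut-off used for sensitivity and specificity. The AUC is computed as the fraction of upgraded/non-upgraded pairs in which the upgraded case receives the higher score, counting ties as one half.

```python
def sensitivity_specificity(y_true, y_prob, threshold=0.5):
    """Sensitivity and specificity at a given probability threshold."""
    tp = fn = tn = fp = 0
    for y, p in zip(y_true, y_prob):
        predicted_positive = p >= threshold
        if y == 1:
            tp += predicted_positive
            fn += not predicted_positive
        else:
            fp += predicted_positive
            tn += not predicted_positive
    return tp / (tp + fn), tn / (tn + fp)

def auc(y_true, y_prob):
    """AUC as the probability that a positive case outranks a negative one."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: 1 = upgraded, 0 = not upgraded.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_prob = [0.9, 0.7, 0.4, 0.3, 0.6, 0.2, 0.1, 0.8]

sens, spec = sensitivity_specificity(y_true, y_prob)
print(sens, spec, auc(y_true, y_prob))
```

The pairwise AUC is quadratic in the sample size, so it suits small illustrations like this one rather than a cohort of tens of thousands of patients.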
Calibration curves and DCA were constructed for the five models (Figs. 2B, C, E, F). The calibration curves fit well for all models except RF, which deviated more markedly from the 45-degree line. In the DCA, the y-axis represents the net benefit, which indicates whether a given clinical decision strategy does more good than harm, and each point on the x-axis represents the threshold probability used to classify patients as having or not having GGU. All models achieved a net clinical benefit compared with the treat-all and treat-none strategies, and the XGB model appeared to have the highest net benefit across the range of threshold probabilities, particularly at risk thresholds below 80%. We also compared the ML models with the nomograms. The ROC curves for the training and internal validation groups showed that the XGB model had a slightly higher AUC than the nomograms (training group: 0.728 vs. 0.722; internal validation group: 0.741 vs. 0.736), as shown in Fig. 3. Finally, we validated the XGB model in the external validation group and constructed ROC, DCA, and calibration curves. DCA indicated that the XGB model retained a higher net benefit at risk thresholds of 70–90% (Fig. 4).
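The net benefit plotted on the DCA y-axis is conventionally defined as TP/n − (FP/n) × pt/(1 − pt), where pt is the threshold probability. The sketch below implements that standard definition together with the treat-all reference strategy. It is illustrative only: the toy data are made up, the function names are hypothetical, and the paper does not describe its own DCA implementation.

```python
def net_benefit(y_true, y_prob, pt):
    """Net benefit of treating patients whose predicted risk is >= pt."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= pt and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= pt and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - fp / n * pt / (1 - pt)

def net_benefit_treat_all(y_true, pt):
    """Net benefit of treating everyone, regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * pt / (1 - pt)

# Hypothetical toy data: 1 = upgraded, 0 = not upgraded.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_prob = [0.9, 0.7, 0.4, 0.3, 0.6, 0.2, 0.1, 0.8]

for pt in (0.2, 0.5, 0.8):
    print(pt, net_benefit(y_true, y_prob, pt), net_benefit_treat_all(y_true, pt))
```

The treat-none strategy has a net benefit of zero at every threshold, so a model adds value at a given threshold only when its curve lies above both zero and the treat-all curve.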
Development of a Web-Based Predictor for GGU in Prostate Cancer Patients
Finally, based on the XGB model, we developed a web-based predictor of GGU in PCa patients (Fig. 5; https://alao-riskmodel.shinyapps.io/workrun7/), providing clinicians with a practical tool for assessing the risk of post-RP GGU. Because it relies on readily available clinical data, the predictor allows clinicians to quickly estimate a patient's likelihood of GGU in routine practice, which may support personalized treatment planning, help reduce unnecessary interventions, and contribute to better patient outcomes and resource allocation in PCa management.