Pathological features and baseline characteristics
The SEER database contained records for 206,968 patients with PDAC who were potentially eligible for inclusion. After the screening steps outlined in Figure 1, 24,044 patients remained eligible. Of these, 15,024 (62.49%) were classified as married and 9,020 (37.51%) as unmarried. Pathological features are detailed in Table 1. In the primary comparison, the married and unmarried cohorts differed significantly in sex, race, TNM stage, and surgery status (all P < 0.001; Table 1).
Table 1. Baseline characteristics of patients with PDAC based on marital status.
| Characteristic | Married (N=15024) | Unmarried (N=9020) | P-value |
| Sex | | | <0.001 |
| Male | 8915 (59.3%) | 3317 (36.8%) | |
| Female | 6109 (40.7%) | 5703 (63.2%) | |
| Age at diagnosis | | | 0.894 |
| <50 years | 925 (6.2%) | 560 (6.2%) | |
| ≥50 years | 14099 (93.8%) | 8460 (93.8%) | |
| Race | | | <0.001 |
| White | 12579 (83.7%) | 6948 (77.0%) | |
| Black | 1093 (7.3%) | 1468 (16.3%) | |
| American Indian/Alaska Native | 64 (0.4%) | 48 (0.5%) | |
| Asian or Pacific Islander | 1288 (8.6%) | 556 (6.2%) | |
| Grade | | | 0.153 |
| Well differentiated | 1602 (10.7%) | 1046 (11.6%) | |
| Moderately differentiated | 7012 (46.7%) | 4188 (46.4%) | |
| Poorly differentiated | 6206 (41.3%) | 3661 (40.6%) | |
| Undifferentiated | 204 (1.4%) | 125 (1.4%) | |
| TNM stage (6th) | | | <0.001 |
| I | 1147 (7.6%) | 800 (8.9%) | |
| II | 7961 (53.0%) | 4526 (50.2%) | |
| III | 1420 (9.5%) | 868 (9.6%) | |
| IV | 4496 (29.9%) | 2826 (31.3%) | |
| Surgery | | | <0.001 |
| Yes | 8401 (55.9%) | 4403 (48.8%) | |
| No | 6623 (44.1%) | 4617 (51.2%) | |
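The group comparisons in Table 1 can be checked approximately from the published counts. The sketch below assumes a Pearson chi-square test, which the text does not state explicitly, and compares the statistic with standard critical values for 1 degree of freedom. The function name `chi_square` is illustrative only.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

# Counts from Table 1; columns are married, unmarried
sex_counts = [[8915, 3317], [6109, 5703]]     # rows: male, female
age_counts = [[925, 560], [14099, 8460]]      # rows: <50 years, >=50 years

CRITICAL_P05_DF1 = 3.841    # chi-square critical value, P = 0.05, 1 df
CRITICAL_P001_DF1 = 10.828  # chi-square critical value, P = 0.001, 1 df

sex_stat, sex_dof = chi_square(sex_counts)
age_stat, age_dof = chi_square(age_counts)
print(f"sex: chi2 = {sex_stat:.1f}, beyond P = 0.001 cutoff: {sex_stat > CRITICAL_P001_DF1}")
print(f"age: chi2 = {age_stat:.3f}, beyond P = 0.05 cutoff: {age_stat > CRITICAL_P05_DF1}")
```

Under this assumption the sex comparison lies far beyond the P = 0.001 cutoff, while the age comparison does not reach the P = 0.05 cutoff, in line with the P-values reported in Table 1.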
The primary comparison assessed the impact of marital status on OS and CSS
In univariate Cox regression analysis, seven variables were significantly associated with both OS and CSS in patients with PDAC: sex, age, race, grade, TNM stage, primary site surgery, and marital status (P < 0.05; Table 2). In multivariate Cox regression analysis, marital status, sex, age, grade, TNM stage, and surgery status remained independent prognostic factors for OS and CSS (P < 0.001; Table 3), whereas race was no longer significant.
Table 2. Univariate analysis to assess the impact of marital status on OS/CSS in PDAC.
| Variable | OS HR (95% CI) | OS P-value | CSS HR (95% CI) | CSS P-value |
| Sex | | | | |
| Male | Reference | | Reference | |
| Female | 0.952 (0.928-0.977) | <0.001 | 0.958 (0.932-0.984) | 0.002 |
| Age at diagnosis | | | | |
| <50 years | Reference | | Reference | |
| ≥50 years | 1.139 (1.078-1.203) | <0.001 | 1.095 (1.036-1.158) | 0.001 |
| Race | | | | |
| White | Reference | | Reference | |
| Black | 1.121 (1.074-1.170) | <0.001 | 1.103 (1.055-1.153) | <0.001 |
| American Indian/Alaska Native | 1.213 (1.005-1.464) | 0.045 | 1.252 (1.035-1.516) | 0.021 |
| Asian or Pacific Islander | 0.966 (0.919-1.016) | 0.178 | 0.980 (0.931-1.031) | 0.432 |
| Grade | | | | |
| Well differentiated | Reference | | Reference | |
| Moderately differentiated | 1.130 (1.080-1.181) | <0.001 | 1.145 (1.093-1.200) | <0.001 |
| Poorly differentiated | 1.590 (1.520-1.664) | <0.001 | 1.629 (1.554-1.707) | <0.001 |
| Undifferentiated | 1.753 (1.558-1.972) | <0.001 | 1.801 (1.586-2.034) | <0.001 |
| TNM stage (6th) | | | | |
| I | Reference | | Reference | |
| II | 1.336 (1.267-1.408) | <0.001 | 1.411 (1.333-1.493) | <0.001 |
| III | 2.277 (2.134-2.430) | <0.001 | 2.472 (2.308-2.648) | <0.001 |
| IV | 4.003 (3.786-4.232) | <0.001 | 4.391 (4.138-4.660) | <0.001 |
| Surgery | | | | |
| No | Reference | | Reference | |
| Yes | 0.315 (0.306-0.324) | <0.001 | 0.303 (0.294-0.312) | <0.001 |
| Marital status | | | | |
| Unmarried | Reference | | Reference | |
| Married | 0.842 (0.820-0.865) | <0.001 | 0.852 (0.829-0.876) | <0.001 |
Table 3. Multivariate analysis to assess the impact of marital status on OS/CSS in PDAC.
| Variable | OS HR (95% CI) | OS P-value | CSS HR (95% CI) | CSS P-value |
| Sex | | | | |
| Male | Reference | | Reference | |
| Female | 0.942 (0.917-0.967) | <0.001 | 0.951 (0.925-0.978) | <0.001 |
| Age at diagnosis | | | | |
| <50 years | Reference | | Reference | |
| ≥50 years | 1.241 (1.175-1.312) | <0.001 | 1.197 (1.132-1.267) | <0.001 |
| Race | | | | |
| White | Reference | | Reference | |
| Black | 1.039 (0.995-1.084) | 0.085 | 1.022 (0.977-1.069) | 0.348 |
| American Indian/Alaska Native | 1.075 (0.890-1.297) | 0.454 | 1.100 (0.909-1.332) | 0.326 |
| Asian or Pacific Islander | 0.969 (0.921-1.018) | 0.212 | 0.979 (0.930-1.031) | 0.429 |
| Grade | | | | |
| Well differentiated | Reference | | Reference | |
| Moderately differentiated | 1.257 (1.202-1.315) | <0.001 | 1.276 (1.218-1.338) | <0.001 |
| Poorly differentiated | 1.585 (1.515-1.659) | <0.001 | 1.618 (1.543-1.697) | <0.001 |
| Undifferentiated | 1.494 (1.328-1.681) | <0.001 | 1.520 (1.346-1.716) | <0.001 |
| TNM stage (6th) | | | | |
| I | Reference | | Reference | |
| II | 1.432 (1.358-1.510) | <0.001 | 1.518 (1.434-1.607) | <0.001 |
| III | 1.447 (1.353-1.548) | <0.001 | 1.551 (1.445-1.665) | <0.001 |
| IV | 2.206 (2.079-2.341) | <0.001 | 2.380 (2.235-2.536) | <0.001 |
| Surgery | | | | |
| No | Reference | | Reference | |
| Yes | 0.392 (0.378-0.407) | <0.001 | 0.382 (0.367-0.397) | <0.001 |
| Marital status | | | | |
| Unmarried | Reference | | Reference | |
| Married | 0.840 (0.817-0.864) | <0.001 | 0.851 (0.826-0.876) | <0.001 |
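The hazard ratios, confidence intervals, and P-values in Tables 2 and 3 are linked on the log scale. As a rough consistency check, the sketch below recovers the standard error of log(HR) from a reported 95% CI and derives a two-sided Wald P-value. It assumes Wald-type intervals, which the text does not confirm, and the helper name `wald_from_ci` is hypothetical.

```python
import math

def wald_from_ci(hr, lower, upper, z_crit=1.959964):
    """Recover the log-HR standard error, z statistic, and two-sided P from a 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided standard normal tail probability
    return se, z, p

# Multivariate OS estimates copied from Table 3
se_married, z_married, p_married = wald_from_ci(0.840, 0.817, 0.864)  # married vs unmarried
se_black, z_black, p_black = wald_from_ci(1.039, 0.995, 1.084)        # Black vs White

print(f"married: z = {z_married:.2f}, P = {p_married:.2e}")
print(f"Black:   z = {z_black:.2f}, P = {p_black:.3f}")
```

Because the published estimates are rounded to three decimals, the recovered P-values only approximate the reported ones. The married estimate remains far below 0.001, and the Black versus White estimate stays above 0.05.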
The secondary comparison assessed the impact of marital status on both OS and CSS
To control for potential confounding variables such as age, sex, and race between the married and unmarried groups, we applied 1:1 propensity score matching. After matching, 8,043 married patients and 8,043 unmarried patients (16,086 in total) were retained. Baseline characteristics were well balanced between the two groups, with no significant differences (all P > 0.05; Table 4; Figure 2).
Table 4. Baseline characteristics of patients with PDAC based on marital status after propensity‑score matching.
| Characteristic | Married (N=8043) | Unmarried (N=8043) | P-value |
| Sex | | | 1 |
| Male | 3248 (40.4%) | 3248 (40.4%) | |
| Female | 4795 (59.6%) | 4795 (59.6%) | |
| Age at diagnosis | | | 0.248 |
| <50 years | 470 (5.8%) | 506 (6.3%) | |
| ≥50 years | 7573 (94.2%) | 7537 (93.7%) | |
| Race | | | 1 |
| White | 6570 (81.7%) | 6570 (81.7%) | |
| Black | 916 (11.4%) | 916 (11.4%) | |
| American Indian/Alaska Native | 25 (0.3%) | 25 (0.3%) | |
| Asian or Pacific Islander | 532 (6.6%) | 532 (6.6%) | |
| Grade | | | 0.997 |
| Well differentiated | 893 (11.1%) | 885 (11.0%) | |
| Moderately differentiated | 3723 (46.3%) | 3724 (46.3%) | |
| Poorly differentiated | 3333 (41.4%) | 3341 (41.5%) | |
| Undifferentiated | 94 (1.2%) | 93 (1.2%) | |
| TNM stage (6th) | | | 0.998 |
| I | 620 (7.7%) | 620 (7.7%) | |
| II | 4148 (51.6%) | 4148 (51.6%) | |
| III | 753 (9.4%) | 760 (9.4%) | |
| IV | 2522 (31.4%) | 2515 (31.3%) | |
| Surgery | | | 1 |
| Yes | 3869 (48.1%) | 3869 (48.1%) | |
| No | 4174 (51.9%) | 4174 (51.9%) | |
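For readers unfamiliar with 1:1 propensity score matching, the following is a minimal sketch of one common variant: greedy nearest-neighbour matching without replacement and with a caliper. The matching algorithm, the caliper of 0.05, the function name, and the scores are all assumptions made for illustration; the study does not report these details.

```python
def match_one_to_one(treated, control, caliper=0.05):
    """Greedy 1:1 nearest-neighbour matching on propensity score, without replacement.

    treated, control: dicts mapping patient id -> propensity score.
    Returns a list of (treated_id, control_id) pairs.
    """
    available = dict(control)
    pairs = []
    # Process treated patients in ascending score order so the result is reproducible
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        c_id = min(available, key=lambda cid: abs(available[cid] - t_score))
        # Accept the nearest control only if it lies within the caliper
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]
    return pairs

# Hypothetical propensity scores, for illustration only
married = {"m1": 0.30, "m2": 0.55, "m3": 0.90}
unmarried = {"u1": 0.32, "u2": 0.50, "u3": 0.58, "u4": 0.10}

pairs = match_one_to_one(married, unmarried)
print(pairs)
```

In this toy example, patient `m3` has no control within the caliper and is dropped, which is the same mechanism by which a matched cohort ends up smaller than the original one.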
|
In the univariate analysis after propensity-score matching, sex, age, grade, TNM stage, surgery, and marital status were all significantly associated with both OS and CSS, whereas for race only the Black versus White comparison reached significance (Table 5). Being married (with unmarried as the reference) remained significantly associated with a lower risk of death (OS: HR = 0.870, 95% CI = 0.842–0.898, P < 0.001; CSS: HR = 0.882, 95% CI = 0.853–0.912, P < 0.001). In the multivariate analysis, all variables entered remained independently significant for OS and CSS except sex (OS: P = 0.069; CSS: P = 0.191). Married status continued to show a protective effect on survival relative to unmarried status (OS: HR = 0.834, 95% CI = 0.808–0.862, P < 0.001; CSS: HR = 0.845, 95% CI = 0.817–0.873, P < 0.001; Table 5). Patients diagnosed before age 50, those with stage I disease or well-differentiated tumors, and those who underwent surgery had better OS and CSS than their respective comparison groups (Table 5).
Table 5. Univariate and multivariate analysis of the impact of marital status on survival outcomes in PDAC.
| Variable | OS univariate HR (95% CI) | P-value | OS multivariate HR (95% CI) | P-value | CSS univariate HR (95% CI) | P-value | CSS multivariate HR (95% CI) | P-value |
| Sex | | | | | | | | |
| Male | Reference | | Reference | | Reference | | Reference | |
| Female | 0.924 (0.894-0.955) | <0.001 | 0.970 (0.939-1.002) | 0.069 | 0.927 (0.897-0.959) | <0.001 | 0.978 (0.945-1.011) | 0.191 |
| Age at diagnosis | | | | | | | | |
| <50 years | Reference | | Reference | | Reference | | Reference | |
| ≥50 years | 1.130 (1.055-1.209) | <0.001 | 1.247 (1.165-1.335) | <0.001 | 1.086 (1.014-1.164) | 0.019 | 1.202 (1.121-1.288) | <0.001 |
| Race | | | | | | | | |
| White | Reference | | / | | Reference | | / | |
| Black | 1.090 (1.036-1.146) | 0.001 | | | 1.081 (1.026-1.140) | 0.004 | | |
| American Indian/Alaska Native | 1.031 (0.777-1.369) | 0.833 | | | 1.052 (0.787-1.405) | 0.732 | | |
| Asian or Pacific Islander | 0.998 (0.935-1.065) | 0.953 | | | 1.020 (0.954-1.090) | 0.560 | | |
| Grade | | | | | | | | |
| Well differentiated | Reference | | Reference | | Reference | | Reference | |
| Moderately differentiated | 1.094 (1.036-1.155) | 0.001 | 1.233 (1.167-1.302) | <0.001 | 1.108 (1.047-1.173) | <0.001 | 1.254 (1.185-1.328) | <0.001 |
| Poorly differentiated | 1.545 (1.462-1.632) | <0.001 | 1.543 (1.460-1.631) | <0.001 | 1.589 (1.501-1.683) | <0.001 | 1.585 (1.496-1.679) | <0.001 |
| Undifferentiated | 1.666 (1.428-1.944) | <0.001 | 1.381 (1.184-1.612) | <0.001 | 1.699 (1.449-1.992) | <0.001 | 1.398 (1.192-1.641) | <0.001 |
| TNM stage (6th) | | | | | | | | |
| I | Reference | | Reference | | Reference | | Reference | |
| II | 1.321 (1.237-1.410) | <0.001 | 1.414 (1.324-1.510) | <0.001 | 1.403 (1.307-1.506) | <0.001 | 1.509 (1.405-1.620) | <0.001 |
| III | 2.331 (2.150-2.527) | <0.001 | 1.412 (1.298-1.536) | <0.001 | 2.549 (2.340-2.776) | <0.001 | 1.518 (1.389-1.659) | <0.001 |
| IV | 4.054 (3.783-4.344) | <0.001 | 2.188 (2.032-2.356) | <0.001 | 4.473 (4.154-4.816) | <0.001 | 2.367 (2.188-2.561) | <0.001 |
| Surgery | | | | | | | | |
| No | Reference | | Reference | | Reference | | Reference | |
| Yes | 0.312 (0.302-0.323) | <0.001 | 0.387 (0.369-0.406) | <0.001 | 0.299 (0.288-0.310) | <0.001 | 0.375 (0.357-0.394) | <0.001 |
| Marital status | | | | | | | | |
| Unmarried | Reference | | Reference | | Reference | | Reference | |
| Married | 0.870 (0.842-0.898) | <0.001 | 0.834 (0.808-0.862) | <0.001 | 0.882 (0.853-0.912) | <0.001 | 0.845 (0.817-0.873) | <0.001 |
The Kaplan-Meier curves in Figure 3 show that unmarried patients had significantly lower survival than married patients (P<0.001). To further investigate prognosis across unmarried statuses, we divided unmarried patients into separated/divorced, single, and widowed subgroups. As shown in Figure 4, OS and CSS differed significantly among the marital status groups (P<0.001).
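As background for the survival curves, the Kaplan-Meier estimate is a running product of the fraction of at-risk patients who survive each event time. The sketch below is a minimal illustration on hypothetical follow-up data; it is not the code used in the study, and the function name is an assumption.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct event time.

    times: follow-up time per patient; events: 1 = death observed, 0 = censored.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)  # deaths plus censored at time t
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

# Hypothetical follow-up in months, for illustration only
times = [2, 3, 3, 5, 8, 8, 12]
events = [1, 1, 0, 1, 1, 0, 0]

curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"month {t}: S(t) = {s:.3f}")
```

Censored patients contribute to the at-risk count up to their last follow-up but never trigger a drop in the curve, which is why the estimate only changes at observed deaths.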
In the secondary comparison, we used a forest plot to evaluate the impact of each unmarried status relative to married status. As illustrated in Figure 5, separated/divorced patients (OS: aHR = 1.134, 95% CI = 1.082–1.189, P < 0.001; CSS: aHR = 1.119, 95% CI = 1.066–1.175, P < 0.001), single patients (OS: aHR = 1.142, 95% CI = 1.091–1.196, P < 0.001; CSS: aHR = 1.140, 95% CI = 1.087–1.195, P < 0.001), and widowed patients (OS: aHR = 1.319, 95% CI = 1.261–1.377, P < 0.001; CSS: aHR = 1.291, 95% CI = 1.233–1.352, P < 0.001) had poorer survival outcomes than married patients. Widowed patients had the highest risk of death among the three unmarried statuses (Figures 4 and 5).
Machine learning-based outcome prediction in married patients
To explore the factors that influence survival in married patients with PDAC, we used age, sex, race, tumor differentiation, TNM stage, and surgery status as input features for machine learning models predicting 5-year CSS and 5-year OS. The discrimination metrics of the four models are presented in Table 6. Among them, the random forest model showed the highest AUROC for both endpoints. For 5-year CSS, the random forest model achieved an AUROC of 0.734, accuracy of 0.592, recall of 0.552, specificity of 0.806, precision of 0.939, and F1 score of 0.695. For 5-year OS, the corresponding values were 0.795, 0.572, 0.536, 0.940, 0.989, and 0.695. For 5-year OS, the artificial neural network, naïve Bayes, and k-nearest neighbor models followed with AUROCs of 0.788, 0.771, and 0.708, respectively. Receiver operating characteristic (ROC) curves and AUROCs of the four models are displayed in Figure 6.
Table 6. Discrimination tests of four machine learning models for predicting 5-year CSS and 5-year OS.
| Algorithm | AUROC (95% CI) | Accuracy (95% CI) | Recall (95% CI) | Specificity (95% CI) | Precision (95% CI) | F1-score (95% CI) |
| 5-year CSS | | | | | | |
| K-nearest neighbor | 0.670 (0.644-0.695) | 0.736 (0.736-0.736) | 0.789 (0.774-0.805) | 0.449 (0.404-0.494) | 0.886 (0.873-0.899) | 0.835 (0.821-0.849) |
| Artificial neural network | 0.732 (0.707-0.756) | 0.586 (0.586-0.586) | 0.544 (0.525-0.564) | 0.812 (0.777-0.847) | 0.940 (0.928-0.952) | 0.689 (0.671-0.708) |
| Naïve Bayes | 0.725 (0.701-0.749) | 0.566 (0.566-0.566) | 0.519 (0.499-0.538) | 0.821 (0.786-0.855) | 0.940 (0.928-0.952) | 0.669 (0.649-0.687) |
| Random forest | 0.734 (0.709-0.758) | 0.592 (0.591-0.592) | 0.552 (0.533-0.571) | 0.806 (0.770-0.841) | 0.939 (0.927-0.951) | 0.695 (0.677-0.714) |
| 5-year OS | | | | | | |
| K-nearest neighbor | 0.708 (0.676-0.740) | 0.678 (0.677-0.678) | 0.676 (0.659-0.694) | 0.689 (0.634-0.745) | 0.957 (0.948-0.966) | 0.792 (0.778-0.808) |
| Artificial neural network | 0.788 (0.764-0.812) | 0.570 (0.570-0.571) | 0.535 (0.517-0.554) | 0.929 (0.898-0.960) | 0.987 (0.981-0.993) | 0.694 (0.677-0.711) |
| Naïve Bayes | 0.771 (0.748-0.794) | 0.579 (0.579-0.579) | 0.544 (0.525-0.562) | 0.940 (0.912-0.969) | 0.989 (0.984-0.995) | 0.702 (0.685-0.718) |
| Random forest | 0.795 (0.771-0.818) | 0.572 (0.572-0.572) | 0.536 (0.517-0.555) | 0.940 (0.912-0.969) | 0.989 (0.984-0.994) | 0.695 (0.678-0.712) |
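The F1 score is the harmonic mean of precision and recall, so the F1 column of Table 6 can be cross-checked against the other two columns. The sketch below does this for the 5-year CSS rows; the function name `f1_score` is illustrative, and small discrepancies are expected because the published inputs are rounded to three decimals.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# (precision, recall, reported F1) for 5-year CSS, copied from Table 6
reported = {
    "K-nearest neighbor": (0.886, 0.789, 0.835),
    "Artificial neural network": (0.940, 0.544, 0.689),
    "Naive Bayes": (0.940, 0.519, 0.669),
    "Random forest": (0.939, 0.552, 0.695),
}

differences = {}
for name, (precision, recall, f1) in reported.items():
    computed = f1_score(precision, recall)
    differences[name] = abs(computed - f1)
    print(f"{name}: computed {computed:.3f}, reported {f1:.3f}")
```

All four recomputed values agree with the reported F1 scores to within rounding.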
Table 7. Calibration tests of four machine learning models for predicting 5-year CSS and 5-year OS.
| Algorithm | Brier score | Slope | Intercept |
| 5-year CSS | | | |
| K-nearest neighbor | 0.125 | 0.687 | 0.462 |
| Artificial neural network | 0.118 | 0.971 | -0.051 |
| Naïve Bayes | 0.134 | 0.398 | 0.966 |
| Random forest | 0.118 | 0.991 | 0.022 |
| 5-year OS | | | |
| K-nearest neighbor | 0.080 | 0.601 | 0.852 |
| Artificial neural network | 0.073 | 0.996 | 0.054 |
| Naïve Bayes | 0.106 | 0.304 | 1.498 |
| Random forest | 0.072 | 1.100 | -0.172 |
The calibration curves demonstrated excellent agreement between predicted and observed outcomes (Figure 7). For predicting 5-year CSS, the k-nearest neighbor, artificial neural network, naïve Bayes, and random forest models gave Brier scores of 0.125, 0.118, 0.134, and 0.118, respectively. For 5-year OS, the same models gave Brier scores of 0.080, 0.073, 0.106, and 0.072, respectively (Table 7).
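For reference, the Brier score is the mean squared difference between the predicted probability and the observed binary outcome, so lower values indicate better calibrated and sharper predictions. The sketch below uses hypothetical predictions purely to illustrate the calculation; the function name is an assumption.

```python
def brier_score(probabilities, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(probabilities, outcomes)) / len(outcomes)

# Hypothetical predicted probabilities and observed outcomes, for illustration only
predicted = [0.9, 0.7, 0.2, 0.4]
observed = [1, 1, 0, 0]

score = brier_score(predicted, observed)
uninformative = brier_score([0.5] * 4, observed)  # always predicting 0.5
print(f"model: {score:.3f}, always-0.5 baseline: {uninformative:.3f}")
```

A model that always predicts 0.5 scores 0.25, which gives a rough yardstick for interpreting the values in Table 7.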
The clinical usefulness of the four predictive models was assessed using decision curves and clinical impact curves. The decision curve analysis (DCA; Figure 8) indicated that the random forest model had a greater net benefit than the "treat none" or "treat all" strategies across a threshold probability range of 0.6 to 1.0. The random forest model also showed superior clinical impact compared with the other models. Notably, when the threshold probability was set above 75% (Figure 9), the number of positive cases predicted by the models (i.e., those at high risk) closely matched the number of true-positive cases (i.e., those who actually had high-risk outcomes). Considering all four evaluation metrics, the random forest algorithm performed best for prediction and could offer more precise and systematic treatment guidance and support to married patients with PDAC.
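The net benefit plotted in a decision curve weighs true positives against false positives at a chosen threshold probability. The sketch below shows the standard formula with hypothetical counts; the numbers and function names are illustrative and are not taken from the study.

```python
def net_benefit(true_pos, false_pos, n, threshold):
    """Net benefit of a model at a given threshold probability."""
    return true_pos / n - false_pos / n * threshold / (1 - threshold)

def net_benefit_treat_all(prevalence, threshold):
    """Net benefit of classifying every patient as high risk."""
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)

# Hypothetical cohort: 1000 patients, 80% with the outcome, threshold probability 0.75
n, prevalence, threshold = 1000, 0.8, 0.75
model_nb = net_benefit(true_pos=500, false_pos=50, n=n, threshold=threshold)
treat_all_nb = net_benefit_treat_all(prevalence, threshold)
treat_none_nb = 0.0  # no one is classified as high risk, so no benefit and no harm

print(f"model: {model_nb:.2f}, treat all: {treat_all_nb:.2f}, treat none: {treat_none_nb:.2f}")
```

A model is considered clinically useful at a given threshold when its net benefit exceeds both reference strategies, as in this toy case.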