A total of 16,144 unique patients who visited the polyclinic for DHL between April 1, 2014 and March 31, 2015 were initially included in the dataset. Of these, 6,085 had developed at least one of the complications before the base visit date and were removed from the final dataset. The characteristics of the remaining 10,059 patients used in the study are presented in Table 4.
Table 4
Characteristics and complication rate of patients in the final dataset.
| Characteristic | n = 10,059 |
| --- | --- |
| Demographics | |
| Age (years), mean (SD) | 63.2 (11.3) |
| Male, n (%) | 4131 (41.1) |
| Race, n (%) | |
| • Chinese | 8455 (84.1) |
| • Malay | 635 (6.3) |
| • Indian | 532 (5.3) |
| • Others | 437 (4.3) |
| Medical conditions | |
| Diabetes only, n (%) | 150 (1.5) |
| Hypertension only, n (%) | 1501 (14.9) |
| Hyperlipidaemia only, n (%) | 2223 (22.1) |
| Diabetes & Hypertension, n (%) | 149 (1.5) |
| Diabetes & Hyperlipidaemia, n (%) | 315 (3.1) |
| Hypertension & Hyperlipidaemia, n (%) | 4133 (41.1) |
| Diabetes, Hypertension & Hyperlipidaemia, n (%) | 1588 (15.8) |
| Biomarkers | |
| Body mass index (kg/m²), mean (SD) | 25.2 (4.5) |
| Systolic BP (mmHg), mean (SD) | 129.8 (17.7) |
| Diastolic BP (mmHg), mean (SD) | 70.6 (10.8) |
| Complications within five years after base visit date | |
| Eye complication, n (%) | 1180 (11.7) |
| Foot complication, n (%) | 117 (1.2) |
| Kidney complication, n (%) | 811 (8.1) |
| Macrovascular complication, n (%) | 1119 (11.1) |
| Any DHL^a complication, n (%) | 2590 (25.7) |

^a DHL = Diabetes, Hypertension and Hyperlipidemia.
Patients in the dataset had a mean age of 63.2 ± 11.3 years, and a higher proportion were female (58.9%). The most common condition profile in the cohort was the combination of Hypertension and Hyperlipidemia (41.1%), followed by Hyperlipidemia only (22.1%) and the combination of Diabetes, Hypertension and Hyperlipidemia (15.8%). A total of 2,590 (25.7%) patients in this study cohort developed at least one complication within five years after the base visit, with eye complications (11.7%) being the most common type.
With an initial K value of 5, the patient similarity model achieved an AUROC of 0.688 (0.667 to 0.709) in predicting DHL complications. The grid search yielded a best K value of 10, at which the patient similarity model achieved an AUROC of 0.718 (0.697 to 0.739) (see Table 5). Compared with the other models, the patient similarity model had a higher AUROC than logistic regression (AUROC = 0.695) and a lower AUROC than the support vector machine (AUROC = 0.766) and random forest (AUROC = 0.764) models.
Table 5
Comparison of patient similarity model performance with other models.
| Model | AUROC (95% CI) |
| --- | --- |
| Patient similarity (K = 10) | 0.718 (0.697 to 0.739) |
| Logistic regression | 0.695 (0.672 to 0.718) |
| Random forest | 0.764 (0.744 to 0.784) |
| Support vector machine (kernel = linear) | 0.766 (0.746 to 0.785) |
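The grid search over K described above can be sketched as follows. This is only an illustration on synthetic data, not the study's code: the helper names (`auroc`, `knn_risk`, `grid_search_k`), the candidate K values, the single train/validation split and the unweighted distance are all assumptions, and the sketch does not reproduce the confidence intervals reported in Table 5.

```python
import numpy as np

def auroc(labels, scores):
    """AUROC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    labels = np.asarray(labels)
    scores = np.asarray(scores, dtype=float)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return float((greater + 0.5 * ties) / (len(pos) * len(neg)))

def knn_risk(train_x, train_y, query_x, k):
    """Risk for each query row = fraction of its k nearest training rows with an event."""
    # Squared Euclidean distance between every query row and every training row.
    d2 = ((query_x[:, None, :] - train_x[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1, kind="stable")[:, :k]
    return train_y[nearest].mean(axis=1)

def grid_search_k(train_x, train_y, val_x, val_y, candidates):
    """Score each candidate K by validation AUROC and return the best one."""
    results = {k: auroc(val_y, knn_risk(train_x, train_y, val_x, k)) for k in candidates}
    best_k = max(results, key=results.get)
    return best_k, results

# Synthetic stand-in data (hypothetical; 5 variables instead of the study's 69).
rng = np.random.default_rng(0)
x = rng.normal(size=(400, 5))
y = (x[:, 0] + rng.normal(scale=1.0, size=400) > 0.7).astype(int)
train_x, val_x = x[:300], x[300:]
train_y, val_y = y[:300], y[300:]

best_k, results = grid_search_k(train_x, train_y, val_x, val_y, [5, 10, 15, 20])
print(best_k, results)
```

Because the predicted risk is a fraction out of K, small K values produce many tied scores; the 0.5 credit for ties in `auroc` keeps the comparison across K values fair in this sketch.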
Patient similarity model explainability and interpretability
The patient similarity model was implemented as a web application to allow users to enter details about a new patient and to generate an estimated risk of DHL complications (see Fig. 1).
In terms of explainability, this approach is transparent in how it generates its risk predictions. The first step is to perform a multi-dimensional search across 69 variables, with importance weights applied, to find the ten most similar patients, based on Euclidean distance. The next step is to then aggregate the known outcomes of these ten patients from the database to compute the risk. For example, if four out of the ten patients had a DHL complication, the estimated risk for the new patient would be 40%.
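The two steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than the deployed implementation: the function name `estimate_risk` is hypothetical, the toy cohort uses 3 variables instead of 69, the variables are assumed to be already numerically encoded and scaled, and the importance weights are assumed to multiply the squared differences inside the Euclidean distance.

```python
import numpy as np

def estimate_risk(cohort, outcomes, weights, new_patient, k=10):
    """Return (risk, neighbour_indices) for a new patient.

    cohort:      (n_patients, n_variables) array of encoded, scaled variables
    outcomes:    (n_patients,) array, 1 if the patient had a complication
    weights:     (n_variables,) importance weights (assumed form of weighting)
    new_patient: (n_variables,) array for the patient being assessed
    """
    cohort = np.asarray(cohort, dtype=float)
    outcomes = np.asarray(outcomes, dtype=float)
    weights = np.asarray(weights, dtype=float)
    new_patient = np.asarray(new_patient, dtype=float)

    # Step 1: weighted Euclidean distance from the new patient to every cohort member.
    diffs = cohort - new_patient
    distances = np.sqrt((weights * diffs ** 2).sum(axis=1))
    # Keep the k closest patients (stable sort so ties resolve by row order).
    neighbours = np.argsort(distances, kind="stable")[:k]

    # Step 2: the risk is the fraction of those k patients with a known complication.
    risk = outcomes[neighbours].mean()
    return float(risk), neighbours

# Toy cohort: 20 patients spaced along the first variable.
cohort = np.array([[i, 0.0, 0.0] for i in range(20)])
outcomes = np.array([1, 1, 1, 1] + [0] * 16)   # four of the ten nearest had an event
weights = np.array([1.0, 0.5, 0.5])
new_patient = np.array([0.0, 0.0, 0.0])

risk, neighbours = estimate_risk(cohort, outcomes, weights, new_patient, k=10)
print(f"Estimated risk: {risk:.0%}")  # four of ten neighbours -> 40%
```

Returning the neighbour indices alongside the risk is what makes the prediction traceable: the same indices can be used to look up the individual similar patients behind the estimate.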
In terms of interpretability, for the same example above, the predicted risk can be understood by patients as “based on the ten most similar patients to myself, four in ten of them had a DHL complication within the next 5 years”. Furthermore, with the ability to pinpoint who the ten most similar patients are, healthcare providers can select a particular similar patient to view his/her longitudinal medical history over the subsequent five years. This could be used as a basis for crafting a more compelling narrative to deliver prognostic information.
Case study
To illustrate how the web application can be used, we conducted a mock consultation with a young patient with poorly controlled diabetes (Patient X). We entered the relevant details of Patient X in the web application. Patient X was 40 years old with pre-existing Diabetes, Hypertension and Hyperlipidemia for 4 years, 5 years and 5 years respectively. He had poorly controlled Diabetes with an HbA1c of 10.0%. He was taking metformin (total daily dose [TDD]: 2000 mg), glipizide (TDD: 20 mg), lisinopril (TDD: 20 mg), amlodipine (TDD: 10 mg) and atorvastatin (TDD: 20 mg) (see Fig. 2).
The backend system would identify the top-10 most similar patients from the database of 10,059 patients and display them as a list of anonymised records (see Fig. 3). In this case, among the top-10 most similar patients to Patient X, four had developed a complication. Patient X can interpret this as “of the 10 most similar patients to myself, four had a DHL complication in the next five years.” The attending doctor could use this prognostic information to prompt Patient X to take action to optimize his glycemic control.
Going one step further, the system also allows the attending doctor to select a particular similar patient to generate a timeline. In this case, the attending doctor selects Patient #10845, a 59-year-old with Diabetes, Hypertension and Hyperlipidemia, each for 5 years. Patient #10845 also has poorly controlled Diabetes, with an HbA1c of 10.1%. The timeline shows Patient #10845 starting Insulin Glargine and later increasing the dose, eventually achieving good glycemic control and staving off all complications (see Fig. 4). Using this timeline information, the attending doctor would be able to craft a case-based narrative to recommend that Patient X start Insulin Glargine to achieve glycemic control. Conversely, the attending doctor can select a patient who has developed a complication to present an adverse scenario to alert Patient X.