Model building and internal validation
Development cohort
The M@tric database is a large, high-quality multicenter database containing data from all adult medical, surgical, and cardiac surgery patients admitted to three academic hospitals in Belgium (University Hospitals Antwerp, Ghent, and Leuven) [24]. Burn patients were not included in the M@tric database. The database contains detailed de-identified data on 34,380 ICU admissions (status November 4, 2019), collected during the ICU stay, including baseline data, comorbidities, laboratory values, high-frequency monitoring data, medication, procedures, diagnostic tests, and calculated scores. All patients included in M@tric between January 2013 and December 2015 who were aged ≥18 years and had at least one CrCl24h available were screened for eligibility. Patients receiving renal replacement therapy (RRT) were excluded.
ARC definition and variable selection
CrCl24h was calculated from the 24-h timed urinary volume (UV, mL) collected over one complete ICU day (7 AM-7 AM, 1440 min), the mean urinary creatinine concentration (UCr, mg/dL), and the mean serum creatinine concentration (SCr, mg/dL) over that ICU day, corrected for body surface area (BSA, Du Bois formula):

CrCl24h = [(UCr × UV) / (SCr × 1440)] × [1.73 / (0.007184 × height (cm)^0.725 × weight (kg)^0.425)]

ICU days on which the data necessary to calculate a CrCl24h on the next day were not available were excluded from the final development cohort.
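For illustration, the calculation above can be sketched in Python (a minimal sketch; the function name and the example values below are our own and are not taken from the study):

```python
def crcl24h(ucr_mg_dl, uv_ml, scr_mg_dl, height_cm, weight_kg):
    """24-h creatinine clearance normalized to 1.73 m2 body surface area.

    UCr and SCr in mg/dL, UV in mL collected over 1440 min;
    BSA via the Du Bois formula.
    """
    bsa = 0.007184 * height_cm ** 0.725 * weight_kg ** 0.425  # BSA in m2
    crcl = (ucr_mg_dl * uv_ml) / (scr_mg_dl * 1440.0)         # mL/min
    return crcl * 1.73 / bsa                                   # mL/min/1.73 m2
```

For a hypothetical day with UCr 60 mg/dL, UV 2500 mL, and SCr 0.7 mg/dL in a 175 cm, 70 kg patient, this yields roughly 139 mL/min/1.73 m².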
ARC was defined as a CrCl24h ≥130 mL/min/1.73 m², in accordance with the consensus definition of ARC in the current literature [1, 3]. The predictors used in the development of the ARC predictor were selected based on current literature [1-4], expert consensus, and data availability. The selected predictors were day from ICU admission, age, sex, SCr, urinary output, vasopressor use, mechanically assisted ventilation, comorbidities, trauma, neurotrauma, surgery, cardiac surgery, and sepsis (detailed description in Additional file 1: Table S1).
Model development
The development cohort was divided at random, at the ICU day level, into a training set (80%) and an internal validation set (20%). The ARC predictor was developed by applying generalized estimating equation (GEE) logistic regression with ARC on the next day as outcome (with ICU stay as clustering variable), using backward feature selection on the training set (Additional file 2: detailed model development) [25]. At each step, decision curve analysis (DCA) was performed to evaluate the net benefit of the model in the internal validation set. Net benefit is the proportion of true positives identified by a prediction model, corrected for the proportion of false positives weighted by the odds of the threshold probability. The net benefit of a model should be larger than that of the alternative strategies (i.e., 'all ARC', meaning "assume all days show ARC", or 'none ARC', meaning "assume no days show ARC") over the range of threshold probabilities that would be used in clinical practice. A threshold probability is the predicted probability above which a patient would be classified as showing ARC on the next day [26-28].
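The net-benefit computation underlying DCA can be sketched as follows (a minimal sketch of the standard formulation; the array and function names are our own, and this is not the study's code):

```python
import numpy as np

def net_benefit(y_true, y_prob, pt):
    """Net benefit of a model at threshold probability pt:
    (TP - FP * pt / (1 - pt)) / n."""
    y_true, y_prob = np.asarray(y_true), np.asarray(y_prob)
    pred = y_prob >= pt                      # classify as ARC at threshold pt
    tp = np.sum(pred & (y_true == 1))        # true positives
    fp = np.sum(pred & (y_true == 0))        # false positives
    return (tp - fp * pt / (1 - pt)) / len(y_true)

def net_benefit_all(y_true, pt):
    """'All ARC' strategy: assume every day shows ARC."""
    prev = np.mean(y_true)
    return prev - (1 - prev) * pt / (1 - pt)

# The 'none ARC' strategy has net benefit 0 at every threshold probability.
```

A model shows net benefit at threshold pt only when net_benefit exceeds both net_benefit_all at that threshold and zero.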
Performance of the model was subsequently assessed in the internal validation set. The receiver operating characteristic (ROC) curve and the area under the ROC curve (AUROC) were used to assess discrimination. Calibration was assessed using calibration plots [26]. Furthermore, sensitivity, specificity, negative predictive value, positive predictive value, negative likelihood ratio, and positive likelihood ratio were calculated. The Youden index [29] was estimated to determine the threshold probability at which the sum of sensitivity and specificity is maximized. If DCA showed net benefit at this threshold probability, it was used as the default threshold probability for further assessment of performance. For all performance parameters, bootstrap 95% confidence intervals were calculated.
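The Youden-index threshold and percentile-bootstrap confidence intervals can be sketched as follows (function names and defaults are our own illustrative choices, not the study's implementation):

```python
import numpy as np

def youden_threshold(y_true, y_prob):
    """Threshold probability maximizing the Youden index (sens + spec - 1)."""
    y_true, y_prob = np.asarray(y_true), np.asarray(y_prob)
    best_t, best_j = 0.5, -np.inf
    for t in np.unique(y_prob):
        pred = y_prob >= t
        sens = np.mean(pred[y_true == 1])    # sensitivity at threshold t
        spec = np.mean(~pred[y_true == 0])   # specificity at threshold t
        if sens + spec - 1 > best_j:
            best_j, best_t = sens + spec - 1, t
    return best_t, best_j

def bootstrap_ci(metric, y_true, y_prob, n_boot=1000, seed=42):
    """Percentile bootstrap 95% confidence interval for metric(y_true, y_prob)."""
    rng = np.random.default_rng(seed)
    y_true, y_prob = np.asarray(y_true), np.asarray(y_prob)
    n = len(y_true)
    stats = [metric(y_true[i], y_prob[i])
             for i in (rng.integers(0, n, n) for _ in range(n_boot))]
    return np.percentile(stats, [2.5, 97.5])
```

Note that the study resamples at the ICU-stay (cluster) level via GEE; a faithful bootstrap would resample clusters rather than individual days, which this per-observation sketch omits for brevity.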
External validation
Validation cohort
For external validation, a single-center retrospective study was performed. All adult patients admitted to the ICUs of the University Hospitals Leuven, Belgium, between January 2016 and December 2016 were screened for eligibility. The same inclusion and exclusion criteria as described above for the development cohort were applied.
The data needed to calculate CrCl24h, and the predictors retained in the ARC predictor were retrieved from the clinical patient data management system database (Metavision®; IMD-Soft®, Needham, MA, USA) from the University Hospitals Leuven, and were pseudonymized.
External validation and comparison with ARC score and ARCTIC score
Performance was assessed as described above for the internal validation set, at the same threshold probability.
To compare the ARC predictor with the ARC score, a subset of the validation cohort was selected for which the sequential organ failure assessment (SOFA) score was available, as this is needed to calculate the ARC score. As suggested by Akers et al. [30] and Barletta et al. [18], who evaluated the diagnostic accuracy of the ARC score, a cutoff of 7 or higher was considered a positive prediction of ARC.
For comparison with the ARCTIC score, a subset of the validation cohort with a trauma-related diagnosis on admission was selected, as this score was developed in trauma patients. As suggested by the authors of the ARCTIC score, a cutoff of 6 or higher was considered a positive prediction of ARC [18].
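Both comparator scores reduce to simple cutoff rules, which can be sketched as (function names are our own):

```python
def arc_score_positive(arc_score):
    """ARC score (Akers et al. [30]): a score of 7 or higher predicts ARC."""
    return arc_score >= 7

def arctic_score_positive(arctic_score):
    """ARCTIC score [18]: a score of 6 or higher predicts ARC."""
    return arctic_score >= 6
```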
Ethical approval
M@tric data collection has ethical committee (EC) approval from the University Hospitals Ghent, where the database is hosted (PA 2009/006). Research on the M@tric database requires approval from the EC of one of the contributing centers, which then independently acts as the central EC.
Approval for the present study was obtained from the EC of the University Hospitals Leuven (S61364) for the use of the M@tric dataset, as well as the retrospective Leuven dataset. This research has been performed in accordance with the ethical standards laid down in the 1964 Declaration of Helsinki and its later amendments. The need for a written informed consent was waived.
Statistical analysis
All statistical analyses were performed in R (version 3.5; The R Foundation for Statistical Computing, Vienna, Austria). The two-sided significance level was set at 0.05. Continuous data were presented as median and interquartile range, and categorical data as count and percentage. Sample size was not deemed an issue, as we anticipated a very large number of inclusions and a relatively high number of events. Therefore, all statistical analyses were performed as complete-case analyses.