The data for this study were taken from a total of 8,884 inpatients who were admitted to the ICU of the First Medical Center of the People's Liberation Army General Hospital from January 2008 to June 2017. Of these patients, 1,128 died in the hospital. A total of 23,902 items of hospitalization medical information were collected from all hospitalization data. The drug codes were mapped to the hospital codes, giving a total of 2,570 items. The protocol was approved by the Hospital Committee on Ethics of the Chinese PLA hospital (S2020-141-01).
We identified the patients who died in the hospital by screening patient outcome data; the remaining patients were classified as survivors. We then sorted all patients' discharge times and the start and stop times of their medical orders. We set the three days before each patient's discharge time as the study time interval and retrieved all medical orders for drugs administered during that interval, including the names, doses, dose units, and frequency of administration of the drugs. For each patient, the doses of each drug used during the three days before discharge were summed to give a total dose.
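The windowing and summation step described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes each medical order has already been expanded into one record per administration, and the function name `total_doses_before_discharge`, the tuple layout, and the dates are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def total_doses_before_discharge(orders, discharge_time, window_days=3):
    """Sum the dose of each drug given in the `window_days` before discharge.

    `orders` is an iterable of (drug_key, administration_time, dose) tuples.
    This record layout is an assumption made for illustration.
    """
    window_start = discharge_time - timedelta(days=window_days)
    totals = defaultdict(float)
    for drug_key, given_at, dose in orders:
        # Keep only administrations inside the study time interval.
        if window_start <= given_at <= discharge_time:
            totals[drug_key] += dose
    return dict(totals)


# Hypothetical records for one patient (values are illustrative only).
dopamine = "Dopamine injection & intravenous infusion & mg"
epinephrine = "Epinephrine hydrochloride injection & intravenous infusion & mg"
discharge = datetime(2017, 6, 10, 12, 0)
orders = [
    (dopamine, datetime(2017, 6, 9, 8, 0), 200.0),
    (dopamine, datetime(2017, 6, 10, 8, 0), 200.0),
    (dopamine, datetime(2017, 6, 1, 8, 0), 200.0),  # outside the window
    (epinephrine, datetime(2017, 6, 8, 20, 0), 10.0),
]
totals = total_doses_before_discharge(orders, discharge)
```

Only the administrations that fall inside the three-day interval contribute to the totals; the earlier dopamine record is ignored.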
3. Data analysis
1. Medication vectorization
The drugs administered to each patient are expressed in the following manner. We used the total doses of the administered drugs, which can be expressed as continuous variables, as the attribute values.
Patient ID|Drug Name & Route of Administration & Dose Unit|Total Dose Administered
Some examples are listed in Table 1.
Table 1
Expressions of drugs administered to each patient
Patient ID|Drug Name & Route of Administration & Dose Unit|Attribute values
10286215|Dopamine injection & intravenous infusion & mg|400
10286215|Epinephrine hydrochloride injection & intravenous infusion & mg|10
10286215|Metaraminol bitartrate injection & intravenous infusion & mg|10
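As a rough sketch of how rows in the format of Table 1 might be turned into per-patient vectors, the snippet below groups rows by patient and then lays each patient's doses out over a shared drug vocabulary. The helper names `build_patient_vectors` and `to_dense` are hypothetical, and filling absent drugs with zero is an assumption.

```python
def build_patient_vectors(rows):
    """Group (patient_id, drug_key, total_dose) rows into one sparse vector per patient."""
    vectors = {}
    for patient_id, drug_key, total_dose in rows:
        patient = vectors.setdefault(patient_id, {})
        patient[drug_key] = patient.get(drug_key, 0.0) + total_dose
    return vectors


def to_dense(sparse_vector, vocabulary):
    """Lay a sparse vector out over a fixed drug vocabulary (missing drugs become 0)."""
    return [sparse_vector.get(key, 0.0) for key in vocabulary]


# The three example rows from Table 1.
rows = [
    ("10286215", "Dopamine injection & intravenous infusion & mg", 400.0),
    ("10286215", "Epinephrine hydrochloride injection & intravenous infusion & mg", 10.0),
    ("10286215", "Metaraminol bitartrate injection & intravenous infusion & mg", 10.0),
]
vectors = build_patient_vectors(rows)
vocabulary = sorted({drug_key for _, drug_key, _ in rows})
dense = to_dense(vectors["10286215"], vocabulary)
```

Using one shared, ordered vocabulary keeps every patient vector and every group vector aligned on the same drug attributes.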
2. Classification and calculation
AdaBoosting
We used the Weka program for AdaBoosting calculations. The samples were used only for training, and the parameters took the default values of the Weka program.
Pearson correlation coefficient
The following formula was used to calculate the correlation coefficient:
$$r=\frac{\sum_{i=1}^{n}\left(X_{i}-\bar{X}\right)\left(Y_{i}-\bar{Y}\right)}{\sqrt{\sum_{i=1}^{n}\left(X_{i}-\bar{X}\right)^{2}}\sqrt{\sum_{i=1}^{n}\left(Y_{i}-\bar{Y}\right)^{2}}}$$
The drug attributes of each group were combined, and we summed the attribute values in the same group to obtain the total vectors of the survival group and the death group. The Pearson correlation coefficients were calculated between each individual case vector and the survival group and death group vectors separately. If the Pearson correlation coefficient between the case vector and the survival group vector was greater than that between the case vector and the death group vector, we judged the patient to be a survivor; otherwise, the patient was estimated to have died.
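The decision rule above can be illustrated with a small pure-Python sketch. The function names, the toy vectors, and the choice to return 0 when a vector has zero variance are assumptions made for illustration; they are not taken from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    denom = math.sqrt(var_x) * math.sqrt(var_y)
    # Assumption: treat a constant vector as uncorrelated instead of failing.
    return cov / denom if denom else 0.0


def group_vector(case_vectors):
    """Sum the attribute values of all cases in a group, drug by drug."""
    return [sum(column) for column in zip(*case_vectors)]


def classify_by_pearson(case, survival_vec, death_vec):
    """Label a case by whichever group vector it correlates with more strongly."""
    if pearson(case, survival_vec) > pearson(case, death_vec):
        return "survivor"
    return "death"


# Toy data over three drug attributes (illustrative values only).
survival_vec = group_vector([[10.0, 0.0, 0.0], [20.0, 5.0, 0.0]])
death_vec = group_vector([[0.0, 10.0, 400.0], [0.0, 10.0, 200.0]])
label_a = classify_by_pearson([0.0, 10.0, 400.0], survival_vec, death_vec)
label_b = classify_by_pearson([15.0, 2.0, 0.0], survival_vec, death_vec)
```

Ties fall to the death label here, which follows the "if not" branch of the rule as stated in the text.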
Observed to expected ratio-weighted cosine similarity
The observed to expected ratio was calculated from a four-cell (2 × 2) table [5], which is shown in the supplementary file.
$$\text{Observed to expected ratio}=\frac{a\times \left(a+b+c+d\right)}{\left(a+b\right)\times \left(a+c\right)}$$
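A short numeric sketch of this ratio follows. The meaning of the cells a, b, c, and d is defined in the supplementary file, which is not reproduced here; the example assumes the conventional layout in which a counts the cases where the drug and the outcome of interest occur together. The function name and the counts are hypothetical.

```python
def observed_to_expected(a, b, c, d):
    """Observed to expected ratio from the four cells of a 2 x 2 table.

    Computes a * (a + b + c + d) / ((a + b) * (a + c)), as in the formula above.
    """
    total = a + b + c + d
    denominator = (a + b) * (a + c)
    if denominator == 0:
        raise ValueError("ratio is undefined when a + b or a + c is zero")
    return a * total / denominator


# Illustrative counts only: the drug co-occurs with the outcome more often than expected.
ratio_enriched = observed_to_expected(30, 10, 20, 40)

# Counts chosen so that the observed count equals the expected count.
ratio_neutral = observed_to_expected(20, 20, 30, 30)
```

A ratio above 1 means the drug appears with the outcome more often than the marginal totals alone would predict, and a ratio of 1 means no enrichment.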
Within the survival group and the death group, the drug attributes were combined and the attribute values were summed. The summed attribute values were then multiplied by the observed to expected ratio to give the vector values of the two groups. The cosine similarities were calculated between each individual case vector and the survival group and death group vectors separately. The formula is as follows:
$$\cos=\frac{\sum_{n} x_{n}y_{n}}{\sqrt{\sum_{n} x_{n}^{2}}\,\sqrt{\sum_{n} y_{n}^{2}}}$$
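The weighting and similarity steps can be sketched as below. Several points are assumptions rather than details confirmed by the text: that one observed to expected ratio is available per drug and per group, that a case is assigned to the group with the larger cosine similarity (mirroring the Pearson rule), and that a zero-length vector yields a similarity of 0. All names and values are hypothetical.

```python
import math


def cosine(x, y):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    # Assumption: a zero vector is treated as having similarity 0.
    return dot / norm if norm else 0.0


def weighted_group_vector(case_vectors, oe_ratios):
    """Sum a group's attribute values per drug, then scale each by its O/E ratio."""
    summed = [sum(column) for column in zip(*case_vectors)]
    return [value * ratio for value, ratio in zip(summed, oe_ratios)]


def classify_by_cosine(case, survival_vec, death_vec):
    """Assumed rule: pick the group whose weighted vector is more similar to the case."""
    if cosine(case, survival_vec) > cosine(case, death_vec):
        return "survivor"
    return "death"


# Toy data over two drug attributes (illustrative values only).
survival_vec = weighted_group_vector([[10.0, 0.0], [20.0, 5.0]], [1.5, 0.5])
death_vec = weighted_group_vector([[0.0, 400.0], [5.0, 200.0]], [0.5, 2.0])
label_a = classify_by_cosine([0.0, 300.0], survival_vec, death_vec)
label_b = classify_by_cosine([25.0, 1.0], survival_vec, death_vec)
```

Because cosine similarity depends only on direction, the weighting changes the result by tilting each group vector toward the drugs that are over-represented in that group.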