Conducted for the first time on Iranian patients, this study provides three practical prognostic models that use invasive and non-invasive data from the first day of admission to predict COVID-19 mortality. Furthermore, the predictive power of the non-invasive and invasive feature groups was evaluated across time horizons and feature counts, revealing two notable results. Although invasive features are good predictors of imminent death, they are outperformed by non-invasive features for deaths occurring further from admission. Moreover, when the two models use the same number of features, the non-invasive model could provide better performance than the invasive model at higher feature dimensions.
Predicting the eventual outcome of COVID-19 could provide substantial support for decreasing mortality rates. Rapid disease transmission and high patient load could quickly overload healthcare infrastructures, and an overloaded medical system can result in higher mortality rates due to inefficient management of limited medical resources and personnel. This issue was highlighted by a study indicating that 30% of Chinese COVID-19 patients died without receiving ventilator support [4]. Furthermore, strict preventive measures, social isolation, and the distress caused by the diagnosis could activate psychological defensive behaviors in which patients underestimate their symptoms and do not seek immediate medical assistance [15]. This optimistic bias could be fatal if a patient’s condition suddenly deteriorates to a critical stage. Acting as an early warning system, our models could mitigate these issues by providing unbiased, rapid prognosis prediction to support proper resource allocation and decision making.
Time is an important element in the fight against COVID-19. The disease has an unpredictable trajectory in which the condition of some patients suddenly becomes critical [5], surprising even the most skilled physicians; this hampers physicians’ performance by narrowing their window for action. Furthermore, it has been shown that patients who later become critically ill carry significantly higher viral loads even before their condition becomes critical [16]. Thus, rapid isolation of high-risk patients is required to decrease the spread of infection. Our models could alleviate these problems: by providing a prognosis prediction after the first day of a patient’s admission, the time window for clinical actions (e.g., requesting additional ventilators, enrolling patients in clinical trials) could be significantly widened, potentially decreasing mortality rates by enabling a more deliberate approach by physicians.
We developed three predictive models using invasive features, non-invasive features, and both. Our joint model provides rapid, accurate predictions using features that are routinely collected upon patient admission, making it implementable even in conditions where imaging or sophisticated laboratory equipment is unavailable. Our results revealed that non-invasive features displayed good predictive capacity compared with the joint model (Figure 3, panel A). Furthermore, invasive features predicted imminent deaths more accurately, whereas non-invasive features had more predictive power for deaths that occurred further from the admission day (Figure 4, panel A). This difference in prediction range might stem from the fact that invasive and non-invasive biomarkers differ in their temporal dynamics and information content. Many key laboratory features, such as LDH and PTT, are highly dynamic over time: to maintain homeostasis after an insult, these features tend to rise and then return to their normal range within a relatively short time [17]. Furthermore, laboratory abnormalities reflect disruptions only in the body systems to which they are linked, which limits their information content. These factors limit the predictive temporal range of models that are mainly based on laboratory biomarkers. In contrast, many non-invasive features, such as age or the presence of comorbidity, can be seen as signals that contain a significant amount of compressed, less variable data. Physicians, by instinct and training, could decompress these signals to a certain degree. For example, the presence of diabetes is informative of persistent distortions in glucose metabolism, immune response, and vascular function [18]; these distortions are continuously present, to varying degrees, even in patients with proper disease control.
Although laboratory features provide valuable prognostic information, their analysis requires invasive sampling. Many patients are wary of blood sampling [19]. Moreover, high patient load and equipment shortages could hinder the availability and accuracy of blood testing [20]; many tests, such as LDH and blood gas tests, require careful sampling, preservation, and transportation to avoid errors resulting from complications such as sample hemolysis or clotting [21]. In addition, many smaller health centers do not have access to laboratory equipment, and laboratory tests are generally expensive. A study from the United States indicates that, even in the absence of a pandemic state, over 20% of patient medical care was not needed [22]. These rates would most likely increase considerably in a pandemic state; physicians, faced with a novel disease and no coherent guidelines, will request more unnecessary blood tests, patient admissions, and referrals. This unnecessary care will impose a significant financial burden on patients and healthcare systems. Rapid triage of patients is also critical for managing high patient loads [23]. However, an important downside of routine rapid triage in a pandemic situation is the increased mortality rate due to missing high-risk patients [11]. These patients might incorrectly be identified as mild cases and, without further workup, be advised to take a home-treatment approach. Our model using non-invasive features could provide fast, accurate prognosis prediction to augment the initial triage and avoid missing high-risk patients. To facilitate this triage approach, vital signs could be easily measured by wearable medical devices [24], and history data could be easily collected from patients using predefined questions.
Five prominent features were highlighted by NCA analysis in this study: age, SPO2, LDH, AST, and PTT. Previous studies have shown that older age is positively associated with increased mortality in hospitalized COVID-19 patients [25]. Older age is associated with greater susceptibility to infection and an atypical response to viral pathogens due to reduced expression of type I interferon-beta [26]. Furthermore, age-related impairment of lymphocyte function, along with abnormal expression of type 2 cytokines, leads to prolonged pro-inflammatory responses; this weakens the host response to viral replication, causing poor clinical outcomes and higher mortality [27]. In contrast to typical types of pneumonia, the initial phases of COVID-19 have few apparent symptoms, such as dyspnea. This is because carbon dioxide is still exchanged through the alveoli at these stages, whereas oxygen exchange is disturbed due to alveolar collapse. This type of hypoxia, called “silent hypoxia,” leads to the progression of pneumonia in the absence of clinical symptoms [28]. It also causes a vicious cycle, in which hypoxia promotes the activity of the local inflammatory system, causing further damage and worsening hypoxia [29]. Therefore, SPO2 could be a decisive factor in uncovering the progression of pneumonia and the severity of a patient’s condition.
Pulse oximetry, via wearable devices or hospital equipment, could show decreased levels of SPO2 in blood; this is valuable for early detection of hypoxemia. Elevated levels of LDH could reflect tissue injury caused by SARS-CoV-2 and concurrent lung fibrosis. Indeed, abnormal LDH is commonly seen in idiopathic lung fibrosis [8][30]. Furthermore, a robust immune response to SARS-CoV-2 infection and the subsequent cytokine storm could cause multi-organ damage, which causes a further rise in the LDH level [31]. Another organ affected by the cytokine storm is the liver, even though it is not a primary target for SARS-CoV-2. As a result, abnormal levels of liver function biomarkers such as AST could be a manifestation of severe disease and poor outcomes [31]. The inflammatory response promoted by severe SARS-CoV-2 infection could cause endothelial damage, distortion of the coagulation cascade function, and coagulopathy. Therefore, levels of PTT, a coagulation biomarker, during COVID-19 infection can be informative of coagulopathy progression and disease severity [32].
Limitations
The results of this study should be interpreted in light of several limitations. This study was carried out within a retrospective framework. Consequently, it was not possible to supervise data documentation at the time of patient admission to improve its quality. Furthermore, the analysis interval of this study encompassed the first disease wave. Thus, medical records were documented in haste, as high patient loads and limited medical staff forced the medical system to prioritize patient treatment. Researchers were not blinded to outcomes. No external validation data were utilized due to limits imposed by the pandemic state of hospitals and preventive regimes. The Massih Daneshvari Hospital had a higher proportion of severe cases and deaths since it was a primary care center for COVID-19. Finally, qualitative CRP, a feature reported by several studies to be associated with disease severity, was removed from the analysis because of a high proportion of missing values, which resulted from limited laboratory resources and incomplete medical records during the pandemic. However, given the presence of other acute phase reactants and inflammatory markers, such as ESR, platelet number, and LDH, in this model, it is likely that a significant portion of the variance explained by CRP was compensated for by these features.
Future works
To increase speed and convenience, imaging features were not utilized in this study. Future works can compare the predictive power of imaging features with that of laboratory and non-invasive features. This study, conducted as a pilot study, was not externally validated. Future studies could include data from other hospitals for external validation. Future projects can expand the practicality of our study by devising prognosis prediction software on various platforms. In this study, a binary outcome (i.e., discharged or expired) was used. Future projects can focus on other outcomes, such as whether a patient was intubated or admitted to the ICU. To devise more specific prognostic models, future studies can focus on individual groups of comorbidities (e.g., cardiovascular) and develop a separate model for each. Finally, continuous data input from various hospitals could be used to develop and incrementally train an online learning model to predict the prognosis of COVID-19 patients, giving increasingly precise and updated results to be used in clinical and non-clinical settings.
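As a rough illustration of the incremental training idea mentioned above, the following minimal sketch updates a logistic regression model one batch at a time with stochastic gradient descent. It is not the method used in this study: the class `OnlineLogisticRegression`, the helper `synthetic_batch`, the learning rate, and the two toy features (`risk_up` and `risk_down`) are all hypothetical, and the data are synthetic values generated only to make the example runnable.

```python
import math
import random


class OnlineLogisticRegression:
    """Hypothetical logistic regression that learns from one batch at a time."""

    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict_proba(self, x):
        """Return the predicted probability of the positive class for one record."""
        z = self.bias + sum(w * xi for w, xi in zip(self.weights, x))
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() numerically safe
        return 1.0 / (1.0 + math.exp(-z))

    def partial_fit(self, batch):
        """Apply one SGD step per (features, label) pair; earlier batches are not revisited."""
        for x, y in batch:
            error = self.predict_proba(x) - y  # gradient of the log loss w.r.t. z
            self.weights = [
                w - self.learning_rate * error * xi
                for w, xi in zip(self.weights, x)
            ]
            self.bias -= self.learning_rate * error


def synthetic_batch(rng, size):
    """Generate toy records (assumption: NOT patient data).

    Each record has two features scaled to [0, 1]; the outcome probability
    rises with the first feature and falls with the second.
    """
    batch = []
    for _ in range(size):
        risk_up = rng.random()
        risk_down = rng.random()
        p = 1.0 / (1.0 + math.exp(-(4.0 * risk_up - 4.0 * risk_down)))
        label = 1 if rng.random() < p else 0
        batch.append(((risk_up, risk_down), label))
    return batch


rng = random.Random(0)
model = OnlineLogisticRegression(n_features=2)

# Simulate batches arriving over time, e.g. periodic uploads from hospitals.
for _ in range(50):
    model.partial_fit(synthetic_batch(rng, 40))

high_risk = model.predict_proba((0.9, 0.1))
low_risk = model.predict_proba((0.1, 0.9))
print(f"high-risk toy record: {high_risk:.2f}, low-risk toy record: {low_risk:.2f}")
```

Because each call to `partial_fit` touches only the newly arrived batch, the model can be refreshed as new records arrive without storing or retraining on all historical data. A production system would additionally need validation, calibration, and monitoring for shifts in the patient population.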