Design and sample
The present work is a cohort study based on the previously described ambispective cohort (n=464) of patients hospitalized for COVID-19 in the hospitals of the Consorci Sanitari de l'Alt Penedès i Garraf (CSAPG) (12). The CSAPG includes three second-level hospitals with a total of 457 hospital beds, including seven intensive-care beds (extended to 24 beds at the peak of the epidemic) and 182 intermediate-care beds. Its territorial scope includes an area of Barcelona with a reference population of 247,357 inhabitants.
For this study, patients aged 80 or older who were admitted for respiratory infection associated with COVID-19 and with pharyngeal, nasal, or sputum smears positive for SARS-CoV-2 (real-time polymerase chain reaction [RT-PCR]) were included. All such patients hospitalized through the emergency department between March 12 and May 2, 2020 were recruited and followed until hospital discharge or death. Patients with a positive COVID-19 test but without clinical or radiological respiratory involvement and patients with compatible respiratory symptoms who were treated as COVID-19 patients during admission but with negative smears (“COVID-19 clinical”) were excluded. Also excluded were patients who, despite meeting the diagnostic inclusion criteria, were not admitted to a hospitalization unit (for example, due to death in the emergency department or transfer to a tertiary referral centre). In our case, there was no need to transfer patients to other centres of the same level for lack of hospital beds.
Patients were selected from the daily hospitalization census. This census included the admission diagnosis of each patient and a flag identifying the patients for whom an RT-PCR test for SARS-CoV-2 had been requested.
No a priori sample size calculation was performed. We included all eligible patients who met the inclusion criteria.
Variables and information collection
Information on the variables was collected from the computerized medical records (GoWin program, version 2.4.0). The interviewers (the COVID-19 research group of the CSAPG [30 people]) began data collection on April 6 and continued until the discharge or death of the last patient recruited. The information was collected with the help of two data collection notebooks (the first for the baseline assessment and the second for the follow-up) created with OpenClinica, version 3.14 (Copyright © OpenClinica LLC and collaborators, Waltham, MA, USA). Training sessions for data collection were held by the coordinating researcher of the study, and the quality control process included the review of at least 20% of the data of the main variables of the study to verify their agreement with the source document. If necessary, retraining and supervision sessions were held.
In the baseline assessment, sociodemographic data, comorbidities, previous pharmacological treatments, and clinical presentation were collected from the emergency assessment data. The data on comorbidities and previous pharmacological treatments were collected after reviewing all the medical reports available in the computerized clinical history. These data were recorded categorically (yes/no) from a predetermined list prepared by the researchers (Table 1).
The clinical presentation variables were collected from the emergency department medical report and included symptoms and signs (categorically recorded from a predetermined list), oxygen saturation, pulmonary radiological involvement (number of affected lung quadrants, range 0-4), and the level of C-reactive protein (CRP) (hereinafter “emergency CRP”).
During each day of follow-up, the following variables were collected: hospital discharge, oxygenation system (nasal cannulas, mask, non-rebreather mask, noninvasive mechanical ventilation, orotracheal intubation), and death. For the present study, the laboratory parameters of the first day of hospitalization were also considered, which were extracted automatically by the Department of Informatics to avoid manual registration errors.
The variables considered potentially predictive were those collected in the baseline assessment and the laboratory parameters of the first day of hospitalization. There were two outcome variables, mortality and severe disease, both of which were assessed on every day of follow-up. Severe disease was defined as the need for oxygen therapy with a reservoir mask, mechanical ventilation (invasive or noninvasive), or high-flow nasal cannulas.
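As an illustration only, the definition of severe disease can be expressed as a rule applied to the daily oxygenation records. The sketch below is written in Python rather than the software used in the study. The category labels are hypothetical and do not reproduce the coding of the data collection notebooks, and it assumes that the reservoir mask of the definition corresponds to the non-rebreather mask of the daily follow-up list.

```python
# Hypothetical labels for the daily oxygenation system; the actual coding
# used in the study's data collection notebooks may differ.
SEVERE_SYSTEMS = {
    "reservoir_mask",            # assumed equivalent to the non-rebreather mask
    "noninvasive_ventilation",
    "invasive_ventilation",      # orotracheal intubation
    "high_flow_nasal_cannula",
}


def had_severe_disease(daily_systems):
    """Return True if any follow-up day recorded a system meeting the definition."""
    return any(system in SEVERE_SYSTEMS for system in daily_systems)
```

Under this rule, a patient is classified as having severe disease as soon as one qualifying system appears on any day of follow-up.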
Regarding the predictor–outcome variable association, age and sex were considered a priori as potential confounding variables and/or effect modifiers of all other variables evaluated.
The data collection notebooks with the complete lists of variables are available in supplementary material 1 and 2.
Statistical analysis
For the analysis of the prognostic factors of death and severe disease, the potential predictor variables were grouped into four blocks: age–sex–comorbidity (block 1); previous pharmacological treatment (block 2); variables of clinical presentation, including pulmonary radiological involvement and CRP in the emergency room (block 3); and variables of laboratory parameters (block 4).
Within each block and for each outcome variable, a bivariate analysis was performed with each predictor variable (chi-squared or Fisher’s exact test for categorical variables, Student's t-test or the Mann-Whitney test for quantitative variables), and a multivariate model was built using logistic regression, except for block 3 (in this block, it was considered more relevant to evaluate the individual predictive capacity of each parameter).
In the bivariate analysis and given the multiplicity of analyses performed, the statistical significance was adjusted by the false discovery rate (FDR) method (13).
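The text does not specify which FDR procedure was applied. A minimal sketch of the adjustment, assuming the widely used Benjamini-Hochberg step-up procedure, is given below in Python for illustration (the function name is hypothetical, and this is not the code used in the study).

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

Each p-value is multiplied by the number of tests divided by its rank, and the result is capped by the adjusted value of the next larger p-value, so that an association is declared significant only if its adjusted value falls below the chosen threshold.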
In all the planned multivariate models, age and sex were included, given their status as potential confounding variables. The variables with significant associations (unadjusted p<0.05) in the bivariate analysis were preselected for the models. As the primary objective was the identification of prognostic factors with high associative strength, the least absolute shrinkage and selection operator (LASSO) method was used for the final selection of the variables to be included in the models. The LASSO method (14) is not based on p-values (which could induce the inclusion of superfluous clinical variables in the final model) but on a penalized modification of least squares estimation. Its objective is to select a smaller subset of explanatory variables (but with greater strength of association) with which to fit the final model without a significant loss of explanatory quality. This procedure is considered superior to eliminating predictor variables according to their p-values. Variables with more than 30% missing values were excluded from the multivariate models, as were those with 15 or fewer individuals with the evaluated condition. Finally, based on the results of the bivariate analysis and to avoid collinearity, creatinine was excluded from the models when it was preselected together with urea.
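To illustrate how the LASSO penalty performs variable selection, the following Python sketch fits an L1-penalized logistic regression by proximal gradient descent. It is not the implementation used in the study (the analyses were run in R and SPSS, and the fitting algorithm is not described). The function names, the penalty weight lam, the step size, and the number of iterations are all assumptions chosen for illustration.

```python
import math


def soft_threshold(value, threshold):
    """Shrink value toward zero; anything within the threshold becomes exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def lasso_logistic(X, y, lam, step=0.1, n_iter=2000):
    """Fit L1-penalized logistic regression; returns (intercept, coefficients).

    X is a list of rows of (already scaled) predictors, y a list of 0/1
    outcomes, and lam the penalty weight. The intercept is not penalized.
    """
    n, k = len(X), len(X[0])
    intercept, beta = 0.0, [0.0] * k
    for _ in range(n_iter):
        # Predicted probability minus observed outcome for every patient.
        residuals = [
            sigmoid(intercept + sum(b * x for b, x in zip(beta, row))) - target
            for row, target in zip(X, y)
        ]
        intercept -= step * sum(residuals) / n
        for j in range(k):
            grad = sum(r * row[j] for r, row in zip(residuals, X)) / n
            # Gradient step followed by the shrinkage step that creates exact zeros.
            beta[j] = soft_threshold(beta[j] - step * grad, step * lam)
    return intercept, beta
```

The shrinkage step sets the coefficient of a weakly associated predictor to exactly zero, which removes it from the model, whereas strongly associated predictors keep a nonzero (although shrunken) coefficient. This is the sense in which the selection does not depend on p-values.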
Quantitative variables were not categorized. The laboratory parameters were transformed logarithmically to improve their fit to a normal distribution and were scaled to allow a comparison of their odds ratios (ORs).
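A minimal Python sketch of this transformation is shown below for illustration. It assumes the natural logarithm and scaling to mean 0 and standard deviation 1, neither of which is specified in the text, and it requires strictly positive values. The function name is hypothetical.

```python
import math
import statistics


def log_and_scale(values):
    """Log-transform, then standardize to mean 0 and SD 1; None stays None."""
    logged = [math.log(v) for v in values if v is not None]
    mean = statistics.mean(logged)
    sd = statistics.stdev(logged)  # sample standard deviation
    return [None if v is None else (math.log(v) - mean) / sd for v in values]
```

With this scaling, the OR of each laboratory parameter refers to an increase of one standard deviation on the logarithmic scale, which places parameters measured in different units on a common footing.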
Regarding missing data, when the laboratory parameters of the first day of hospitalization were unavailable, they were imputed from the values of the second day of hospitalization if these were available. No missing data for other variables were imputed.
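The imputation rule amounts to a simple fallback, sketched here in Python for illustration. The function name and the use of None to represent a missing value are assumptions.

```python
def first_day_value(day1, day2):
    """Use the day-1 laboratory value; fall back to day 2 only if day 1 is missing."""
    if day1 is not None:
        return day1
    # May still be None, in which case the value remains missing.
    return day2
```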
R version 3.6.1 (R Project for Statistical Computing) and IBM SPSS version 26 were used.
Ethical approval
The present study was approved by the Ethics Committee of Bellvitge Hospital (act 12/20, PR 252/20, date 25 June 2020), which waived the requirement for informed consent given the observational nature of the study and the anonymous nature of the data collected.