We registered the protocol with PROSPERO (CRD42019145054) and published it in a peer-reviewed journal [12]. Supplementary Table 1 (Additional File 1) details the differences between the protocol and the review. We reported this study according to the PRISMA statement [13].
Eligibility Criteria
We included studies (experimental or any observational design) that sought to confirm the independent prognostic effect of sex on mortality in critically ill adults with sepsis while controlling for covariates (called phase 2 studies) [14]. We included patients aged 16 years and older with a sepsis diagnosis, as defined by the study authors, treated in an ICU. Studies including both adult and paediatric patients were eligible if adults represented more than 80% of the study sample. Sex and gender are distinct concepts, though often erroneously interchanged in medical research reports [15]. We accepted any assessment of sex as a biological characteristic. When applicable, we also appraised the operational concepts of sex and gender provided by the study authors using the classification detailed in Supplementary Table 2 (Additional File 1) [16]. We pre-specified the following core set of adjustment factors: age, severity score [Sequential Organ Failure Assessment score (SOFA), Simplified Acute Physiology Score II (SAPS II) or Acute Physiology and Chronic Health Evaluation II (APACHE II)], comorbidities (immunosuppression, pulmonary diseases, cancer, liver diseases, or alcohol dependence), non-urinary source of infection, and inappropriate or late antibiotic coverage. The primary outcomes were all-cause hospital mortality (the longest follow-up provided by the study authors) and 28-day all-cause hospital mortality. Secondary outcomes were 7-day all-cause hospital mortality, 1-year all-cause mortality, and all-cause ICU mortality.
Search Strategy And Selection Process
We searched MEDLINE Ovid, Embase Elsevier, and Web of Science for studies published at any time up to 17 July 2020. We based the search string on terms related to the population (sepsis), the outcome (mortality), prognostic study methods [17], and the prognostic factor (sex), for which we used a search string adapted from previous studies [18–21]. We applied no language restrictions. We checked the bibliographic references of the key publications and the included studies for additional relevant studies. We also searched ClinicalTrials.gov and the World Health Organization International Clinical Trials Registry Platform for unpublished and ongoing studies. Furthermore, we handsearched conference proceedings from 2010 to 2019 of the foremost critical care and infectious diseases symposia (see Supplementary Table 3, Additional File 1).
We used the online software EPPI-Reviewer 4 to manage the study selection process [22]. Pairs of review authors (from AA, AVH, MP-A, OMP, RdC, PF) independently screened titles and abstracts and, when appropriate, full texts to determine their eligibility. We used a consensus method and consulted a third author (from AM, BF-F, JL-A, JZ) if disagreement remained.
Data Collection And Risk Of Bias Assessment
Two authors (from AA, ES, OMP) independently extracted data using electronic extraction templates in EPPI-Reviewer 4 and reached consensus. We used the CHARMS-PF (checklist for critical appraisal and data extraction for systematic reviews of prediction modelling studies for prognostic factors) guidance for data collection [23]. For each included study, we extracted the following data: general study characteristics, participant characteristics, sepsis definition, prognostic factor, outcomes assessed, missing data, analysis, and all unadjusted and adjusted estimates of the association between the prognostic factor and each review outcome, with details on any covariate used for the adjusted ones. If a study provided several estimates for the same outcome, each adjusted for different covariates, we extracted the estimate adjusted for the maximum number of covariates from the core set of adjustment factors. We contacted the study authors for clarification. Two authors (from AA, ES, OMP) independently assessed the risk of bias of the included studies and agreed on ratings; a third author (JL-A) participated when required. We applied an outcome-level approach and amended the QUIPS (quality in prognosis studies) tool using four categories (low, moderate, high, or unclear risk) [23–25]. We defined studies controlling for fewer than three of the aforementioned covariates as “minimally adjusted for other prognostic factors or moderate risk”, and those controlling for at least three of these covariates as “adequately adjusted or low risk of bias” for the QUIPS adjustment domain [26]. We assessed selective reporting bias by: 1) searching for a prospective study protocol or registration; 2) dealing with related conference abstracts; and 3) carefully examining the study methods section [24].
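The two counting rules described above (selecting the most-adjusted estimate and rating the QUIPS adjustment domain) can be illustrated with a minimal Python sketch. The covariate labels, the dictionary layout, and the function names are hypothetical and serve only as an illustration; they do not reproduce the extraction templates used in EPPI-Reviewer 4.

```python
# Core adjustment factors pre-specified in the eligibility criteria.
# The string labels are hypothetical shorthand chosen for this sketch.
CORE_FACTORS = frozenset({
    "age",
    "severity_score",
    "comorbidities",
    "non_urinary_source",
    "antibiotic_coverage",
})


def n_core_covariates(covariates):
    """Count how many of the core adjustment factors a model controlled for."""
    return len(CORE_FACTORS & set(covariates))


def most_adjusted(estimates):
    """Pick the estimate adjusted for the most core factors.

    `estimates` is a list of dicts with a "covariates" key (hypothetical layout).
    """
    return max(estimates, key=lambda e: n_core_covariates(e["covariates"]))


def adjustment_domain_rating(covariates):
    """Rate the QUIPS adjustment domain: fewer than three core factors gives
    'moderate' risk, three or more gives 'low' risk."""
    return "low" if n_core_covariates(covariates) >= 3 else "moderate"
```

For example, a model adjusted for age and body mass index only would be rated "moderate", because body mass index is not part of the core set.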
Data Synthesis
For each study and prognostic factor estimate, we extracted the measure of association alongside its confidence interval (CI). We transformed association measures into odds ratios (ORs) with 95% CIs to allow statistical pooling whenever adequate [27]. We did not estimate data from Kaplan-Meier curves because of the risk of overestimation of events and censorship concerns [28]. We presented results consistently, so associations above one indicated a higher mortality for female participants. We pooled estimates in meta-analyses when valid data were available. For the primary analyses, we used estimates from the model that adjusted for the most covariates from the core set of adjustment factors. We evaluated the censoring mechanisms assumed in the studies analysed using time-to-event procedures (i.e., Cox proportional hazard models). We performed random-effects meta-analyses applying the Hartung-Knapp-Sidik-Jonkman (HKSJ) adjustment [29], using RevMan 5.3 (The Cochrane Collaboration, Copenhagen, Denmark) and the template for conversion provided by IntHout [30].
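As an illustration of the pooling step, the following Python sketch derives a log OR and its standard error from a reported 95% CI and then applies a random-effects model with the HKSJ variance. It assumes the DerSimonian-Laird estimator for the between-study variance, which is an assumption of this sketch; it is not the RevMan or spreadsheet implementation used in the review. The critical value of Student's t distribution is passed in by the caller to keep the example free of external libraries.

```python
import math


def log_or_and_se(odds_ratio, lower, upper, z=1.96):
    """Log odds ratio and its standard error from an OR with a 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return math.log(odds_ratio), se


def hksj_pool(y, se, t_crit):
    """Random-effects pooling with the HKSJ variance adjustment.

    y, se  : per-study log odds ratios and their standard errors
    t_crit : two-sided 95% critical value of Student's t with k - 1
             degrees of freedom (e.g. 4.303 for k = 3 studies)
    """
    k = len(y)
    if k < 2:
        raise ValueError("at least two studies are needed")

    # Fixed-effect weights and Cochran's Q.
    w = [1 / s ** 2 for s in se]
    mu_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - mu_fixed) ** 2 for wi, yi in zip(w, y))

    # DerSimonian-Laird between-study variance (assumed estimator).
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Random-effects weights and pooled log OR.
    w_re = [1 / (s ** 2 + tau2) for s in se]
    mu = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)

    # HKSJ variance: weighted residual sum of squares scaled by (k - 1).
    var_hksj = sum(wi * (yi - mu) ** 2 for wi, yi in zip(w_re, y)) / (
        (k - 1) * sum(w_re)
    )
    half_width = t_crit * math.sqrt(var_hksj)

    return {
        "or": math.exp(mu),
        "ci": (math.exp(mu - half_width), math.exp(mu + half_width)),
        "tau2": tau2,
        "se_hksj": math.sqrt(var_hksj),
    }
```

Because the HKSJ interval uses a t distribution with k - 1 degrees of freedom, it is typically wider than the conventional normal-based interval when few studies are pooled.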
We examined heterogeneity by computing prediction intervals when the meta-analysis contained at least three studies [29, 31]. We planned to undertake subgroup analyses based on study design characteristics: cohort studies versus case-control studies, and prospective studies versus retrospective studies. We planned to compare differences between subgroups by performing a test of interaction [32].
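A short Python sketch of these two calculations follows. The prediction interval uses the common formula based on a t distribution with k - 2 degrees of freedom, and the interaction test is written as a z-test on the difference between two subgroup log ORs; both are assumptions about the exact procedures, which the cited references define. Function and parameter names are illustrative.

```python
import math


def prediction_interval(mu, se_mu, tau2, k, t_crit):
    """95% prediction interval for the OR in a new study.

    mu, se_mu : pooled log OR and its standard error
    tau2      : between-study variance
    k         : number of studies (at least three)
    t_crit    : two-sided 95% critical value of Student's t with k - 2
                degrees of freedom (e.g. 12.706 for k = 3 studies)
    """
    if k < 3:
        raise ValueError("a prediction interval needs at least three studies")
    half_width = t_crit * math.sqrt(tau2 + se_mu ** 2)
    return math.exp(mu - half_width), math.exp(mu + half_width)


def interaction_test(mu_a, se_a, mu_b, se_b):
    """z statistic and two-sided p-value for the difference between two
    subgroup estimates on the log OR scale."""
    z = (mu_a - mu_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return z, p
```

The prediction interval adds the between-study variance to the uncertainty of the pooled estimate, so it describes the range of effects expected in a comparable new study rather than the precision of the average effect.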
We conducted sensitivity analyses accounting for the risk of bias. We considered the following QUIPS domains as key domains for the analyses: study attrition, prognostic factor measurement, outcome measurement, and adjustment for other prognostic factors. Firstly, we planned to exclude studies with a high risk of bias in at least one key domain. Secondly, we excluded studies with either a high or moderate risk of bias in at least one key domain. In other sensitivity analyses, we planned to exclude studies that adjusted for a set of adjustment factors entirely different from ours. Additionally, we explored potential differences between meta-analyses based on unadjusted (crude) and adjusted estimates.
We planned to assess publication bias for each meta-analysis including ≥ 10 studies by funnel plot representation and Peters’ test at a 10% level [33].
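The regression underlying Peters' test can be sketched in Python as follows. As commonly described, the test regresses the log OR on the inverse of the total sample size, weighting each study by events × non-events / total; this weighting and the function names are assumptions of the sketch and should be checked against the cited reference. The function returns the slope, its standard error, and the degrees of freedom; the ratio of slope to standard error would then be compared with a t distribution at the 10% level.

```python
import math


def wls_slope(x, y, w):
    """Weighted least-squares fit of y = a + b*x.

    Returns the slope b, its standard error, and the residual degrees of freedom.
    """
    k = len(x)
    if k < 3:
        raise ValueError("at least three points are needed")
    sw = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sy = sum(wi * yi for wi, yi in zip(w, y))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = sw * sxx - sx * sx

    slope = (sw * sxy - sx * sy) / det
    intercept = (sy - slope * sx) / sw

    # Residual variance estimated from the weighted residuals.
    rss = sum(wi * (yi - intercept - slope * xi) ** 2
              for wi, xi, yi in zip(w, x, y))
    sigma2 = rss / (k - 2)
    se_slope = math.sqrt(sigma2 * sw / det)
    return slope, se_slope, k - 2


def peters_regression(log_or, events, totals):
    """Regression of log OR on 1/N with weights events * non-events / N.

    events, totals : deaths and participants per study, both sexes combined
    """
    x = [1 / n for n in totals]
    w = [e * (n - e) / n for e, n in zip(events, totals)]
    return wls_slope(x, log_or, w)
```

A slope far from zero suggests that smaller studies report systematically different effects, which is the small-study pattern that a funnel plot displays graphically.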
Assessing The Certainty Of The Evidence
We assessed the certainty of the evidence using the GRADE (grading of recommendations assessment, development, and evaluation) approach and guidance for prognosis studies (see Supplementary Table 4, Additional File 1) [26, 34–39]. We summarised our results for each outcome in a “Summary of findings” table using the GRADEpro GDT software [37]. We described the results for each prognostic effect estimate considering the certainty of the evidence and its clinical importance (important effect, slight effect, and little or no effect). As we found no well-established clinically important thresholds for prognostic effects, we agreed on an absolute risk difference of at least ± 10‰ as a clinically important difference.
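To illustrate how a pooled OR relates to this threshold, the Python sketch below converts an OR and a baseline risk into an absolute risk difference per 1000 participants and compares it with 10 per 1000 (10‰). Treating the baseline risk as mortality in the reference group (male participants) is an assumption of this sketch, and the numerical values in the usage note are hypothetical.

```python
def risk_difference_per_1000(odds_ratio, baseline_risk):
    """Absolute risk difference per 1000 participants implied by an OR.

    baseline_risk : mortality risk in the reference group, between 0 and 1
    """
    if not 0 < baseline_risk < 1:
        raise ValueError("baseline_risk must lie strictly between 0 and 1")
    # Convert the baseline risk to odds, apply the OR, and convert back.
    comparison_risk = (odds_ratio * baseline_risk) / (
        1 - baseline_risk + odds_ratio * baseline_risk
    )
    return 1000 * (comparison_risk - baseline_risk)


def is_clinically_important(rd_per_1000, threshold=10.0):
    """True when the absolute difference reaches the agreed threshold
    of 10 per 1000 in either direction."""
    return abs(rd_per_1000) >= threshold
```

With a hypothetical baseline mortality of 20% and an OR of 2, the comparison-group risk is 0.4 / 1.2, or about 33.3%, which corresponds to roughly 133 additional deaths per 1000 participants.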