This protocol closely follows the methods of Doyle et al. (29) and the PRISMA extension statement for the reporting of systematic reviews incorporating NMA (30). We have provided a completed PRISMA-P checklist in Additional file 1. Public and Patient Involvement (PPI) helped us to refine the focus of the research questions and PPI will continue to play a key role in the running of this study. We have reported PPI in the development of this protocol using the short form of the Guidance for Reporting Involvement of Patients and the Public 2 (GRIPP2-SF; 31), which is available in Additional file 2. If any amendments to this protocol are necessary during the review process, we will add details and justifications for these to the registration record and report these in the final systematic review results report.
Eligibility criteria
Study types
Eligible studies will be RCTs of depression interventions administered to patients who are currently or have been previously diagnosed with cancer. The interventions of interest, based on clinical guidelines for the management of depression among patients with cancer and/or chronic physical health problems (19, 20, 32, 33) are pharmacotherapy, psychotherapy, exercise, combination therapies, and collaborative care interventions, as well as CAM approaches. As an outcome measure, studies using any of these interventions should employ a validated depression scale or diagnostic interview able to report a (potential) change in depression or depressive symptoms from baseline or pre- to post-treatment. Psychological interventions that are not established psychotherapies and are not delivered by professionally trained therapists will be excluded from the study. As recommended by Chaimani et al. (34), additional unspecified interventions that surface during the search process may be considered for inclusion in the network if the study meets the eligibility criteria and the inclusion of the intervention could serve to supplement the analysis by, for example, increasing the precision of the results. Studies included will be published (in English) in peer-reviewed journals, review articles or RCT registries.
Participants
Participants will be 18 years of age and over, and have a current or previous diagnosis of any cancer, and be at any stage of treatment (pre-treatment, active treatment, or post-treatment). Participants must be enrolled in an RCT targeting elevated depression as either a primary or secondary outcome, assessed using validated measures of depression symptoms at baseline and post intervention. We will exclude participants if they have (a) a diagnosis other than cancer, (b) antenatal/postnatal depression, (c) bipolar disorder or psychotic depression, or (d) concurrent secondary psychiatric diagnoses. If some, but not all, of a study's participants are eligible for inclusion (e.g., if they include patients with cancer and patients with other diseases), then we will request the data for eligible patients only from the authors or, if >80% of participants are eligible for the review, we will include the overall trial estimates.
Intervention types
We will include the following types of interventions; however, as recommended, unspecified interventions may also be included post hoc to improve the precision of the model (34):
Pharmacotherapy
Interventions in this category will comprise any medicines used to treat the symptoms of depression. In assessing eligibility we will draw on clinical guidelines (e.g., 19, 20, 32), being mindful of changes in recommended treatments over time. Examples include selective serotonin reuptake inhibitors (SSRIs), serotonin-norepinephrine reuptake inhibitors (SNRIs), tricyclic antidepressants, monoamine oxidase inhibitors, mirtazapine, and agomelatine. We will only include studies that randomised participants to pharmacotherapies within their licensed dose range (35).
Psychotherapy
We will adopt the inclusion criteria used by Cuijpers et al. (36) for the development of their complete database of trials on psychological treatments for depression (www.evidencebasedpsychotherapies.org). These are based on the definition of psychotherapy by Norcross (37): “Psychotherapy is the informed and intentional application of clinical methods and interpersonal stances derived from established psychological principles for the purpose of assisting people to modify their behaviours, cognitions, emotions, and/or other personal characteristics in directions that the participants deem desirable”. Eligible interventions can be delivered by any therapist (including psychologists, nurses, social workers etc.) so long as they are trained to deliver the therapy and in any treatment format so long as they are facilitated (i.e., individual, group, telephone, guided self-help or couple therapy). As outlined in Cuijpers et al. (36), these fall into the following eight psychotherapy categories: (1) cognitive behaviour therapy (CBT), (2) behavioural activation therapy, (3) problem-solving therapy, (4) interpersonal psychotherapy, (5) psychodynamic therapy, (6) non-directive therapy, (7) third-wave therapies (e.g., acceptance and commitment therapy, mindfulness-based stress reduction, mindfulness-based cognitive therapy) and (8) life review therapy (for definitions and examples, see https://evidencebasedpsychotherapies.shinyapps.io/metapsy/_w_ed60cf71/variable_description.pdf).
Combination therapy
Interventions in this category will involve both psychotherapy and pharmacotherapy components.
Exercise interventions
We will include interventions that involve aerobic and/or resistance training exercise. Other types of exercise, such as yoga or Tai Chi, which are not delivered based on the prescription principles for exercise training, will not be included in this intervention group (33, 38, 39).
Collaborative care interventions
Interventions in this category will involve a multi-component approach with active collaboration and enhanced inter-professional communication between different specialists and primary care providers (40).
Complementary and alternative medicine (CAM)
CAM interventions involve therapeutic approaches that are not usually included in conventional Western medicine (41), and they are used by patients in combination with or as alternative treatments for depression (42). In line with the National Centre for Complementary and Integrative Health and van der Watt et al. (43), we will include approaches covered in the following classifications: herbal interventions, nutritional supplements (e.g., vitamins or probiotics) and aromatherapy; cognitive interventions (e.g., hypnotherapy, imagery, and meditation); and physical interventions (e.g., tai chi, acupuncture and light therapy).
Comparison groups
To qualify for inclusion, RCTs must compare interventions with another appropriate comparator group such as treatment as usual (TAU), enhanced usual care, pill placebo control groups, no treatment, waitlist, attention control groups, or another depression treatment. Comparator groups will be considered separately as previous work has shown that these are not equivalent (27, 44, 45). The nature of the control groups used can, for example, have a major influence on the results by impacting on risk of bias (such as attrition rates and blinding), heterogeneity, and effect sizes observed (44, 45). With this in mind, we will categorise comparators into the following three groups in line with recommendations (46) and previous research (27, 29):
1) Pill placebo (for drug trials);
2) No treatment, waitlist or treatment as usual;
3) Treatment control (defined as minimal treatment control, active comparator, and specific and non-specific factors treatment control).
We will contact authors for further information in instances where comparator groups are unclear and, if necessary, include an ‘unclear’ comparator group or exclude a study from the NMA if details on comparator groups are not available.
Figure 1 below shows a sample network plot, based on all of the possible depression interventions and comparison groups we plan to include.
Outcomes
Primary outcomes
We will include two primary outcomes, following the example of Doyle et al. (29) and Cipriani et al. (25):
1. Efficacy: depression (means and standard deviations [SDs]) measured using validated tools (diagnostic interviews or screening instruments), and summarized with standardized mean difference (SMD) from baseline to post-intervention. The follow-up measure closest to 8 weeks will be used; however, measures within a range of 4 – 12 weeks will be accepted. If multiple depression measures are included in a given study, scores on the Hamilton Depression Rating Scale (HAM-D) will be used; if the HAM-D is not included, then preference will be given to longer scales with better content validity (35, 47).
2. Acceptability: the percentage of participants who discontinue the intervention/comparator for any reason, at any stage.
Secondary outcomes
The secondary outcomes of interest are:
1. Longer-term follow-up efficacy: depression (means and SDs) measured using validated tools (diagnostic interviews or screening instruments), and summarized with SMD from baseline to follow-up (the available measure closest to 26 weeks, within a range of 20 to 30 weeks, will be used).
2. HRQoL: HRQoL scores (means and SDs) on physical, social and emotional domains summarized using SMD from baseline to post-intervention. As for the primary depression outcome, the measure closest to 8 weeks will be used with an acceptable range from 4 – 12 weeks. Generic QoL scores will be used when HRQoL scores are not available.
3. Adverse events: the percentage of participants who leave the study as a result of intervention-related adverse effects within 12 weeks of study commencement.
4. Mortality (all-cause): the percentage of participants who die after or during the treatment (cancer related or otherwise) for the longest duration of follow-up.
Search strategy and study selection
As numerous systematic reviews exist on this topic (e.g., 16, 21, 22, 48), we will use a hybrid methodology that combines a review of reviews with a systematic review (27). This approach involves first searching for relevant systematic reviews and extracting the pertinent references from these, before performing a supplemental search for individual RCTs that were published more recently (e.g., within the last 5 years, depending on the dates of the systematic reviews). By drawing on the work of previous systematic reviewers, this approach is less resource intensive, saving time and effort while still covering the available literature (27, 49). We will use the following databases to search from inception for reviews and meta-analyses: Cochrane Library, CINAHL, MEDLINE/PubMed, MEDLINE In-Process, EMBASE, and PsycINFO. We will extract RCTs and their associated data from the collected reviews and original studies. In addition, we will perform an updated search for relevant RCTs. We anticipate that the time-period for these searches will be within the last 3 - 5 years; however, the range will be determined by how recent the available systematic reviews are. We will use the databases MEDLINE/PubMed, the Cochrane Library and clinical trials registries (50) for RCT searches. The clinical trial registries we will include are the World Health Organization International Clinical Trials Registry Platform (WHO ICTRP) and clinicaltrials.gov. Furthermore, we will search the reference lists of all included RCTs. Searches will adopt the BMJ trials and SR filters (available at: https://bestpractice.bmj.com/info/toolkit/learn-ebm/study-design-search-filters/). Although searches will not be filtered by language, only English language articles will be included. We have provided a sample search strategy in Additional file 3. We will download references into the reference manager software EndNote, and remove duplicate references via the software tools.
Two reviewers will independently select reviews and trials and review full-texts for inclusion, discussing disagreements with a third reviewer.
Data extraction
Data extraction will be completed using structured data extraction spreadsheets in Excel to obtain all relevant data in a consistent fashion. Double data entry will be carried out, whereby data will be entered independently by two reviewers into two separate datasets and then compared. Details extracted will include study characteristics (first author, year of publication, journal, setting, and country), participant characteristics (sample size, mean age, % female, type of cancer, cancer stage [i.e., early-stage disease vs. advanced stage/palliative care], time since diagnosis, cancer treatment stage [i.e., pre-treatment, active treatment, post-treatment], depression inclusion criteria, baseline depression severity, depression assessment tools, presence of premorbid depression), and intervention and comparator group details (type of pharmacotherapy [name, dose, duration]; for psychotherapy interventions the Template for Intervention Description and Replication [TIDieR] checklist (51) will be used for extraction headings [i.e., intervention name, rationale/theory, materials, procedures, who delivered the intervention, delivery mode, location/setting, dose/intensity, tailoring, modifications, fidelity]). Since psychotherapies can take many modalities and are delivered in different forms, the TIDieR headings will allow more precise documentation of any significant disparities among the psychotherapeutic treatments selected for study. Data extracted from the original RCT reports, including multi-arm trials, will be used to calculate summary effect sizes.
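As an illustration only, the comparison step of double data entry could be sketched as follows. The protocol specifies Excel spreadsheets, so this Python version is an assumption made for clarity; the function name `compare_entries` and the example records are hypothetical.

```python
def compare_entries(entry_a, entry_b):
    """Return {field: (value_a, value_b)} for every field on which two reviewers differ."""
    fields = sorted(set(entry_a) | set(entry_b))
    return {
        field: (entry_a.get(field), entry_b.get(field))
        for field in fields
        if entry_a.get(field) != entry_b.get(field)
    }


# Hypothetical records for one trial, entered independently by two reviewers.
reviewer_1 = {"first_author": "Example", "sample_size": 120, "mean_age": 54.2}
reviewer_2 = {"first_author": "Example", "sample_size": 102, "mean_age": 54.2}

# Fields listed here would be checked against the original trial report.
print(compare_entries(reviewer_1, reviewer_2))
```

Any field flagged by such a comparison would be resolved by returning to the source report rather than by choosing one reviewer's value.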
Continuous outcomes
For continuous outcomes, we will extract SMDs (when reported), 95% confidence intervals, means, SDs and the number of participants in each trial arm into the final dataset. If this information is not available, we will request these data from the RCT authors. If data are omitted in the reports (e.g., SDs not reported), we will impute using the Cochrane recommended techniques for estimating SDs (52) and the metaeff command procedure in Stata to calculate SMDs and 95% confidence intervals from available data (53). If mean change scores are not reported, then we will consider outcome scores (27). If trials report the percentage of participants who no longer have elevated depression following the intervention, rather than mean change scores or outcome scores, we will also use the metaeff command to calculate SMD. If sufficient data are not available to calculate the SMD, we will include the study for descriptive purposes only and exclude it from the main NMA. If insufficient data are available to calculate the 95% confidence intervals, we will consider imputing based on the median from the other studies from that particular group (27, 29). We will carry out sensitivity analyses to determine whether there are implications of such imputations (27).
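The calculations described above can be sketched in a few lines. This is a minimal Python illustration, not the Stata metaeff routine named in the protocol. It assumes the large-sample formula for recovering an SD from the 95% confidence interval of a group mean, and Cohen's d with a pooled SD as the SMD; the function names are hypothetical.

```python
import math


def sd_from_ci(lower, upper, n, z=1.96):
    """Recover a group's SD from the 95% CI of its mean (large-sample approximation)."""
    return math.sqrt(n) * (upper - lower) / (2 * z)


def smd(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Cohen's d (pooled SD) with an approximate 95% confidence interval."""
    pooled_sd = math.sqrt(
        ((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / (n_t + n_c - 2)
    )
    d = (mean_t - mean_c) / pooled_sd
    # Standard large-sample approximation to the standard error of d.
    se = math.sqrt((n_t + n_c) / (n_t * n_c) + d ** 2 / (2 * (n_t + n_c)))
    return d, d - 1.96 * se, d + 1.96 * se


# Hypothetical trial: lower depression scores in the treatment arm.
d, ci_low, ci_high = smd(mean_t=10, sd_t=4, n_t=50, mean_c=12, sd_c=4, n_c=50)
print(round(d, 2), round(ci_low, 2), round(ci_high, 2))
```

For small samples, the normal quantile in `sd_from_ci` would need to be replaced by the corresponding t-distribution value.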
Binary outcomes
For the extraction of binary outcomes (i.e., acceptability, mortality, adverse events), we will obtain the number of participants with each event from each trial arm. When data are not available, we will contact the authors of the studies to request the information.
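To show how arm-level event counts translate into an effect estimate, the sketch below computes an odds ratio with a 95% confidence interval. It is an illustration in Python rather than the Stata workflow used in this protocol; the 0.5 continuity correction for zero cells is an assumption that the protocol does not specify, and the function name is hypothetical.

```python
import math


def odds_ratio(events_t, n_t, events_c, n_c):
    """Odds ratio (treatment vs. comparator) with a 95% CI on the log scale."""
    a, b = events_t, n_t - events_t
    c, d = events_c, n_c - events_c
    if 0 in (a, b, c, d):
        # Assumed handling of zero cells: add 0.5 to every cell.
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return (
        math.exp(log_or),
        math.exp(log_or - 1.96 * se),
        math.exp(log_or + 1.96 * se),
    )


# Hypothetical dropout counts: 10 of 50 (intervention) vs. 20 of 50 (comparator).
print(odds_ratio(10, 50, 20, 50))
```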
Duration of RCTs and outcome assessments
Following previous methods (25, 27, 29, 35) we will adopt an 8-week threshold for the synthesis of the primary depression outcome and the secondary QoL outcome. In those cases where data are unavailable for that duration, the closest available data from 4 to 12 weeks will be used (25). We will use overall dropout rate, regardless of time-point as the second primary outcome for acceptability. We will use long-term depression assessments at 26 weeks (with an acceptable range between 20 and 30 weeks) as a secondary outcome.
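The time-point rule above can be expressed as a small selection function. This is a sketch under one stated assumption: when two assessments are equally distant from the target, the earlier one is chosen, a tie-break that the protocol does not specify. The function name is hypothetical.

```python
def pick_assessment(weeks_available, target=8, lower=4, upper=12):
    """Return the assessment week closest to the target within [lower, upper], or None."""
    eligible = [w for w in weeks_available if lower <= w <= upper]
    if not eligible:
        return None
    # Assumed tie-break: prefer the earlier of two equally distant assessments.
    return min(eligible, key=lambda w: (abs(w - target), w))


# Primary outcome: closest to 8 weeks, within 4 to 12 weeks.
print(pick_assessment([2, 6, 12, 26]))
# Secondary outcome: closest to 26 weeks, within 20 to 30 weeks.
print(pick_assessment([20, 24, 30, 52], target=26, lower=20, upper=30))
```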
Missing RCT outcome data and units of analysis
We will extract all data as they were reported in the original trials, regardless of how missing data were dealt with in each study. As part of our risk assessment, we will rate whether the handling of missing data was appropriate, where possible drawing this information from previous systematic reviews (27, 54). In line with previous NMAs (25, 29, 35), we will extract pertinent data to explain clustering from cluster randomised trials, and extract only data relating to the first study period from cross-over trials to prevent carry-over effects.
Risk of bias and quality of evidence
Two reviewers will independently extract and assess risk of bias (RoB) data, which will be used to inform updated RoB 2 tool (55) ratings. Of note, our use of the RoB 2 tool is likely to lead to fewer studies being classed as having a high RoB than existing systematic reviews that used the Cochrane tool (52), in particular for trials for which it was not possible to blind for treatment assignment. This is likely because the RoB 2 involves a more nuanced decision-making framework that does not automatically consider unblinded studies to be at high risk of bias (55). For example, for trials in which blinding was not feasible or implemented, the RoB 2 considers whether post-randomisation deviations from the intervention led to bias or whether the reasons for missing outcome data contributed to bias; if not, such trials can still meet the criteria for low RoB. If data to assess risk of bias are insufficient or missing, we will consider contacting RCT authors to obtain additional information. In instances when the two independent reviewers disagree on RoB ratings, input from a third reviewer will help to settle disagreements.
We will use the Confidence in Network Meta-Analysis (CINeMA) framework to evaluate the credibility of our results (56) and present the resulting information in a summary table. CINeMA is based in part on the GRADE (Grading of Recommendations Assessment, Development and Evaluation) framework (57), but has been specifically developed to account for the complexity of NMA methods. The framework considers six domains: (i) within-study bias, (ii) reporting bias, (iii) indirectness, (iv) imprecision, (v) heterogeneity, and (vi) incoherence; and, as with GRADE, assessments of each domain are summarised to reflect whether confidence for the treatment effect is very low, low, moderate, or high (56). We will use CINeMA to summarise the strength of the evidence for each of the primary outcomes from the network estimates.
Transitivity assessment
Transitivity relates to how effect modifiers are distributed across intervention comparisons (23). To uphold this key assumption of NMA, studies making different direct comparisons must be sufficiently similar in all respects other than the intervention that is being examined (58), such that it is valid to make indirect comparisons between intervention groups that are connected via one or more intermediate comparator groups (34). Previous NMAs on the efficacy of antidepressants have identified that factors such as bipolarity, psychotic features and subthreshold depression can moderate the efficacy of antidepressants and therefore, to uphold the transitivity assumption of the network, have limited samples to non-psychotic patients with unipolar depression (35). We will take a similar approach and, in line with Doyle et al. (29), exclude studies where 20% or more of participants have bipolar or psychotic depression, or concurrent secondary psychiatric diagnoses. Another factor that may moderate the efficacy of interventions for depression is whether participants enrolled in an RCT have elevated depressive symptoms (i.e., score above a specified threshold on a validated measure of depression) at baseline. Including trials that do not specify elevated depression as an inclusion criterion may misrepresent the efficacy of interventions, because participants who have sub-threshold levels of depression to begin with have less scope to improve on this outcome (21). Indeed, previous studies have demonstrated that baseline severity of depression moderates the efficacy of psychosocial treatments for patients with cancer, such that effects are negligible when baseline depression is low (59). Therefore, we have specified that only RCTs that specifically enrolled participants with elevated depressive symptoms meet our inclusion criteria.
Given these precautions, we assume that participants who fulfil the inclusion criteria for this protocol are equally eligible to be randomized to any of the intervention groups. Nevertheless, to assess transitivity and further guard against violating this assumption, we will compile a list of potential effect modifiers from data collected (e.g., participant age, sex, cancer type, treatment stage, time post-diagnosis, and cancer stage, level of depressive symptoms at baseline, the presence of other comorbidities), and investigate whether the distribution of these variables is similar across the studies included in pairwise comparisons (see Assessment of heterogeneity and inconsistency section for further details) (34).
Statistical analysis
We will use Stata 15 to carry out all quantitative analysis.
Study and network characteristics
We will present study characteristics and descriptive statistics on important clinical and methodological variables for all included RCTs (e.g., publication year, age, sex breakdown, settings, cancer type and severity, stage of treatment, etc.). We will generate network diagrams to illustrate the amount of evidence for each outcome and the properties of the RCTs contributing to each outcome (34). The size of each node in the diagrams will represent the number of participants in the intervention/comparator group, the edge width of the node will represent the number of RCTs involving a given intervention/comparator group, while lines connecting the nodes will signify the intervention/comparison groups that have been directly compared in the available RCTs (23).
Pairwise meta-analysis
We will perform random-effects pairwise meta-analyses when head-to-head data are available. This will allow us to examine whether study characteristics are comparable across the RCTs that inform each direct comparison (see Heterogeneity and inconsistency assessments section), explore the impact of any potential outliers (60), and identify differences in estimated effects from the NMA that may be due to correlations between outcomes (29). For each pairwise analysis, we will report SMDs or odds ratios, both with associated 95% confidence intervals, for continuous and binary outcomes respectively (61, 62).
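For readers unfamiliar with random-effects pooling, a minimal Python sketch follows. It assumes the DerSimonian-Laird estimator of between-study variance, which is one common choice; the protocol does not name an estimator for the pairwise analyses, and the analyses themselves will be run in Stata. The function name and the input values are hypothetical.

```python
import math


def dersimonian_laird(effects, variances):
    """Random-effects pooled estimate, its standard error, and tau^2 (DerSimonian-Laird)."""
    k = len(effects)
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Cochran's Q around the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2


# Hypothetical SMDs and variances from three head-to-head trials.
pooled, se, tau2 = dersimonian_laird([-0.6, -0.2, -0.4], [0.04, 0.05, 0.03])
print(round(pooled, 3), round(pooled - 1.96 * se, 3), round(pooled + 1.96 * se, 3))
```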
Heterogeneity and inconsistency assessments
We will explore the impact of effect modifiers within and across comparisons, which may lead to heterogeneity and inconsistency respectively, using both local and global measures (34). Specifically, we will use the design-by-treatment interaction model to assess inconsistency in networks as a whole (58). If we find evidence of inconsistency, we will contrast direct evidence with indirect evidence from specific loops (loop-specific) and from the entire network (node-specific) to detect pairwise comparisons or loops of evidence that may be introducing inconsistency in the network locally (34, 58). We will graphically present effect sizes using forest plots to explore the possibility of statistical heterogeneity. Furthermore, we will quantify statistical heterogeneity and statistical inconsistency for each pairwise meta-analysis using the I² statistic.
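As a brief illustration of the I² statistic, the sketch below derives it from Cochran's Q using inverse-variance weights. It is written in Python for exposition only; the function name and the input values are hypothetical.

```python
def i_squared(effects, variances):
    """I^2 (%) = 100 * (Q - df) / Q, truncated at zero, with Q computed from inverse-variance weights."""
    w = [1.0 / v for v in variances]
    mean = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - mean) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    if q <= df:
        return 0.0
    return 100.0 * (q - df) / q


# Two hypothetical trials with clearly different effects.
print(i_squared([0.0, 1.0], [0.1, 0.1]))
```

I² describes the share of variability in effect estimates that is attributable to heterogeneity rather than sampling error, so values near zero suggest the trials in a comparison are statistically compatible.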
Network meta-analysis
We will carry out a frequentist random-effects multivariate network meta-analysis to synthesize all evidence for each outcome and obtain a comprehensive ranking of all intervention groups for the primary outcomes. To this end, we will use the commands network meta and mvmeta (which underpins the first command) in Stata 15 (61). These commands use a Newton-Raphson procedure to carry out a restricted maximum likelihood (REML) multivariate meta-analysis, which accounts for within- and between-study correlations. Our analysis will include all available intervention types and comparator groups, as described and grouped above (i.e., pharmacotherapy interventions, psychotherapy interventions etc.). If sufficient data are available, we will perform a second analysis that separates the various groupings by subtype (e.g., type of psychotherapy). We will use rankograms and surface under the cumulative ranking (SUCRA) curves to rank intervention groups and visually present the uncertainty in ranking probabilities (34). The SUCRA value of an intervention or intervention group corresponds to the ratio of the area under the cumulative ranking curve to the entire area in the plot. As such, it expresses the effectiveness or acceptability of an intervention or intervention group as a percentage of that of a hypothetical intervention that would be ranked the best without any uncertainty (23).
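To make the SUCRA definition concrete, the following Python sketch computes it from a vector of rank probabilities for one intervention (the probability of being ranked first, second, and so on). It is an illustration only, since the rankings in this review will be produced in Stata; the function name and the probabilities are hypothetical.

```python
def sucra(rank_probs):
    """SUCRA from rank probabilities ordered from best rank to worst rank."""
    n_ranks = len(rank_probs)
    if n_ranks < 2:
        raise ValueError("SUCRA needs at least two competing interventions")
    cumulative = 0.0
    total = 0.0
    # Sum the cumulative probabilities over the first (n_ranks - 1) ranks.
    for p in rank_probs[:-1]:
        cumulative += p
        total += cumulative
    return total / (n_ranks - 1)


# Hypothetical intervention in a four-node network.
print(round(sucra([0.5, 0.3, 0.15, 0.05]), 3))
```

An intervention that is certain to rank first has a SUCRA of 1, and one that is certain to rank last has a SUCRA of 0.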
Sensitivity analysis
We will perform sensitivity analyses to examine the robustness of our findings with regard to the primary outcomes by carrying out subgroup analysis for the following, provided sufficient data are available:
1) studies with different levels of bias (i.e., low, of some concern, high);
2) studies of patients who meet the criteria for depression on a diagnostic interview and patients who score above the cut-off on a validated measure of depression;
3) studies of patient groups at different stages of treatment (i.e., patients who are pre-treatment, on active treatment, or post-treatment);
4) studies of patients with advanced/incurable cancers or receiving palliative care (i.e., advanced cancer vs. early-stage disease);
5) studies of patients with different cancer types (e.g., breast cancer vs. other cancers);
6) studies that include patients with a premorbid history of depression and those that do not.
Bias assessments
We will consider the likelihood that studies were conducted and not published and the comprehensiveness of our search strategy in evaluating the possibility of publication bias. We will use funnel plots and Egger’s test to evaluate publication bias and the influence of smaller studies in pairwise comparisons (34, 58). Furthermore, we will assess asymmetry on comparison-adjusted funnel plots, to examine possible associations between study size and study effect (23). Comparison-adjusted funnel plots graph the inverted standard error of the effect size on one axis against an adjusted effect size on the other, where the adjusted effect size is the difference between a study estimate and the mean effect of its direct meta-analysis; they can be used to examine whether results vary depending on trial precision in NMAs (34, 58). Finally, we will carry out a network meta-regression to determine associations between study size and effect size (63).
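The regression behind Egger's test can be sketched as follows: each study's standardized effect (effect divided by its standard error) is regressed on its precision (the inverse of the standard error), and an intercept that departs from zero suggests small-study effects. This Python version is an illustration under those assumptions rather than the Stata routine that will be used. The function name and data are hypothetical, and the p-value (which requires a t distribution with k - 2 degrees of freedom) is left out.

```python
import math


def egger_intercept(effects, std_errors):
    """Return (intercept, SE of intercept) from Egger's regression of effect/SE on 1/SE."""
    k = len(effects)
    if k < 3:
        raise ValueError("need at least three studies")
    x = [1.0 / s for s in std_errors]                  # precision
    z = [y / s for y, s in zip(effects, std_errors)]   # standardized effect
    mean_x = sum(x) / k
    mean_z = sum(z) / k
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxz = sum((xi - mean_x) * (zi - mean_z) for xi, zi in zip(x, z))
    slope = sxz / sxx
    intercept = mean_z - slope * mean_x
    # Residual variance with k - 2 degrees of freedom.
    rss = sum((zi - intercept - slope * xi) ** 2 for xi, zi in zip(x, z))
    se_intercept = math.sqrt(rss / (k - 2) * (1.0 / k + mean_x ** 2 / sxx))
    return intercept, se_intercept


# Hypothetical studies in which less precise trials report larger effects.
std_errors = [0.10, 0.15, 0.20, 0.30, 0.40]
effects = [0.32, 0.41, 0.38, 0.62, 0.75]
intercept, se_intercept = egger_intercept(effects, std_errors)
print(round(intercept, 2), round(se_intercept, 2))
```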