This systematic review and meta-analysis was registered in the International Prospective Register of Systematic Reviews (PROSPERO) under protocol number CRD42019122638. We followed the PICOS strategy [16] and the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [17]. The recommendations of the Cochrane Collaboration [18] were used as a complementary guide.
2.1 Eligibility criteria
The PICOS acronym was used to define the criteria for retrieving studies, as follows: depressive patients (P - participants), exercise (I - intervention), control group (C - comparison), amelioration of depressive symptoms (O - outcome) and randomized controlled trials (RCTs) (S - study design).
2.1.1 Types of studies
RCTs conducted between 2003 and 2019 were selected, regardless of whether exercise proved beneficial in the treatment of depression. To reduce the risk of bias, conference proceedings and unpublished studies were not used [19], and no language restrictions were applied.
2.1.2 Participants
Participants were males and females aged 18 and over (with no upper age limit) with clinical depression as defined by the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV) criteria, the International Statistical Classification of Diseases and Related Health Problems, 10th Revision (ICD-10) criteria, or another validated depression scale. Studies of depression accompanied by a co-morbid illness such as insulin-dependent diabetes, cancer, multiple sclerosis, cardiovascular disease or a mixed psychiatric diagnosis were not included.
2.1.3 Interventions
The intervention group received ET consisting of aerobic training, resistance training or a combination of both, with or without PT. In the control group, we analysed interventions such as pharmacotherapy, psychotherapy, another exercise program or no intervention (placebo or wait-list control). Regarding exercise parameters, we included studies that reported the following data: minutes per session, duration of the intervention (weeks), frequency (days per week), adherence (percentage of finishers) and exercise dose or intensity, expressed as energy expenditure in kilocalories per week (kcal/week), percentage of maximum heart rate (%HRmax), percentage of heart rate reserve (%HRR), percentage of maximal oxygen uptake (%VO2max), percentage of maximal oxygen uptake reserve (%VO2reserve) or metabolic equivalents (METs).
2.1.4 Outcomes
Pre- and post-intervention scores on validated depression-rating scales were analysed. Studies that measured outcomes immediately before and after a single exercise session were not included.
2.1.5 Information sources
We searched the PubMed/MEDLINE, Scopus, ISI Web of Knowledge, Cochrane Trials and APA PsycNET databases. No filters were applied in any database, so as to conduct the broadest possible search and reduce the risk of bias [19].
2.1.6 Search
Medical Subject Headings (MeSH) and search-indexed descriptors were used to refine the search [20]. Three thematic groups of MeSH terms were used to conduct the searches. Within each group, the terms were combined using the Boolean operator OR, and the three groups were then combined using the operator AND to form the search phrase. Searches were conducted in April 2019 using the following terms: (“depressive disorder” or “unipolar depression” or “major depressive disorder”) AND (“exercise” or “exercise programs”) AND (“clinical trial” or “randomized controlled trial”).
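The way the three term groups combine can be sketched in a few lines of Python. This is only an illustration: the group labels (`condition`, `intervention`, `design`) and the function name are hypothetical, and each database has its own query syntax, which this sketch does not attempt to reproduce.

```python
# Illustrative sketch only: group labels are hypothetical names chosen here;
# the terms themselves are the ones listed in the text.
TERM_GROUPS = {
    "condition": ["depressive disorder", "unipolar depression", "major depressive disorder"],
    "intervention": ["exercise", "exercise programs"],
    "design": ["clinical trial", "randomized controlled trial"],
}


def build_query(groups):
    """Join the terms of each group with OR, then join the groups with AND."""
    blocks = []
    for terms in groups.values():
        quoted = " OR ".join(f'"{term}"' for term in terms)
        blocks.append(f"({quoted})")
    return " AND ".join(blocks)


query = build_query(TERM_GROUPS)
print(query)
```

Keeping the groups in a dictionary makes it easy to see which terms are alternatives (OR) and which sets must all be satisfied (AND).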
2.1.7 Study selection and data collection process
Studies were screened and data were extracted independently by two researchers. A third researcher was asked to confirm the eligibility of the identified studies. Identified studies were tabulated on a worksheet (Microsoft Corporation, Redmond, USA) to confirm whether they met the eligibility criteria. All titles and abstracts were screened, and the full-text articles were assessed for potential inclusion by the main investigator. Studies remaining after exclusions based on a comprehensive full-text review were included in the qualitative synthesis. To complete the selection process, we identified all studies that presented a high risk of bias and, finally, included all high-quality studies in a quantitative synthesis to perform the meta-analysis.
2.1.8 Appraisal of methodological quality
The quality of the selected studies was appraised using the Delphi-list [21]. We assessed the RCTs on the following criteria: randomization, allocation concealment, baseline comparability, eligibility criteria, blinding, descriptive measures for the primary outcome and intention-to-treat analysis [22]. With the exception of double-blinding, which is not applicable to trials involving physical exercise, all these features were taken into account in the quality appraisal. Two researchers independently calculated an overall quality score from the Delphi items scored as positive and discussed the scores to reach consensus. The selected studies did exhibit weaknesses on some criteria; these deficiencies were taken into consideration and are explained in the Results and Discussion sections.
2.2 Data selection
We extracted the following characteristics from all studies: 1) total number and age of participants in each group; 2) depression-rating scales used for diagnosis; 3) type of intervention and other exercise parameters mentioned in section 2.1.3; 4) pre- (M1) and post-intervention (M2) depression-rating scale scores (n, mean and standard deviation) to calculate the effect size (ES) of the intervention and control groups. The mean difference ($MD = M_2 - M_1$) and the pooled standard deviation

$$SD_{pooled} = \sqrt{\frac{SD_{M2}^{2}\,(n_{M2}-1) + SD_{M1}^{2}\,(n_{M1}-1)}{n_{M2} + n_{M1} - 2}}$$

of each group in all studies were calculated for the ES analyses [23]. For pre-selected articles that did not present the necessary data in the text, the values were requested from the authors by e-mail.
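As a minimal sketch of the two quantities defined above, the following Python functions compute the mean difference and the pooled standard deviation for one group. The function names and the numeric values are hypothetical and serve only to illustrate the arithmetic.

```python
import math


def mean_difference(mean_pre, mean_post):
    """MD = M2 - M1 (post-intervention mean minus pre-intervention mean)."""
    return mean_post - mean_pre


def pooled_sd(sd_pre, n_pre, sd_post, n_post):
    """Pooled SD of the pre (M1) and post (M2) measurements of one group."""
    numerator = sd_post ** 2 * (n_post - 1) + sd_pre ** 2 * (n_pre - 1)
    return math.sqrt(numerator / (n_post + n_pre - 2))


# Hypothetical values for a single group (not taken from any included study).
md = mean_difference(mean_pre=20.0, mean_post=12.0)
sd = pooled_sd(sd_pre=3.0, n_pre=10, sd_post=4.0, n_post=10)
print(md, round(sd, 4))
```

A negative MD indicates a drop in the depression score from M1 to M2.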
2.3 Risk of bias
The risk of bias was assessed qualitatively for each included study and for each risk-of-bias item on the Delphi-list. This scale provides a quality assessment of RCTs, and high quality is defined as achieving over 50% of the maximum attainable score, meaning five or more criteria met on the Delphi-list [21]. To analyse the risk of publication bias, we used visual inspection of the funnel plot. The risk of bias across studies was assessed using the heterogeneity results reported with the forest plot. Heterogeneity was measured using the Tau², Chi², and I² statistics. Tau² > 1 suggests the presence of substantial statistical heterogeneity. A statistically significant Chi² value (p < 0.05) is also evidence of heterogeneity. For I², the percentage of variance attributed to heterogeneity between studies ranges from low (25% < I² < 50%) to moderate (50% < I² < 75%) to high (I² > 75%) [19].
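To make the three heterogeneity statistics concrete, the sketch below computes Cochran's Q (the Chi² statistic), I² and a DerSimonian-Laird estimate of Tau² from study effects and their variances. It is a hedged illustration, not the RevMan implementation: the choice of the DerSimonian-Laird estimator, the function names, and the handling of I² values that fall exactly on a band boundary are assumptions made here. The p-value for Q (chi-square distribution with k - 1 degrees of freedom) is omitted to keep the example free of external dependencies.

```python
def heterogeneity(effects, variances):
    """Return (Q, df, I2 in percent, tau2) using inverse-variance weights.

    tau2 uses the DerSimonian-Laird moment estimator (an assumption here).
    """
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum_w
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    return q, df, i2, tau2


def classify_i2(i2):
    """Map I2 (percent) onto the bands quoted in the text.

    Values exactly on a boundary are assigned to the lower band (assumption).
    """
    if i2 > 75:
        return "high"
    if i2 > 50:
        return "moderate"
    if i2 > 25:
        return "low"
    return "below the low band"


# Hypothetical study effects and variances, for illustration only.
q, df, i2, tau2 = heterogeneity([-0.9, -0.2, -0.6], [0.04, 0.05, 0.06])
print(round(q, 2), df, round(i2, 1), round(tau2, 3), classify_i2(i2))
```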
2.4 Summary measures
Analyses were performed considering two groups, experimental and control. The main analysis concerned the additional effect of ET, with or without PT, on depression. We also performed subgroup analyses across the different exercise prescriptions and the intervention and control group characteristics found in the selected studies. Where a study included several intervention groups, its control group data were reused for comparison with each of them. For such multiple comparisons within the same study, the control group sample was divided by the number of comparisons [24]. This kept the total control group sample size correct and allocated the appropriate weight to groups with more subjects.
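The division of a shared control group can be sketched as follows. The function name is hypothetical, and how participants left over after an uneven division were assigned is not stated in the text; giving them to the first comparisons is an assumption of this sketch. Only the sample size is split, while the control group's mean and standard deviation are reused unchanged in each comparison.

```python
def split_control(n_control, n_comparisons):
    """Divide a shared control group's sample size across several comparisons.

    Leftover participants go to the first comparisons (assumption), so the
    parts always sum to the original control sample size.
    """
    base, remainder = divmod(n_control, n_comparisons)
    return [base + (1 if i < remainder else 0) for i in range(n_comparisons)]


# Hypothetical study: one control group of 41 compared with three exercise arms.
parts = split_control(41, 3)
print(parts, sum(parts))
```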
The standardized mean difference (SMD) was calculated from the mean difference on the depression scales (pre- and post-intervention) and the pooled standard deviation of each intention-to-treat group in each study. Because this outcome was reported on different validated scales, the SMD in this review was calculated using a random-effects model with 95% confidence intervals (95% CI), which assumes heterogeneity among the studies and their participants. Forest and funnel plot analyses were performed using Review Manager (RevMan) Version 5.3 software (Copenhagen: The Nordic Cochrane Centre, The Cochrane Collaboration, 2014).
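The following sketch shows one way an SMD between an experimental and a control group can be computed and then pooled under a random-effects model. It is an approximation under stated assumptions and not the RevMan computation: the SMD is an uncorrected standardized difference (RevMan may apply a small-sample correction, omitted here), the variance uses a common large-sample approximation, the between-study variance `tau2` is supplied by the caller, and all names and values are hypothetical.

```python
import math


def smd_with_variance(md_exp, sd_exp, n_exp, md_ctl, sd_ctl, n_ctl):
    """Standardized mean difference between two groups and its variance.

    md_* are the pre-to-post mean differences and sd_* the pooled SDs of
    each group; no small-sample correction is applied (assumption).
    """
    sd_between = math.sqrt(
        ((n_exp - 1) * sd_exp ** 2 + (n_ctl - 1) * sd_ctl ** 2) / (n_exp + n_ctl - 2)
    )
    d = (md_exp - md_ctl) / sd_between
    var = (n_exp + n_ctl) / (n_exp * n_ctl) + d ** 2 / (2 * (n_exp + n_ctl))
    return d, var


def random_effects_pool(effects, variances, tau2):
    """Inverse-variance pooled effect with weights 1 / (v_i + tau2) and 95% CI."""
    weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled - 1.96 * se, pooled + 1.96 * se


# Hypothetical comparisons, for illustration only.
d1, v1 = smd_with_variance(-8.0, 5.0, 30, -3.0, 5.0, 30)
d2, v2 = smd_with_variance(-6.0, 4.0, 25, -4.0, 4.5, 25)
pooled, lower, upper = random_effects_pool([d1, d2], [v1, v2], tau2=0.05)
print(round(pooled, 3), round(lower, 3), round(upper, 3))
```

Adding `tau2` to every study variance flattens the weights, so that smaller studies count relatively more than they would under a fixed-effect model.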
2.5 The Gene Network Model in silico analyses
The GeneCards database (https://www.genecards.org) was used to explore protein-related genes associated with exercise and depression.
The keywords “exercise” OR “physical activity” AND “major depression” were entered into the database to identify genes associated with the subject. GeneCards ranks the genes returned by a search according to a score. A score above 10 is considered acceptable to indicate that the gene is known to be functional (see https://www.genecards.org/Guide/GeneCard). To increase the power of the analysis, only genes with a score of 30 or above were retrieved from GeneCards, as follows: interleukin 6 (IL-6, score 42.60), tumour necrosis factor (TNF, score 42.44), solute carrier family 6 member 4 (SLC6A4, score 40.72), 5-hydroxytryptamine receptor 2A (HTR2A, score 37.06), tryptophan hydroxylase 2 (TPH2, score 36.12), insulin (INS, score 35.63), brain-derived neurotrophic factor (BDNF, score 32.05) and apolipoprotein E (APOE, score 30.53). These genes were entered into the STRING database (https://string-db.org), which allows genes and the interactions of their encoded proteins to be explored as a network. STRING uses data mining to find published articles that have investigated direct or indirect protein interactions in different species. Based on the interactions found, we can suppose that exercise could result in similar interactions, independently of the results of the meta-analysis. Our interaction analysis was performed with Homo sapiens as the studied species.
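The score-based selection described above amounts to a simple threshold filter, sketched below. The gene labels and scores are those reported in the text; the data structure and function name are hypothetical, and the sketch does not query GeneCards or STRING.

```python
# Gene labels and GeneCards scores as reported in the text.
GENE_SCORES = {
    "IL-6": 42.60,
    "TNF": 42.44,
    "SLC6A4": 40.72,
    "HTR2A": 37.06,
    "TPH2": 36.12,
    "INS": 35.63,
    "BDNF": 32.05,
    "APOE": 30.53,
}


def select_genes(scores, threshold=30.0):
    """Keep genes scoring at or above the threshold, highest score first."""
    kept = [(gene, score) for gene, score in scores.items() if score >= threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)


selected = select_genes(GENE_SCORES)
for gene, score in selected:
    print(f"{gene}\t{score:.2f}")
```

The resulting list of gene labels is what would then be submitted to STRING with Homo sapiens selected as the organism.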