We preregistered this review at the Open Science Framework (OSF) on April 12, 2021 (https://doi.org/10.17605/OSF.IO/4CAFQ). The review adhered to the checklist of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA)41.
Eligibility criteria
Eligible studies had to meet the following criteria: (1) Population: We included studies with non-clinical populations. Studies with clinical populations were excluded. (2) Intervention: We included OLP interventions regardless of their specific application. (3) Comparison: We considered either a no-treatment control condition (NT) or a hidden placebo (HP) control condition. (4) Outcome: We included studies that measured the efficacy of OLPs on any given scale. Since the aim of this study was to investigate the effect of OLPs on a meta-level (i.e., across various outcomes), we did not apply restrictions to the types of outcomes. (5) Design: We included RCTs and excluded all other study designs.
Information sources and search strategy
We screened five electronic bibliographic databases, covering all entries from database inception to April 15, 2021. We did not apply any language restrictions. We searched for studies using Medline via PubMed (1965 to April 15, 2021), PsycINFO via EBSCO (1967 to April 15, 2021), PSYNDEX via EBSCO (1977 to April 15, 2021), Web of Science Core Collection (1945 to April 15, 2021), and The Cochrane Central Register of Controlled Trials (CENTRAL, The Cochrane Library, Wiley), Issue 4 of April 12, 2021. Due to its composite nature, CENTRAL does not have an inception date. However, we did not apply any date restrictions and used the latest issue available. In addition, we screened the Journal of interdisciplinary placebo studies DATABASE (JIPS, https://jips.online/).
We used a search strategy similar to von Wernsdorff et al.11. The search terms served the purpose of describing the OLP intervention in more detail. Therefore, in addition to terms such as "placebo", we used synonyms for "open-label", such as “non blind” or “without deception”. Since the aim of this study was to investigate the effect of OLPs on a meta-level, we did not specify outcomes and control conditions in the search strings. In addition, we used wildcards and variant forms of spelling to find as many studies as possible. The search strings are in Supplementary Tables S6-S10. Slight variations between the search strings are due to different proximity operators among the databases. We compiled all records identified in the databases in the reference management software Zotero 5.0.96.2 (Corporation for Digital Scholarship, Vienna, Virginia) and removed duplicates. We conducted both backward and forward citation searches of all included studies and important reviews on OLPs11,42 using Web of Science and PsycINFO.
Study selection and data extraction
Two researchers (LS and PDS) independently screened titles, abstracts, and full texts for inclusion. Title and abstract screening was carried out using the systematic review software Rayyan (Rayyan QCRI, Doha, Qatar). Disagreements were resolved through discussion. If no consensus could be reached, JCF and SS were consulted. The chance-corrected agreement between raters after the full-text screening was substantial (κ = 0.62). In cases where eligible studies did not report the information necessary to compute effect sizes, we contacted the study authors. If the authors did not respond or were unable to provide the data, these studies were excluded.
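For reference, the chance-corrected agreement statistic (Cohen's κ) divides the observed agreement beyond chance by the maximum possible agreement beyond chance. A minimal sketch; the rater decisions below are hypothetical and serve only to illustrate the computation, not the review's actual screening data:

```python
def cohens_kappa(a, b):
    """Cohen's kappa for two raters' categorical decisions."""
    assert len(a) == len(b)
    n = len(a)
    cats = set(a) | set(b)
    # Observed proportion of agreement
    p_obs = sum(x == y for x, y in zip(a, b)) / n
    # Expected agreement by chance, from each rater's marginal proportions
    p_exp = sum((a.count(c) / n) * (b.count(c) / n) for c in cats)
    return (p_obs - p_exp) / (1 - p_exp)

# Hypothetical include/exclude decisions of two raters
r1 = ["in", "in", "out", "out", "in", "out", "out", "out"]
r2 = ["in", "out", "out", "out", "in", "out", "in", "out"]
kappa = cohens_kappa(r1, r2)  # about 0.47 for these illustrative data
```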
The same two researchers who selected the studies independently extracted the data. Again, disagreements were resolved through discussion. We extracted data on author, year, country of trial, study design, sample size, control condition, intervention characteristics, the exact wording of instructions given to the participants, and the type and number of outcomes into a spreadsheet. For outcomes, we extracted the means, sample sizes, and standard deviations. If reported, this was done for change scores; otherwise, for both pre- and post-intervention scores. For studies that reported only the standard error, we transformed the standard error into the standard deviation according to the procedure outlined in the Cochrane Handbook43.
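The Cochrane Handbook conversion in question is SD = SE × √n. A minimal sketch:

```python
import math

def se_to_sd(se, n):
    """Convert a standard error of the mean back to a standard
    deviation: SD = SE * sqrt(n), per the Cochrane Handbook."""
    return se * math.sqrt(n)

# e.g. a reported SE of 0.5 with n = 25 corresponds to SD = 2.5
sd = se_to_sd(0.5, 25)
```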
As stated in the preregistration, we extracted the primary outcome as specified in the individual studies. If multiple primary outcomes were specified in the individual studies, we extracted all primary outcomes. If no outcome was designated as primary, we extracted all outcomes. For studies that included multiple control conditions, we only extracted data on the OLP condition and the corresponding comparator (i.e., NT or HP).
Study risk of bias assessment
We used the revised Cochrane risk of bias tool for randomized trials (RoB 2) to assess the risk of bias in primary studies. Five domains of bias are assessed using the RoB 2, namely biases arising from (1) the randomization process, (2) deviations from intended interventions, (3) missing outcome data, (4) measurement of the outcome, and (5) selection of the reported result44. Ratings for each domain range from “low risk of bias”, to “some concerns”, to “high risk of bias”. Finally, the ratings of the individual domains are aggregated into an overall rating, which in most cases is equivalent to the worst rating in any of the domains44.
Given the specific context of OLPs, we agree with von Wernsdorff et al.11 that a lack of blinding of participants should not result in an increased risk of bias rating. They argue that knowledge of one's group assignment is imperative and cannot be separated from the placebo effect in this particular intervention. Thus, we decided to rate the risk of bias in the domains (2) and (4) (i.e., the risk of bias due to unblinding) as not worse than “some concerns”. Risk of bias assessments were carried out independently by LS and PDS, with discrepancies resolved through discussion with JCF and SS.
Data synthesis and analyses
Since knowledge of the received intervention might influence self-reported outcomes, we conducted two separate meta-analyses: one for self-reported outcomes and one for objectively recorded outcomes (i.e., physiological or behavioral variables). The meta-analyses were conducted using the meta package in R (version 4.0.3). Since all studies reported continuous data, we chose the standardized mean difference (SMD) as the summary outcome. We used Hedges’ g, which corrects for small-sample bias45. When both pre- and post-intervention values were reported, we first calculated change scores by subtracting pre- from post-intervention scores. We then standardized the difference in change scores between groups using the pooled pre-intervention SD to calculate the corresponding SMDs.
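The change-score SMD described above can be sketched as follows. This is an illustrative Python implementation of one common variant (difference in change scores divided by the pooled pre-intervention SD, with Hedges' small-sample correction); the exact formulas used internally by R's meta package may differ in detail:

```python
import math

def hedges_g_change(m_pre1, m_post1, m_pre2, m_post2,
                    sd_pre1, sd_pre2, n1, n2):
    """Hedges' g for the between-group difference in change scores,
    standardized by the pooled pre-intervention SD."""
    change_diff = (m_post1 - m_pre1) - (m_post2 - m_pre2)
    # Pool the pre-intervention SDs across the two groups
    sd_pooled = math.sqrt(((n1 - 1) * sd_pre1**2 + (n2 - 1) * sd_pre2**2)
                          / (n1 + n2 - 2))
    d = change_diff / sd_pooled
    # Small-sample correction factor J = 1 - 3 / (4*df - 1)
    j = 1 - 3 / (4 * (n1 + n2 - 2) - 1)
    return j * d
```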
If there were multiple outcomes within one study, we calculated SMDs for all of these outcomes and averaged them46,47. This approach ensured that there was no bias due to selective choice of outcome depending on effect size and conformity to the hypothesis.
When there were multiple OLP conditions within a study, we proceeded as follows: Our primary goal was to obtain the maximum OLP effect that could be realized experimentally. Since we assumed that suggestive instructions would amplify the placebo effect, we always chose whichever condition was most suggestive. This was operationalized by selecting the condition where most of the instructional statements from Kaptchuk et al.12 were utilized. Kaptchuk et al.12 were among the first to conduct a clinical trial of OLPs and used a rationale (i.e., statements explaining the placebo effect) of four statements with positive framing to optimize placebo response. These statements imply 1) that the placebo effect is powerful, 2) that the body responds automatically to placebos, 3) that a positive attitude is not required, and 4) that taking the placebo faithfully is crucial. These or similar rationales were applied by many other researchers11.
Studies with crossover designs were not included in the meta-analyses because the parameters required to compute effect sizes were not reported. An alternative approach to analyzing crossover studies is to treat study groups as if they were parallel groups. However, this approach is not recommended by Cochrane, as it may lead to a unit-of-analysis error48.
Once the effects of the individual studies were calculated, they were aggregated into an overall SMD. We employed a random-effects model by applying the inverse-variance weighting method45. To correct for differences in the direction of the scale, the means of some studies were multiplied by −1 (ref. 48). Heterogeneity between studies was assessed using the chi-square test and the I² statistic. I² values above 25% are interpreted as low, above 50% as moderate, and above 75% as high heterogeneity49.
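The random-effects pooling and heterogeneity statistics described above can be sketched as follows. This is an illustrative DerSimonian–Laird implementation; the meta package offers this estimator among others, and the one actually used may differ:

```python
def random_effects_pool(effects, variances):
    """Inverse-variance random-effects pooling (DerSimonian-Laird),
    returning the pooled effect, tau^2, and I^2 (in percent)."""
    w = [1 / v for v in variances]
    # Fixed-effect estimate, needed for Cochran's Q
    fixed = sum(wi * ei for wi, ei in zip(w, effects)) / sum(w)
    q = sum(wi * (ei - fixed) ** 2 for wi, ei in zip(w, effects))
    df = len(effects) - 1
    # DerSimonian-Laird between-study variance estimate
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)
    # Random-effects weights include tau^2
    w_star = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * ei for wi, ei in zip(w_star, effects)) / sum(w_star)
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, tau2, i2
```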
We conducted subgroup analyses to examine the influence of the suggestiveness of the instructions on the efficacy of OLPs. To assess the extent of the suggestiveness of the instructions in OLPs, we developed a tool based on the four statements applied by Kaptchuk et al.12. These statements are given along with the administration of the open-label placebos. However, the placebos in most experimental studies included in our review were administered only once and under the supervision of an experimenter. Therefore, we omitted the fourth statement and formed four subgroups depending on the number of statements utilized in the instructions (ranging from 0 = “no statement utilized” to 3 = “all statements utilized”), with higher values indicating greater suggestiveness. We believe this approach to be reasonable, as many studies investigating OLPs have adopted the instructions from Kaptchuk et al.12 and varied the number of statements implemented in the instructions. For the subgroup analyses, we first calculated the pooled effect for each subgroup and then used a Q-test to examine whether effect sizes differed between subgroups50.
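The between-subgroup Q-test can be sketched as follows. The statistic is the inverse-variance weighted sum of squared deviations of the subgroup estimates from the overall estimate, compared against a chi-square distribution with k − 1 degrees of freedom (illustrative; the statistic only, without the p-value lookup):

```python
def subgroup_q(pooled_effects, pooled_variances):
    """Q statistic for differences between k subgroup estimates
    (compare to a chi-square with k - 1 degrees of freedom)."""
    w = [1 / v for v in pooled_variances]
    overall = sum(wi * ei for wi, ei in zip(w, pooled_effects)) / sum(w)
    return sum(wi * (ei - overall) ** 2
               for wi, ei in zip(w, pooled_effects))
```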
We also conducted exploratory subgroup analyses to examine whether the efficacy of OLPs differed depending on the control condition used (i.e., NT or HP). We used the same statistical procedures as before. However, these analyses were specified a posteriori and therefore not reported in the preregistration.
All tests were two-tailed.
Reporting bias assessment
We assessed publication bias by visually inspecting funnel plots for asymmetry. In funnel plots, the SMDs of the individual studies are plotted against their standard error. In addition, we carried out a statistical assessment of funnel plot asymmetry using Egger's regression test, which regresses the SMDs against their standard error51. We did not assess the risk for time-lag bias, as research on OLPs is in its early stages and the interest in non-clinical, healthy populations has arisen only recently.
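Egger's test is commonly computed in an equivalent formulation that regresses the standardized effect (SMD divided by its SE) on precision (1/SE); the regression intercept captures funnel-plot asymmetry. A minimal sketch of the intercept computation, without the accompanying t-test on the intercept:

```python
def egger_intercept(effects, ses):
    """Intercept of Egger's regression: standardized effect on
    precision, via ordinary least squares."""
    z = [e / s for e, s in zip(effects, ses)]   # standardized effects
    x = [1 / s for s in ses]                    # precisions
    n = len(x)
    mx = sum(x) / n
    mz = sum(z) / n
    slope = (sum((xi - mx) * (zi - mz) for xi, zi in zip(x, z))
             / sum((xi - mx) ** 2 for xi in x))
    return mz - slope * mx

# A perfectly symmetric case (identical effects at all precisions)
# yields an intercept of zero
b0 = egger_intercept([0.5, 0.5, 0.5], [0.1, 0.2, 0.4])
```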
Certainty assessment
We used the Grading of Recommendations Assessment, Development and Evaluation (GRADE)52 approach to assess the overall quality of the evidence. At the beginning of the assessment process, the overall quality of an RCT is rated as high and can subsequently be down- or upgraded based on eight dimensions: (1) risk of bias, (2) inconsistency, (3) imprecision, (4) indirectness, (5) publication bias, (6) dose response, (7) large effects, and (8) confounding. Based on the ratings of each dimension, the overall quality of evidence is rated as “high”, “moderate”, “low”, or “very low”. GRADE is performed for specific outcomes. However, due to the large number of different outcomes, we decided to form five clusters in which similar outcomes were grouped together: self-reported pain, objective pain, self-reported positive well-being, self-reported distress, and physiological outcomes. For physiological outcomes, we formed three sub-clusters, each containing a single study, to account for the heterogeneity in physiological outcomes. In our approach, a study may be represented in several clusters due to different outcome variables, but only once within each cluster. Assessments were conducted by two independent raters (LS and PDS), with discrepancies resolved through discussion.