The aim of this study was to replicate the finding that cognitive immunization modulates expectation update in depression, and to explore the role of other potentially relevant factors (interpretation, memory). The manipulation check indicated that the manipulation of cognitive immunization was only partly successful: while the IIG reported less cognitive immunization than the two control groups, as intended, the IPG did not differ from the two control groups, although cognitive immunization was intended to be increased in this condition. Furthermore, even the successfully lowered engagement in cognitive immunization in the IIG was not reflected in significant group differences in expectation update. Thus, the present study failed to replicate the finding that modulating cognitive immunization leads to differences in expectation update in depression (Kube, Rief, et al., 2019). A number of reasons might account for this failure.
In contrast to Kube, Rief, et al. (2019), who examined a sub-clinically depressed (BDI-II > 9) student sample, the current study used an inpatient clinical sample with a relatively high symptom burden and diverse educational backgrounds. These important sample differences could account for the failure of the manipulation, as the written manipulation texts were quite complex and might have been too difficult for severely impaired participants to understand. More specifically, the manipulation text focused quite heavily on the good vs. bad criterion validity of the TEMINT performance test, which may have required deductive reasoning to grasp its implications for the participants. On the other hand, it should be noted that the same manipulation text was successfully used in one experimental condition of a previous study with a similarly impaired clinical sample to inhibit cognitive immunization and promote expectation update (Kube, Rief, et al., 2019).
Another possible explanation refers to the fact that in the current study we conducted a diagnostic interview before participants worked on the performance test. This interview, which was perceived as pleasant by many patients, might have reduced negative affect, which could have resulted in greater openness to integrating unexpectedly positive performance feedback, as suggested by previous research (Kube et al., 2023; Kube & Glombiewski, 2020).
Würtz and colleagues (2024) found that a lack of group differences was most likely attributable to regression to the mean and to depressive symptoms being associated with less positive baseline expectations, which leave more room for an increase in positive expectations. In our study, we also found a small correlation between baseline expectations and depressive symptoms. However, as controlling for baseline expectations did not change our results, these factors do not seem to make a difference in our case.
Further possible explanations concern the characteristics of the manipulation. First, the manipulation was presented after the feedback. If immunization were an automatic process immediately following expectation violation, the manipulation might have come too late to influence it. Second, the manipulation consisted of a scientific-factual argumentation presented as a text. As research on open-label placebos, as well as on therapy expectations, suggests that the warmth of a presenter is key to the effectiveness of such a rationale (Gaab et al., 2019; Seewald & Rief, 2023), it is possible that including a warm presenter would have enhanced the effectiveness of our manipulation.
Beyond that, it could be argued that labelling the test as invalid makes the feedback less relevant for the self-concept and, as a paradoxical consequence, easier to integrate. This would also fit with findings showing that feedback that is too positive is less readily integrated (Kube et al., 2021; Würtz et al., 2024). Nevertheless, this explanation is unlikely to account for the failed replication, as it would imply that labelling the test as invalid should also lead to greater integration of the feedback.
These findings could also be relevant to the feedback used in the present study, as they show that moderately positive feedback ("you are among the best 15%") leads to the strongest change in expectations and the lowest degree of cognitive immunization (Kube et al., 2021; Würtz et al., 2024). Thus, more extreme positive feedback ("you are among the best 1%") might be more suitable to modulate cognitive immunization. However, this does not explain the discrepancy with Kube, Rief, and colleagues (2019); as detailed above, the sample in that previous study may have had less difficulty understanding the manipulation, which may be why cognitive immunization was manipulated successfully despite the moderately positive feedback.
Overall, participants showed a positive expectation update. This may contradict previous research showing that depression is related to little expectation update in response to positive information (as reviewed by Kube, 2023). However, since no non-depressed control sample was included, the magnitude of this positive update is difficult to interpret, and it could simply reflect a measurement repetition effect.
Finally, given possibly high false-failure rates even in well-powered samples, successful replication of empirical social science results cannot be automatically expected, and non-replication in a single trial does not imply that the effect is not robust (Schauer & Hedges, 2021; Stanley & Spence, 2014). On the other hand, the robustness of the effect can only be evaluated in the long run, and further research is therefore needed.
Strengths, Limitations, and Future Directions
Strengths of our work include the sufficiently powered clinical sample; the use of a previously validated paradigm; the use of two control groups; the inclusion of manipulation checks; the analysis of cognitive immunization in relation to other cognitive factors; and the pre-registration. Notwithstanding these merits, the present study also has limitations that need to be considered.
A major limitation is that we used quite complex manipulation texts and did not check whether they were comprehended correctly. Thus, we cannot clarify whether the manipulation failed because participants did not understand the relatively complex wording and content. The fact that the manipulation partially failed speaks to this possibility. Additionally, the manipulation texts focused on only one immunization strategy (i.e., questioning the validity of expectation-disconfirming information). Therefore, future research may use simpler wording, address additional immunization strategies (e.g., considering new evidence to be an exception), and check whether the manipulation is understood correctly. Presenting the manipulation as a video instead of a text may also improve comprehensibility. Moreover, we did not control for participants' state affect and therefore cannot rule out that the preceding diagnostic interview induced positive affect and facilitated the expectation update. A further limitation pertains to the TEMINT, which might not have been relevant enough to participants and may thus have led to limited engagement with the feedback and the subsequent manipulations. Finally, the assessment of expectations via explicit questions using numeric rating scales can be questioned as a method of assessing expectation change.