Feedback-induced dispositional changes in risk preferences

doi:10.21203/rs.3.rs-4031736/v1


Article



This work is licensed under a CC BY 4.0 License

Version 1


Contrary to the normative decision-making standpoint, empirical studies have repeatedly reported that risk preferences are affected by the disclosure of choice outcomes (feedback). Although no consensus has yet emerged regarding the properties and mechanisms of this effect, a widespread and intuitive hypothesis is that repeated feedback affects risk preferences by means of learning, which alters the representation of subjective probabilities. Here, we ran a series of seven experiments tailored to decipher the effects of feedback on risk preferences. Our results indicate that the presence of feedback consistently increases risk-taking, even when the risky option is less advantageous. Crucially, risk-taking increases immediately after the instructions, before participants experience any feedback, and trial-by-trial analyses directly falsify learning accounts. These results challenge the learning account in favor of a dispositional effect induced by the anticipation of feedback information. Epistemic curiosity and regret avoidance may drive this effect in partial and complete feedback conditions, respectively.

Biological sciences/Psychology/Human behaviour

Biological sciences/Neuroscience/Reward

decision-making

feedback

risk

regret

curiosity

learning

Traditionally, empirical investigations of decision-making under risk have mostly been carried out in behavioral setups limited to one-shot description-based choice problems (Allais, 1953; Kahneman & Tversky, 1979): unique binary choices between mutually exclusive options, where relevant information (i.e., prospective outcomes and probabilities) is explicitly displayed and considered known to the decision maker. This experimental setup matches the scope and limits of both normative and descriptive decision theories, which are generally silent about the effects of feedback and choice repetition (Bell, 1982; Loomes & Sugden, 1982; Wakker & Tversky, 1993). Arguably, although both theoretically and empirically convenient, this one-shot description-based framework is not representative of the vast majority of decision situations that one faces every day. Most decision problems are recurrent and, very often, one gets to know the outcome of one’s choice (partial feedback) – and sometimes also the outcome of the forgone option (complete feedback) (for examples, see Erev et al., 2022; Fantino & Navarro, 2012; Plonsky & Teodorescu, 2020; Weiss-Cohen et al., 2021). To address those shortcomings, repetitions and feedback have been gradually incorporated into the study of human decision-making under risk over the last couple of decades (Garcia et al., 2021; Hertwig & Erev, 2009; Hertwig & Wulff, 2022; Lejarraga & Hertwig, 2021). This experimental innovation revealed that, in contrast to the normative dictate, human choices and risk preferences elicited in repeated decisions under risk do appear to change depending on the presence versus absence of feedback (e.g., Erev et al., 2017; Jessup et al., 2008; Marchiori et al., 2015; Newell & Rakow, 2007; Rigoli et al., 2019; Weiss-Cohen et al., 2016a).

A widespread and intuitive hypothesis concerning the effect of feedback on risk preferences proposes that outcome information modifies the decision-maker’s subjective representation of probabilities. Indeed, studies involving one-shot decisions clearly show that individuals behave as if their subjective representation of probabilities were distorted (overweighting of rare events, underweighting of common events; Kahneman & Tversky, 1979; Prelec, 1998). The realized frequency of the outcomes received in the context of repeated decisions can therefore be used to update (if not correct) the subjective beliefs concerning their probabilities, ultimately affecting one’s preferences and choices (Marchiori et al., 2015). We shall refer to this category of accounts as the learning hypothesis. Because it rationalizes the integration of feedback in future choices through experience-based updating of initially distorted subjective probabilities, the learning hypothesis often (implicitly) assumes that the presence of feedback should correct representational biases and, as a consequence, promote optimal (i.e., expected value maximizing) choices (Fantino & Navarro, 2012; Newell & Rakow, 2007). Note that this does not hold when the probabilities of the outcomes are extreme and the options are not sufficiently sampled, because the experienced and the actual frequencies of the outcomes then fail to converge (Esponda et al., n.d.).

An alternative category of accounts, which we shall refer to as the dispositional hypothesis, conjectures that the mere presence of feedback changes the decision-maker’s preferences, independently from any learning process. We identify in the literature two main contributors to the dispositional hypothesis: epistemic curiosity and regret.

Epistemic curiosity pertains to the idea of gaining utility from the resolution of uncertainty (Buyalskaya & Camerer, 2020; Charpentier et al., 2018; Cogliati Dezza et al., 2022; Rigoli et al., 2019; Ruggeri et al., 2023). Indeed, when feedback is available, options differ in their informational value with respect to the resolution of uncertainty. For instance, when only the outcome of the chosen option is revealed (partial feedback) and the decision features a riskier (high variance) versus a safer (low variance) lottery, choosing the safer lottery resolves less uncertainty about the final state of the world. In other words, there is an extra informational incentive to choose the riskier option if partial feedback is provided. This informational asymmetry explains how curiosity (or an uncertainty minimization drive) may shift choices in favor of the risky options if the participant anticipates that the decision will be followed by feedback (Gottlieb & Oudeyer, 2018; Sharot & Sunstein, 2020).
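The informational asymmetry described above can be made concrete with a small numerical sketch. It assumes, purely for illustration, that the uncertainty resolved by feedback is measured as the Shannon entropy of the revealed outcomes and that the two lotteries are independent; neither assumption comes from the study, and the function names are hypothetical.

```python
import math


def outcome_entropy(probabilities):
    """Shannon entropy (in bits) of a discrete outcome distribution."""
    return sum(-p * math.log2(p) for p in probabilities if p > 0)


def uncertainty_resolved(chosen, unchosen, complete_feedback):
    """Entropy of the outcomes revealed after a choice.

    Partial feedback reveals only the chosen lottery; complete feedback
    also reveals the forgone one (lotteries assumed independent).
    """
    resolved = outcome_entropy(chosen)
    if complete_feedback:
        resolved += outcome_entropy(unchosen)
    return resolved


risky = [0.5, 0.5]  # illustrative risky lottery: two equally likely outcomes
sure = [1.0]        # sure option: a single certain outcome

partial_gap = (uncertainty_resolved(risky, sure, False)
               - uncertainty_resolved(sure, risky, False))
complete_gap = (uncertainty_resolved(risky, sure, True)
                - uncertainty_resolved(sure, risky, True))
print(partial_gap, complete_gap)
```

Under these assumptions the risky option carries an informational advantage only under partial feedback; under complete feedback both choices reveal the same outcomes and the gap is zero.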

Regret can also cause dispositional effects of feedback when it is expressed as an anticipated emotion during decision-making (Zeelenberg et al., 1996). The rationale is that, when considering choosing a safe lottery over a risky one, one might forecast the regret elicited by observing a positive resolution of the best alternative outcome (the unchosen, riskier lottery) (Coricelli et al., 2005), and therefore make more risk-seeking choices to avert this possibility. As opposed to epistemic curiosity, anticipated regret should notably emerge when the outcome of the unchosen option is also available (complete feedback). In support of the prominent role of counterfactual emotions in decision-making, regret has recently been invoked to explain certain aspects of choices elicited in paradigms that couple descriptions with feedback (Cohen et al., 2020; Erev et al., 2017; Plonsky & Teodorescu, 2020; Weiss-Cohen et al., 2021).

Critically, in contrast to the learning hypothesis, which supposes that the effect of the feedback should emerge in a gradual way, the dispositional hypothesis conceives that it can emerge even before any feedback is experienced – i.e. by anticipation. Because the two hypotheses differ in the temporal relation with respect to the time of choice and outcomes, fine-grained temporal dynamics can dissociate the two accounts: dispositional effects precede choices while learning effects follow outcomes. In addition, the two hypotheses are also different in their relation to choice optimality: while learning mechanisms, by correcting representational biases, should generally increase expected value maximization, dispositional ones are more silent in that respect.

The goal of the present study is to evaluate the relative merit of these alternative hypotheses, first, in the existing literature and, second, in seven newly conducted experiments (N = 538). The experiments have been designed to systematically investigate the role of feedback in decision-making under risk across different experimental manipulations (presence versus absence of instructions; partial versus complete feedback) that allow discriminating competing accounts. Our findings consistently challenge the learning hypothesis in favor of dispositional mechanisms, while suggesting that both curiosity and regret play a role under different feedback information regimens.

Literature Review

In this section, we critically review studies of decisions under risk among fully described gambles that feature decision problems opposing a risky and a sure option, and that compare choices in the presence versus absence of feedback. With these criteria, we identified 9 studies that otherwise differed greatly in the composition of the decision problems (probabilities and outcomes), in the number of repetitions of each decision problem, in the number of subjects and in the general design (between- versus within-subjects manipulation of the presence of feedback; see Table 1; Table S1 for more details).

To inform our research question, we focus on 1) whether the presence of feedback affects decisions under risk in a systematic direction, with regard to risk preferences, and, 2) when applicable, whether the presence of feedback improves decisions, in the sense of increasing the capacity of participants to make expected-value maximizing choices.

First, we reviewed the effects of feedback on risky choice rate (i.e., the propensity to choose the risky over the sure option; hereafter referred to as R-rate). Feedback was found to increase R-rates by 8% on average. In total, 6/9 studies reported an effect in this direction (+13%), while the other three studies reported a marginal effect in the opposite direction (‒1%). This effect seems consistent across gain and loss framings: in the gain framing, feedback increased R-rates in 5/8 studies (+8%) and decreased R-rates in 3/8 studies (‒3%). In the loss framing, feedback almost always increased R-rates (+17% over 4/5 studies), with Erev 2017 being the only exception. Importantly, the increase in R-rates due to feedback seems to be conditional on the objective probability associated with the best outcome of the risky option. Notably, when the risky option gives the best outcome with high probability, feedback always increased R-rates (+24% over 7 studies including a high-probability condition). In contrast, in low-probability conditions (the best outcome of the risky option happens with low probability), feedback decreased R-rates in most studies (‒14% over 5/6) (Tables 1 & 2; Table S2 for more details).

To assess the plausibility of the learning account of the effect of feedback on risk attitude, we then reviewed the effect of feedback on the fraction of expected-value maximizing choices (hereafter referred to as O-rate, standing for optimal choice rate). Note, however, that this cannot be assessed in the majority of the studies because, in many cases (5/9 studies), the options had equal EVs. Moreover, three out of the four applicable studies fell short of orthogonalizing key dimensions of options (such as risk, probability and expected value), and thus the following results should be treated with caution. In general, there seems to be no systematic effect of feedback on O-rates, with the majority (3/4) of the studies exhibiting a marginal effect (increase or decrease) of 1%, and Weiss-Cohen 2016 being the only exception with an increase of 11% in the presence of feedback. Despite the absence of a general systematic effect of feedback on O-rates, feedback does increase O-rates when the EV-maximizing option gives the best outcome most of the time (+11% over 4 studies) and decreases O-rates when this condition does not hold (‒11% over 3 studies) (Tables 1 & 2; Table S2 for more details).

Our critical literature review points toward a relatively consistent effect of feedback on increasing risk-taking. Most of the literature has adopted, implicitly or explicitly, the intuition that feedback acts through the updating of otherwise distorted subjective probabilities (learning hypothesis) (Jessup et al., 2008; Marchiori et al., 2015; Yechiam & Barron, 2005). Yet, two studies that report results coherent with this general picture adopted a specific experimental design choice which, when taken into account in the interpretation of the results, challenges this view (Josephs et al., 1992; Rigoli et al., 2019). By revealing the effects of feedback in a unique (one-shot) decision set-up, which prevents an option-specific learning effect, these studies raise the possibility of a more general dispositional effect of feedback, for instance in the form of curiosity (Rigoli et al., 2019). Furthermore, the fact that the effect of feedback is generally stronger for high-probability risky options is consistent with a form of regret avoidance, understood as the drive to choose the option that gives the highest possible outcome most of the time (Cohen et al., 2020; Erev et al., 2017).

To sum up, a critical survey of the available literature does not allow clear-cut conclusions concerning the psychological mechanisms of the effect of feedback and their behavioral consequences. The effects of feedback on risky or optimal choice rate were not systematic in direction and often quite negligible in size. Our assessment of the literature also allowed us to identify several frequent methodological shortcomings that limit the questions that can be addressed. More specifically, we noted that experimental designs often fell short of orthogonalizing key dimensions of options, such as risk, probability and expected value, and of controlling for an equal number of trials with and without feedback.

Experimental design

To address our research question, while eluding the shortcomings identified in the literature, we ran a series of six online (N = 100 for each experiment before the application of strict exclusion criteria ‒ see Methods) and one laboratory incentivized experiment (N = 30). The six online experiments were variants of the experimental paradigm that we will describe below.

One novelty of our experimental design is the use of a factorial design with a within-subjects manipulation of post-choice feedback (present or absent); feedback was treated as a between-subjects factor in most of the previous studies, with only one exception (Erev et al., 2017). Furthermore, unlike most of the previous experiments, we included the same number of trials (10) in both feedback conditions, to disentangle the effect of mere repetition from the effect of feedback itself (Couto et al., 2020). Trials featuring the same decision problem were clustered in blocks of 10 trials. Feedback and no-feedback blocks were randomly interspersed.

We used a binary choice task featuring a sure and a risky option in each trial. The risky option had the form (p, m; 1-p, 0), namely giving m points with probability p and zero points otherwise. In addition to feedback (present or absent), we also factorially manipulated choice optimality, i.e., whether the risky or the sure option has the higher expected value (EV): in one condition, the risky option maximizes EV; in the other condition, the sure option maximizes EV. This allowed us to orthogonalize risk preference and decision optimality, two features that have often been confounded in the literature. Finally, we manipulated within-subjects the probability of gain associated with the risky option (three levels, namely 0.1, 0.5, and 0.9) and the magnitude of the risky option (two levels, 40 and 60 points), thus leading to a decision space of 12 unique decision problems. Together with our feedback/no-feedback manipulation and the repetitions (×10 per decision context), our final factorial design comprised 240 choices per subject, which is larger than that commonly found in the literature.
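The counts given in this design description (12 decision problems, 240 choices) follow directly from crossing the factors. The sketch below enumerates the design under that reading; the variable and label names are illustrative, and the values of the sure options are omitted because they are not stated in this section.

```python
from itertools import product

PROBABILITIES = (0.1, 0.5, 0.9)               # probability p of the non-zero risky outcome
MAGNITUDES = (40, 60)                         # points m paid when the risky option wins
OPTIMALITY = ("risky_better", "sure_better")  # which option has the higher EV
FEEDBACK = (True, False)                      # feedback present or absent in the block
REPETITIONS = 10                              # trials per block


def risky_ev(p, m):
    """Expected value of the lottery (p, m; 1-p, 0)."""
    return p * m


# One entry per unique decision problem (probability x magnitude x optimality).
problems = [
    {"p": p, "m": m, "optimality": o, "risky_ev": risky_ev(p, m)}
    for p, m, o in product(PROBABILITIES, MAGNITUDES, OPTIMALITY)
]

# Each problem is presented once with and once without feedback.
blocks = [dict(problem, feedback=f) for problem in problems for f in FEEDBACK]
n_trials = len(blocks) * REPETITIONS

print(len(problems), len(blocks), n_trials)  # 12 24 240
```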

The role of feedback and instructions on risk preferences

The first experiment (Exp.1) featured partial feedback, i.e., we revealed only the outcome associated with the chosen option. Participants were not informed about the presence or absence of feedback before starting a given block. The dependent variables, i.e., the propensity to choose the risky option (R-rate) or the optimal, EV-maximizing option (O-rate), were analyzed using a generalized linear mixed effect (GLME) model with the task factors (presence of feedback and option optimality) as independent variables (see Methods for more details).

Our analyses of the R-rate identified a significant main effect of feedback (P < 0.001; Table 3.A), which was characterized by an increased propensity to choose the risky option when trial-by-trial feedback was present (Fig. 2A; Table 4.A). Interestingly, this increase was of the same size and direction both when the risky option was better and when the sure option was better (with respect to EV) (Table 4.D), such that there was no detectable main effect of feedback on the optimal choice rate (Tables 3.B & 4.B; Fig. 2B). In other terms, there was a significant interaction between feedback and option optimality: feedback increased the O-rate when the risky option was the EV-maximizing one, but decreased it otherwise (interaction P < 0.001; Table 3.B).

Finally, we examined the trial-by-trial unfolding of the main effect of feedback on R-rate. The learning hypothesis predicts that the effect of feedback should be absent at the first trial, then gradually emerge and increase after the repeated experience with feedback. Our analyses revealed a slightly different pattern: while indeed no significant effect of feedback could be detected in the first trial (P > 0.05; Fig. 2C; Table 4.C), R-rate in feedback and no-feedback conditions abruptly diverged in the second trial and the difference remained constant until the end of the block (P < 0.01 in all trials).
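The trial-by-trial comparison described above can be sketched as a difference in R-rate between feedback and no-feedback blocks at each trial position. The data layout and function name below are assumptions made for illustration, the toy data are not taken from the study, and the sketch leaves out the statistical testing reported in the tables.

```python
def feedback_effect_by_trial(blocks):
    """Per-position difference R-rate(feedback) - R-rate(no feedback).

    blocks: list of (feedback, choices) pairs, where feedback is a bool and
    choices is a list of 0/1 values (1 = risky option chosen), one per trial.
    All blocks are assumed to have the same length.
    """
    n_trials = len(blocks[0][1])

    def r_rate(flag, t):
        picks = [choices[t] for feedback, choices in blocks if feedback == flag]
        return sum(picks) / len(picks)

    return [r_rate(True, t) - r_rate(False, t) for t in range(n_trials)]


# Toy data: two feedback blocks and two no-feedback blocks of three trials.
toy_blocks = [
    (True, [0, 1, 1]),
    (True, [1, 1, 1]),
    (False, [0, 0, 1]),
    (False, [1, 0, 0]),
]
effect = feedback_effect_by_trial(toy_blocks)
print(effect)
```

A gradual learning effect would show up as a difference that starts near zero and grows over positions, whereas an anticipatory dispositional effect would already be present at the first position.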

Overall, the results of this first experiment appeared, at the macroscopic level, in line with most of the existing literature, which generally shows an increase of R-rate in the presence of feedback. At a finer-grained level, because the effects develop after the first feedback, they seem overall consistent with a learning effect. The fact that the effect seems abrupt rather than gradual could be rationalized as one-shot learning. However, because participants in Exp.1 started each block without knowing whether they would receive feedback or not, the first trial also implicitly but unambiguously informed them about the type of the ongoing block (feedback or no-feedback), which may have altered their disposition toward risky options. Thus, although the separation of R-rates in the second trial can be a result of (one-shot) learning, it can also reflect the triggering of a dispositional change. Of note, the fact that the presence of feedback did not improve the optimal choice rate also speaks against the learning hypothesis.

To disentangle these two possibilities, we ran a second experiment in which, at the beginning of each block, participants received explicit instructions (henceforth, block instructions) mentioning whether or not they would receive post-choice feedback in the upcoming block. Everything else was kept the same as in Exp.1.

At the aggregate level, Exp.2 replicated Exp.1 in all respects (Fig. 2D and 2E; Tables 3 & 4). Yet, the between-experiment manipulation of the block instruction produced a significant difference in the effect of feedback on first-trial R-rates (P < 0.01; two-sample t-test; inset of Fig. 2F). Indeed, the difference in R-rates between feedback and no-feedback blocks in Exp.2 arose from the very first trial and remained significant for the rest of the block (P < 0.01; Table 4.C; Fig. 2F). Thus, it seems that the mere anticipation of feedback information induced by the block instructions was enough to change risk preference before any feedback was actually experienced.

In summary, results from Exp.1 and Exp.2 clearly revealed that the presence of feedback about the outcome of the chosen lottery increased risk propensity but not choice optimality. Besides, while Exp.1’s results only superficially supported the learning hypothesis, the results following the introduction of explicit feedback instruction in Exp.2 favor the dispositional hypothesis. Overall, this pattern of results is consistent with a dispositional effect created by epistemic curiosity, where the demand for uncertainty resolution increases risk propensity because of the informational asymmetry between the risky and sure lotteries.

Curiosity cannot be the only determinant of the effect of feedback on risk preference: a role for regret

Exp.1 and Exp.2 featured a partial feedback regimen. Since the result of the sure option is always known by definition and the result of the unchosen option is not disclosed, choosing the risky option provides the participant with more information about the current state of the world. If epistemic curiosity is the only driver of the observed effect, then it is the informational asymmetry between the sure option (whose result can be inferred with certainty in the absence of feedback) and the risky one (whose result cannot) that causes the increased risk-taking propensity in the presence of feedback. Thus, according to the epistemic curiosity account, the effect should vanish (or, at least, decrease) under a complete (or ‘full’) feedback regimen, i.e., when the forgone outcome of the unchosen option is additionally revealed. To test this hypothesis, we ran Exp.3 and Exp.4, which were analogous to Exp.1 (without block instructions) and Exp.2 (with block instructions) except for the fact that they both featured complete feedback.

The complete-feedback experiments replicated the main effects observed in their partial feedback counterparts (Exp.1 and Exp.2). Most importantly, the presence of complete feedback still increased the R-rate (Tables 3.A & 4.A), and had no effect on the O-rate (Tables 3.B & 4.B).

The pattern of results at the level of the trial-by-trial dynamics was also replicated. In Exp.3 (without block instructions), the divergence induced by the presence versus absence of feedback was detectable from the third trial and remained significant for the rest of the block (P < 0.01 from the 3rd trial onwards; Fig. 3A; Table 4.C). In Exp.4 (with block instructions), contrary to the idea of epistemic curiosity being the sole determinant of the change in risk propensity between feedback and no-feedback conditions, we found an effect from the first trial. The R-rates in the feedback blocks were significantly higher than in the no-feedback ones starting from the first trial and throughout the block (P < 0.01 in all trials; Fig. 3B; Table 4.C). As in Exp.1 versus Exp.2, the between-experiment manipulation of the block instruction produced a significant difference in the effect of feedback on first-trial R-rates in Exp.3 versus Exp.4 (P < 0.01; two-sample t-test; inset of Fig. 3B).

While these results are once again consistent with a feedback-induced dispositional change in risk preference (the effect arose before any feedback was received), they are not easily accommodated by the epistemic curiosity account, because, under the complete feedback regimen, there is no uncertainty resolution utility bonus attributable to choosing the risky option. An alternative psychological mechanism that is compatible with the complete feedback scenario is anticipated regret. To understand why anticipated regret could explain this effect, it should first be noted that, in many economic decision-making settings, regret is generally thought to depend on a comparison between the obtained and the forgone outcome. This systematic comparison is possible only in the complete feedback condition, where regret is experienced whenever the forgone outcome is higher than the obtained one. However, a regret-minimizing account demands that the effect of feedback on risk preference should interact with the probability of obtaining the high-value outcome from the risky option (Cohen et al., 2020; Erev et al., 2017). This is because, if the highest-value outcome of the risky option (henceforth, the best-risky outcome) is rare (say 10%), picking the sure option actually minimizes regret 90% of the time. The converse is true if the best-risky outcome is frequent (e.g., 90%): in this case, choosing the risky option minimizes regret 90% of the time. Therefore, to assess the anticipated regret hypothesis, we evaluated the effect of feedback as a function of the probability of the best-risky outcome (10%, 50% and 90%) and of the type of feedback (partial and complete). This analysis revealed a clear interaction (Fig. 3E; P < 0.01; Table 3.C), which was driven by the effect of feedback increasing as a function of the probability of the best-risky outcome in the complete feedback experiments (P < 0.001; Table 3.C) but being stable in the partial feedback experiments (P > 0.05; Table 3.C).
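The regret argument above reduces to simple arithmetic, sketched below. The sketch assumes a risky lottery (p, m; 1-p, 0) facing a sure amount strictly between 0 and m, so that the risky option is the better one after the fact exactly when its best outcome occurs; the function names are illustrative and do not come from the study.

```python
def regret_free_probability(choice, p_best):
    """Probability that the chosen option is not beaten by the forgone one.

    Assumes risky = (p_best, m; 1 - p_best, 0) and a sure amount s with
    0 < s < m, so the risky option wins ex post with probability p_best.
    """
    if choice == "risky":
        return p_best
    if choice == "sure":
        return 1.0 - p_best
    raise ValueError(f"unknown choice: {choice!r}")


def regret_minimizing_choice(p_best):
    """Option that avoids regret most often (ties reported as 'indifferent')."""
    risky = regret_free_probability("risky", p_best)
    sure = regret_free_probability("sure", p_best)
    if risky > sure:
        return "risky"
    if sure > risky:
        return "sure"
    return "indifferent"


for p in (0.1, 0.5, 0.9):
    print(p, regret_minimizing_choice(p))
```

With the three probability levels used in the task, the regret-minimizing option flips from the sure option at 10% to the risky option at 90%, which is why a regret account predicts an interaction with the probability of the best-risky outcome.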

Extending the results to moderate risk options?

Next, we attempted to clarify the psychological mechanisms involved in this effect. The fact that, in all experiments, the sure option is systematically a certain prospect leaves open the possibility that the effect of feedback is idiosyncratic to this framing. Indeed, certainty effects are known to weigh heavily on decisions and to create robust paradoxes (Tversky & Kahneman, 1986); yet, Plonsky and Teodorescu showed that their findings stay essentially unchanged when adding noise to sure lotteries in experience-only decisions (Plonsky & Teodorescu, 2020). In the next two experiments, we therefore assessed the robustness of our results to variation in task conditions, specifically to contexts where the non-risky option is not certain. To do so, we designed Exp.5 and Exp.6, where we substituted the sure option (which gives a specific amount with certainty) with a 50%-50% low-variance lottery with EV equal to that of the corresponding sure option (Fig. 1C). This new option remains relatively safe (given its low variance), yet now features an uncertain outcome. We shall refer to this option as the safe option, to differentiate it from both the sure and the risky options. In all other respects (i.e., except for the sure options being substituted with the corresponding safe ones), Exp.5 and Exp.6 were identical to Exp.2 (partial feedback) and Exp.4 (complete feedback), respectively (Fig. 1). Consolidating our conclusions, all the main results identified in Exp.1–4 were replicated in this modified setup. Notably, the presence of feedback increased risk-taking from the first trial, significantly in the partial feedback Exp.5 and numerically in the complete feedback Exp.6 (Table 4.C), and this dispositional effect interacted with the probability of the risky option in the complete feedback condition (Table 3.D). This result illustrates that our key findings are not idiosyncratic to some design choices, and might therefore reflect a generalizable psychological effect.
Leveraging this robustness, we completed our demonstration with a comprehensive assessment of our main claims, evaluated over our six experiments. This analysis confirmed that the dispositional effect induced at the first trial was robustly elicited in the experiments featuring feedback instructions (Exps 2,4,5,6; Fig. 4B) and vanished in the absence of the said instructions (Exps 1,3; Fig. 4A). Consistent with different psychological mechanisms operating under partial or complete feedback regimens, the effect of feedback was identical across all levels of the probability of the risky option in the partial feedback experiments (Exps 1,2,5; Fig. 4C), while it was significantly modulated by this factor in the complete feedback experiments (Figs. 4D & 4E). Finally, when pooling all the experiments and contrary to the most natural instantiation of the learning hypothesis, we found, if anything, a small negative impact (~1%) of the presence of feedback on the optimal choice rate (Table S3).

Extending the results to the loss domain

Having robustly established our results in six web-based experiments focusing on gain prospects, we proceeded to conduct a final experiment (Exp.7) with significant variations in these two core characteristics. First, it was performed in the lab (N = 30), which allowed us to further probe the robustness of our conclusions in a more controlled set-up, while concomitantly using higher incentives (Exp.1–6: £5.72(0.1); Exp.7: €14.4(0.35)). Second, Exp.7 featured a valence manipulation: prospects were framed either as potential gains or losses, in lieu of the risky-better vs. risky-worse dimension. In line with the previous experiments, we manipulated the probability of the risky option (three levels, namely 20%, 50%, and 80%) and the magnitude of the risky option (three levels in the gain domain, three in the loss domain) (Fig. 5A). Finally, Exp.7 featured partial feedback and block-wise instructions, such that participants knew before starting each block whether or not they would receive feedback in the upcoming block.

Exp.7 once again confirmed a positive effect of feedback on the rate of risky choices (P < 0.001; Fig. 5B; Tables 3.E & 4.A). Notably, valence neither had a main effect on the percentage of risky choices nor interacted with feedback (P > 0.05; Table 3.E; Fig. 5C). Note that the design of this experiment (the two options having equal EVs) renders the analysis of optimal choice rate irrelevant.

Furthermore, supporting the dispositional effect hypothesis, the difference in risky rates between feedback and no-feedback blocks was detectable from the very first trial (P < 0.01; Fig. 5D; Table 4.C). Thus, Exp.7 confirms that the anticipation of feedback (induced by the block instructions) changes behavior before any feedback is actually experienced, and generalizes this result to decision contexts involving losses as well as higher incentives.

Feedback-induced trial-by-trial adjustments

Having evidenced feedback-induced dispositional effects on risk preferences does not rule out that additional feedback-induced learning processes co-exist. However, feedback-induced learning processes may not be apparent when looking at the average risky choice rate, because their effect depends on the previous trial's choice and outcome. To investigate possible learning effects in trial-by-trial dynamics, we therefore analyzed the probability of repeating a risky choice as a function of the outcome received in the previous trial. The logic of this analysis is that virtually any instantiation of a learning process would induce a “positive recency” effect, meaning that the probability of repeating a risky choice should increase after receiving the best possible (non-zero) outcome, compared to receiving the worst possible (zero) outcome (Fig. 6A, left) (REF). We tested this hypothesis by analyzing this behavioral variable (probability of repeating a risky choice, p(R_t|R_t−1)) across all datasets. The results are in sharp contrast with the predictions of learning processes (Fig. 6B). In fact, the probability of repeating a risky choice was lower after receiving positive feedback (means p(R_t|R_t−1=0) = 0.69 ± 0.27, p(R_t|R_t−1>0) = 0.62 ± 0.26, paired-difference test P < 0.001). The analysis of trial-by-trial dynamics thus shows no support for any form of feedback-induced learning process, and rather strictly falsifies it. The observed behavioral pattern in fact exhibits negative recency, which is better understood as a manifestation of the gambler’s fallacy (in the laboratory: Ayton & Fischer, 2004; Teodorescu et al., 2013; Barron & Leider, 2010; in ecological settings: Clotfelter & Cook, 1993), according to which participants would move away from a recently rewarded risky choice because they (wrongly) assume that the subsequent likelihood of positive feedback will be lower (Fig. 6B and Fig. 6C).
This gambler’s fallacy interpretation is further confirmed by conditioning this analysis on the probability of the risky outcome (recall that our task featured three probability levels: 0.1, 0.5 and 0.9). This reveals that the effect is modulated by the underlying outcome probability: it is maximal when outcomes are rare (p = 0.1: subjects perceive the likelihood of receiving two positive outcomes in a row as lower than it really is) and absent when outcomes are common (p = 0.9). A regression analysis revealed a highly significant interaction between the probability of the risky outcome and the outcome in the previous trial (P < 0.001), in addition to a highly significant main effect of the outcome of the previous trial (P < 0.001). To sum up, not only do we demonstrate that the effect of feedback on risk preference precedes the reception of any feedback (and is therefore better understood as a change in attitude or disposition), but we also disprove any residual role for feedback-induced learning processes in the trial-by-trial dynamics by evidencing biased reactions to probabilistic and stochastic events akin to the gambler’s fallacy (Ayton & Fischer, 2004; Barron & Leider, 2010; Clotfelter & Cook, 1993; Teoderescu et al., 2013).
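The recency measure used in this analysis can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `repeat_rates`, the 0/1 coding of choices and the example block are hypothetical and are not taken from the authors' analysis code or data.

```python
from collections import defaultdict

def repeat_rates(choices, outcomes):
    """Return P(risky at t | risky at t-1), split by the previous risky outcome.

    choices  : list of 0/1 flags, 1 = risky option chosen on that trial
    outcomes : list of points obtained on each trial (0 = worst risky outcome)
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [n_repeats, n_cases]
    for t in range(1, len(choices)):
        if choices[t - 1] != 1:
            continue  # only trials that follow a risky choice are informative
        key = "after_win" if outcomes[t - 1] > 0 else "after_zero"
        counts[key][0] += choices[t]
        counts[key][1] += 1
    return {k: n_rep / n for k, (n_rep, n) in counts.items() if n > 0}

# Hypothetical 10-trial block (invented data, not from the experiments)
choices = [1, 1, 0, 1, 1, 1, 0, 1, 1, 0]
outcomes = [0, 40, 18, 0, 0, 40, 18, 40, 0, 18]
rates = repeat_rates(choices, outcomes)
```

In this sketch, positive recency corresponds to `rates["after_win"]` exceeding `rates["after_zero"]`, and negative recency to the reverse ordering.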

Checking our findings in a previous dataset

We started our investigation by noting some discrepancies in the literature concerning the directionality of the effect of feedback in decision-making under risk, which was otherwise generally understood as stemming from a learning process (Table 1). Across seven experiments, we found that the presence of feedback increases the propensity to take risks, with no detectable consequence on the optimal choice rate. By manipulating block-wise instructions (present vs. absent), we also found that the effects were mediated by a change of attitude (or disposition) of a different nature in the partial (consistent with curiosity) and complete (consistent with regret) feedback conditions; the analysis of trial-by-trial dynamics further ruled out that outcome-based learning plays a role in these processes.

To close the loop, we conclude by re-analyzing a previously published dataset that stands out as the one containing the largest sample size among the studies analyzed (N = 446) and the widest spectrum of decision problems (150 decision problems) (Erev et al., 2017). In order to replicate our analyses as comprehensively as possible, we restricted this re-analysis to the decision problems that feature identical or similar properties to ours, namely: decisions opposing a sure to a risky option (to define risky choice rate), decisions involving options with different expected values (to define optimal choice rate) and decisions featuring non-extreme probabilities (excluding 1% or 99%) for the risky option. We also excluded trivial decision problems in which one option dominates the other (see Supplementary Material for more details about the study and the decision problem selection).

This re-analysis of the Erev et al. (2017) data was consistent with our own results on the absence of a positive effect of feedback on the optimal choice rate (without feedback: 0.66 ± 0.21; with feedback: 0.65 ± 0.17; P > 0.05; Fig. 7A), as well as on the increase in risky choice rate (without feedback: 0.38 ± 0.22; with feedback: 0.42 ± 0.18; P < 0.001; Fig. 7B). Bearing in mind that Erev et al. (2017) featured complete feedback, we looked at the effect of feedback specifically for different probability levels (low: prob $\le$ 0.25; medium: 0.25 < prob < 0.75; high: prob $\ge$ 0.75) of the risky high-value outcome. As expected under the regret hypothesis, and consistent with our own findings, the effect of feedback scaled monotonically with the probability of the risky best outcome (Fig. 7C; low: -0.053 ± 0.317; medium: 0.052 ± 0.198; high: 0.147 ± 0.341). Moreover, we looked at the trial-by-trial dynamics and found a negative recency pattern, consistent with a gambler’s fallacy bias (Fig. 7D). Finally, while the manipulation of instructions was not present as such in Erev et al. (2017), the different orders of presentation of decision problems allowed us to perform an analogous analysis, which further supports a first-trial/dispositional effect (for details, see Supplementary Material, Erev et al. (2017) re-analysis, and Figure S1). Overall, all the behavioral analyses that we could replicate in Erev et al. (2017) led to similar results and conclusions as the ones performed on our own new data.

In the present study, we aimed at bringing clarity to the effect of feedback on description-based decision-making. This represents a pressing question in behavioral decision-making research as many (if not most) real-life decisions feature both description (in various forms) and feedback (we enjoy – or suffer – the consequences of our choices). Our investigations aimed to address two (related) research questions. First, the directionality of the effect on two main outcome measures, namely risk aversion and value maximization. Second, the cognitive mechanisms underpinning the effect. To address these questions, we devised a series of original experiments designed to overcome some frequent (if not systematic) limitations identified in previous studies.

Indeed, in the surveyed literature, we noted that experimental designs often fell short of orthogonalizing key dimensions of options, such as risk, probability and expected value. Critically, if in a given experiment the risky option always has the highest (or the lowest) expected value, the effect of feedback on choices cannot be attributed specifically to either a change in risk preference or an increase in expected value maximization (our definition of choice optimality) (this is the case, for example, in Jessup et al., 2008; Lejarraga & Gonzalez, 2011). Similarly, if whether the risky option is the EV-maximizing option depends on its probability level (Weiss-Cohen et al., 2016), it is not possible to assess whether the putative effect of feedback interacts with the probability level, nor to disentangle the effect of feedback from the effect of probability. By carefully orthogonalizing the presence of feedback, the expected value (risky-better or sure-better conditions), outcome probabilities and magnitudes, our experiments allowed us to address our questions of interest while addressing these shortcomings, thereby filling important gaps in the literature.

Regarding the directionality of the effect of feedback on decision-making under risk, we found that the presence of feedback increased the propensity to choose the risky option. Because it falls in line with the majority of studies surveyed in our literature review, we argue that this result can be considered credible, robust and replicable (Erev et al., 2017; Goyal & Miyapuram, 2019; Rigoli et al., 2019; Weiss-Cohen et al., 2016a; Yechiam & Barron, 2005). In light of the strength of our empirical evidence (including several experiments, different levels of outcomes, probabilities, gains and losses and an overall extremely large sample size), we believe that the four reports of marginal effects might be attributed to peculiarities of the design of the related studies, although further investigation would be needed to verify this.

Our experimental design also allowed us to confidently establish that the presence of feedback had no (beneficial or detrimental) effect on the optimal (i.e., EV-maximizing) choice rate: risky choice rate was increased by feedback regardless of whether or not the risky option was more advantageous. This question was somewhat overlooked in the literature, as most of the surveyed studies actually did not allow for testing this effect, either because they featured a risky option that was consistently associated with the highest expected value (Jessup et al., 2008; Lejarraga & Gonzalez, 2011), or because they used decision problems where the risky and the safe option had the exact same expected values (Goyal & Miyapuram, 2019; Josephs et al., 1992; Marchiori et al., 2015; Rigoli et al., 2019; Yechiam & Barron, 2005). Using the studies that allow testing this effect (unequal EVs), we essentially replicated our result of an absence of a systematic effect of feedback on optimal choice rate (Erev et al., 2017; Jessup et al., 2008; Lejarraga & Gonzalez, 2011). Of note, once restricted to the subset of decision problems that are relevant to our investigation, a re-analysis confirmed our main results in a large dataset originally published by Erev et al. (2017). Despite the absence of a general effect of feedback on optimal rates, both our data and the existing literature strongly support that feedback does lead to higher optimal rates when the EV-maximizing option gives the best outcome most of the time (and to lower optimal rates otherwise). This is an effect that has emerged previously in the literature even in the absence of descriptions, namely in experience-only decisions (Erev & Roth, 2014).

In addition to disambiguating the directionality (and amplitude) of feedback-induced changes in risk preferences, our study also provides further insights into the cognitive mechanisms underlying these effects. In the literature, the (more or less explicit) standard assumption is that experiencing decision outcomes affects the subjective beliefs characterizing the probability of realization of those outcomes (Aydogan & Gao, 2020; Jessup et al., 2008; Marchiori et al., 2015; Yechiam & Barron, 2005). In other terms, the effects of feedback are traditionally conceived as the result of a learning process due to the sampling experience. Although predominant, the learning hypothesis has rarely been empirically challenged and compared to plausible alternatives, one of which is that the presence of feedback changes the disposition of a subject to make a risky decision. This dispositional change can naturally be induced by the anticipation of the (informational and emotional) state that results from receiving the feedback. In order to evaluate the merit of this alternative hypothesis, we designed a new experimental manipulation that consisted in disclosing (or not) whether the upcoming block of trials would feature explicit feedback. This manipulation allowed us to reveal a simple but unambiguous behavioral signature of a dispositional change: when subjects were informed about the presence of feedback, its effect was present in the first trial, in decisions that preceded the disclosure of the first feedback. The first-trial effect was significant in 3 out of 4 experiments featuring block-wise instructions. In Exp. 6, the effect was not significant at the conventional statistical threshold but was still in the same direction (higher risk rate in the first trial when feedback was announced), and it was strongly present when averaging all experiments.
This result unambiguously falsifies the learning hypothesis as a sole determinant of the effect of feedback on risky decision-making, given that the effect of feedback is present before any actual learning could occur.

Several cognitive processes or psychological motives can actually underpin this dispositional effect toward feedback. Our results, when restricted to the experiments featuring partial feedback (i.e., only the outcome of the chosen option was presented), are consistent with a curiosity-driven dispositional change (Loewenstein, 1994): since the outcome of the sure option is known with certainty before making a choice, choosing the risky option is the only way to resolve the uncertainty characterizing the outcome of the whole decision situation. These results are in line with the epistemic curiosity literature, which shows that subjects attribute a positive utility to, and actively seek, uncertainty resolution (Buyalskaya & Camerer, 2020; Charpentier et al., 2018; Cogliati Dezza et al., 2022; Rigoli et al., 2019; Ruggeri et al., 2023). However, while providing a satisfactory interpretation of the effects on risky decisions in the partial feedback condition, curiosity-driven motives cannot account for the fact that these effects persist under complete feedback conditions (i.e., when the outcomes of both the chosen and the unchosen options were presented). In order to shed light on this incongruity, we analyzed the effect of feedback as a function of the probability level of the risky option. This analysis revealed that, while the effect of feedback seemed indistinguishable in the partial and complete feedback experiments at the aggregate level, clear differences emerged when splitting across probability levels. More specifically, the feedback-induced increase in risk-taking was a monotonic function of the probability of the risky option, a pattern consistent with the idea that increased risk-taking in the complete feedback experiments is induced by the willingness to reduce the chance of experienced regret (Gilovich & Medvec, 1995). The rationale underlying this interpretation can be broken down into two main steps.
First, regret is typically defined – in the context of value-based decision-making – by the difference between the obtained and the forgone outcome, and is, therefore, more salient in complete feedback environments (where this comparison can actually be directly observed). Second, the risky option minimizes the chance to experience regret, specifically when the probability associated with the best possible outcome is high. Although explaining our current results thereby requires assuming two different psychological processes operating in the two informational regimes, we note that regret and curiosity have been shown to interact in other experimental (and real-life) situations (Caldwell & Burger, 2009; Shani & Zeelenberg, 2007; van Dijk & Zeelenberg, 2007). Further research will be needed to better characterize the relation (cooperation or competition) between these two motives. Our results add to the behavioral literature, spanning from reinforcement learning to behavioral economics, showing that partial and complete feedback situations may elicit radically different mental processes (Klein et al., 2017; Li & Daw, 2011; Weiss-Cohen et al., 2021). Of note, as far as the analyses could be replicated, the results of Erev et al. (2017), which only featured complete feedback, also indicated a (somewhat stronger) interaction between the effect of feedback and the probability of the risky option, suggestive of regret minimization. In addition, even if a proper instruction-induced first-trial effect could not be tested in these data, a further in-depth analysis of risk propensity as a function of the task structure was also largely consistent with a feedback anticipation-induced dispositional change toward risk-seeking (results presented in the Supplementary Materials).

Positive evidence of dispositional influences (as manifested by the first-trial effect) did not rule out that learning processes could occur in parallel and influence choices. To assess this possibility, we looked at feedback-induced trial-by-trial choice adjustments. Specifically, we reasoned that virtually all instantiations of the learning hypothesis, be they rooted in Bayesian or reinforcement learning (Erev & Haruvy, 2015), should exhibit positive recency and thus predict that the probability of choosing a risky option should increase following a rewarded risky choice. In striking contrast, our analysis revealed a negative recency pattern: if anything, positive feedback in the preceding trial reduced the chance of repeating a risky option. This prima facie puzzling behavioral effect can be understood as a manifestation of what is called the gambler’s fallacy, i.e., the fact that human subjects tend to misrepresent the independence of probabilistic events. Previous literature examining the gambler’s fallacy in decisions involving both descriptions and experience has given mixed results (with Ayton & Fischer, 2004 and Teoderescu et al., 2013 providing support in favor of the effect, while Plonsky & Teodorescu, 2020 provide data which are inconsistent with it). Our re-analysis of the Erev et al. (2017) data provides clear support in favor of a negative recency/gambler’s fallacy effect. This interpretation is further confirmed by the fact that this effect interacted with the probability of the risky outcome, such that the negative recency effect was stronger when the probability of the risky outcome was lower: a situation where the gambler’s fallacy intuition dictates that two consecutive lucky strikes appear almost impossible.
These analyses have deeper implications because they not only rule out the possibility that residual feedback-induced learning effects are at play, but also suggest that the presence of a pre-existing explicit description of the probabilistic process may create prior (biased) expectations that prevent learning processes.

Many real-life situations involve decisions based both on descriptions and experience. A customer deciding to buy a product based on reviews and on her experience with it, a doctor prescribing medicines based on published efficacy as well as on her practical experience, a citizen deciding to pay her taxes (or not) based on the official deterrence policy (frequency of checks and magnitude of fines) and on her experience from previous years are all scenarios of repeated choices involving both descriptions and experience, and where the effect of feedback can therefore have consequential effects. Notably, delayed feedback was shown to significantly increase tax compliance compared to immediate feedback (Kogler et al., 2016), which could constitute an ecological validation of our results, under the assumption that delayed feedback incurs similar effects as no feedback and that tax compliance (resp. evasion) constitutes the safe (resp. risky) option.

To conclude, our findings shed new light on the behavioral effects of feedback in description-based decisions under risk, and on their underlying psychological mechanisms beyond learning. Because of the ubiquity of those situations, elucidating the effect of feedback in description-based scenarios can improve our understanding of apparent decision anomalies relevant for many real-life situations and give us the opportunity to improve our policies. In particular, our results suggest that, contrary to what common sense could dictate, providing feedback cannot be considered a one-size-fits-all behavioral intervention to improve decision-making (defined as expected value maximizing), notably because the effects of feedback are at least partially mediated by dispositional changes rather than purely driven by learning processes. Our results add to an increasing body of evidence highlighting complex interactions between description- and experience-based choices that are currently not well accounted for by standard models (Erev et al., 2022; Garcia et al., 2023).

Participants

For the six online experiments, we recruited a total of 620 participants (4x100 for Exp1-4, 104 for Exp5, and 116 for Exp6 | 300 females, 300 males, 20 sex not available | aged 29.29 $\pm$ 9.24 years) from an online platform (www.prolific.com). The research was carried out following the principles and guidelines for experiments including human participants provided in the Declaration of Helsinki (1964, revised in 2013). The INSERM Ethical Review Committee/IRB00003888 approved the study on 13 November 2018, and participants provided written informed consent before their inclusion.

For the laboratory experiment, 30 healthy participants completed the experiment (18 females | aged 28 $\pm$ 7.22 years). Given that Bellemare et al. (2014) assert that 20 subjects are needed in a within-subject analysis to achieve a power of 80%, a sample size of 30 is expected to provide sufficient power. The model chosen to analyze the dataset, the generalized mixed-effects model, also requires a smaller sample size than a traditional ANOVA. Participants were contacted via the “Relais d’information en sciences de la cognition” (RISC), part of the French “Centre national de la recherche scientifique”. Participants enroll on the platform and, as such, voluntarily agree to be contacted for scientific studies. Participants were recruited on the basis of their good understanding of French. They received an email through the mailing list containing a link to a questionnaire asking them for general information. After completing it, they were contacted by the experimenter to agree on the day of the experiment. The experiment was approved by the local Ethics Committee of Ecole Normale Supérieure, Paris. Participants completed the experiment, which lasted 45 minutes on average.

Exclusion criteria

To ensure the high quality of the data of the online experiments, we applied the following exclusion criteria:

participants with a missing trial or with a repetition of a trial in the test phase
participants with two or more submissions of more than 100 out of 270 trials
participants with fewer than 9 (out of 10) correct answers in the catch block (see below for more information on the catch block)
participants with an excessively long completion time (more than two standard deviations above the average)
participants who chose the right or the left option more than 95% of the time (low-quality data); note that the position of the options was counterbalanced, so this behavior is aberrant

After applying the above exclusion criteria, we were left with a total of 508 participants for the main analyses (80, 95, 86, 85, 80, 82 for experiments 1–6 respectively).
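The exclusion criteria listed above can be expressed as a simple filter. The following Python sketch is purely illustrative: the `Participant` record, its field names, the expected number of test-phase trials (250 here) and the use of the sample standard deviation are all assumptions introduced for the example, and the reading of the "two or more submissions" criterion is one possible interpretation.

```python
from dataclasses import dataclass
from statistics import mean, stdev

@dataclass
class Participant:           # hypothetical record; field names are illustrative
    pid: str
    n_valid_trials: int      # test-phase trials recorded exactly once
    n_long_submissions: int  # submissions containing more than 100 trials
    catch_correct: int       # correct answers in the catch block (out of 10)
    minutes: float           # total completion time
    left_rate: float         # proportion of choices made on the left side

def keep(p, time_cutoff, n_expected_trials):
    """True if participant p passes every exclusion criterion."""
    side_bias = max(p.left_rate, 1 - p.left_rate)
    return (p.n_valid_trials == n_expected_trials  # no missing/repeated trial
            and p.n_long_submissions < 2
            and p.catch_correct >= 9
            and p.minutes <= time_cutoff
            and side_bias <= 0.95)

def apply_exclusions(sample, n_expected_trials):
    times = [p.minutes for p in sample]
    cutoff = mean(times) + 2 * stdev(times)  # assumption: sample SD
    return [p for p in sample if keep(p, cutoff, n_expected_trials)]

# Hypothetical sample: eight clean participants plus three failing one criterion
sample = [Participant(f"ok{i}", 250, 1, 10, 28.0, 0.5) for i in range(8)]
sample += [
    Participant("low_catch", 250, 1, 8, 28.0, 0.5),
    Participant("side_bias", 250, 1, 10, 28.0, 0.97),
    Participant("too_slow", 250, 1, 10, 120.0, 0.5),
]
kept = apply_exclusions(sample, n_expected_trials=250)
```

Note that the completion-time cutoff is computed on the full sample before any exclusion; whether the original pipeline did the same is not stated in the text.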

For the model-fitting analyses, we also excluded participants with extreme risk attitudes (risky or safe option > 95% of the time) that would lead to poor fitting. Thus, for the model-fitting analyses, we were left with a total of 487 participants (77, 92, 84, 78, 76, 80 for experiments 1–6 respectively).

No participants were excluded from the analysis of the laboratory experiment.

Incentives

For the online experiments, participants received a fixed compensation of £3 for about half an hour of engagement (average completion time in minutes: 28.47 $\pm$ 8.68). Additionally, we incentivized participants to reveal their true preferences by offering a monetary bonus determined by the outcome of a randomly selected trial of the testing phase (average bonus won in British pounds: 2.68 $\pm$ 2.21).

For the laboratory experiment, participants received a show-up fee of 10€ for an average engagement of 45 minutes. To motivate the revelation of true preferences, an incentive system was set up on the basis of hypothetical gains or losses. Each decision yielded a payoff in points, drawn from the payoff distribution of the selected option. Participants were told that their goal was to maximize their number of points in the gain domain and to minimize their losses. The total number of points would determine their final payoff. The conversion rate between experimental units (EU) and euros was set according to the maximum and minimum possible payoffs at 0.02€/EU. The payoff structure was determined such that no participant would incur a loss. On average, they received 15€. At the end of the experiment, participants received their payment and signed a receipt.

Experimental design

The online experiments started with participants giving their consent to participate, followed by detailed instructions on the behavioral task, the structure of the experiment, and the compensation. Afterward, participants went through training which was a mini version (four blocks of five trials each) of the actual experiment featuring similar yet not identical decision problems to the ones of the actual experiment.

Then, participants started the actual experiment which consisted of 24 blocks (12 decision problems with and without feedback, as described in the main text) and a catch block (see below) in the middle of the actual experiment. The 24 blocks were comprised of 10 trials each and were randomized within and across participants. The actual experiment was divided into three sessions between which the participants were given the opportunity to take a self-paced break.

Each block started with a screen providing block instructions about the presence or absence of feedback in the upcoming block (Exps 2, 4, 5, 6) or just prompting the participants to start the block by clicking on the “Start Block” button (Exps 1 & 3). This step was self-paced. Then, the two options (their magnitude and probability) were presented side-by-side, with a clickable white square below each option. The position of the options (left or right) was randomized. Also, the relative vertical position of the magnitude and the probability (magnitude above and probability below or vice versa) was randomized across participants (but was constant within participants). Participants could make their choice at their own pace by clicking the white square below their preferred option. The outcome of the risky (or the safe) option was determined by an independent random draw; by definition, the outcome of the sure option was fixed. After the choice was made and the outcomes were determined, the outline of the selected square/option was highlighted and, inside the white square, the outcome of the chosen option was revealed (showing the obtained points) or hidden (showing a question mark) for 1500 ms. In experiments where complete feedback was used, the forgone outcome was revealed as well (showing the points of the unchosen option) or hidden (showing a question mark) in a light gray font (in contrast to the standard black font for the obtained outcome). Then, the next trial started, showing the same options (potentially in a different position). At the end of the block, a screen marking the end of the block was presented for 1500 ms.

In the middle of the actual experiment, a catch block featuring the trivial choice between a sure option (probability = 100%) giving 5 points and a sure option giving 30 points was presented. The catch block was otherwise identical to all the other blocks of the actual experiment. The related exclusion criterion (participants who scored below 9/10 correct answers in the catch block were excluded from the analysis) enabled us to ensure that participants understood and paid attention to the task.

At the end of the online experiments, participants were informed about the randomly selected trial and the associated monetary bonus, about their total compensation and they were redirected to the recruiting platform (Prolific) to formally complete their participation.

The laboratory experiment was conducted at the laboratory of the Département des Etudes Cognitives at Ecole Normale Supérieure, Paris. Participants were told their rights and gave their consent. They completed a mathematical questionnaire assessing their ability to perform multiplications. The experimenter explained the task using a visual support and read an example of a gamble that might be encountered by the decision maker, to ensure that they understood the possible outcomes of the gamble. Participants were told to maximize their gains and minimize their losses.

On the computer, participants were presented with an instruction screen indicating the type of information that they would be facing in the block. Individuals read “réalisation révélée essai par essai - feedback provided after each trial” or “réalisation cachée - feedback not provided” and were informed that, at the end of the block, they would be told the number of points gained or lost in that block. After reading the instruction, participants were presented with two options side by side on the screen. Each was associated with a geometrical shape and two labels: one gave information on the probability of occurrence of the outcome and the other on the magnitude of the outcome. The shapes and their color varied randomly throughout the experiment and across participants. Colors and the brightness of the screen were adjusted to prevent eye fatigue and for easy reading. Labels describing the magnitude and the probability of each lottery were displayed above and below the shapes. For half of the participants, the probabilities were always above and the magnitudes below. For the other half of the sample, the labels were reversed. The position of the risky and the sure lotteries was randomized within and across blocks. Participants could not anticipate their position and therefore had to stay attentive during the sequential choices. Choices were made using a mouse. After each choice, an arrow indicated the location of the choice and disappeared after 500 ms, to be replaced by a text in place of the label of the magnitude of the chosen lottery. The text either indicated the number of points gained or lost (e.g., -32pts) or hid them (XX pts); as in the online experiments, the outcome of the risky lottery was determined by an independent random draw. The disclosure of the points depended on the type of information provided at the beginning of the block. After 1.5 s, another screen appeared with the next round of choice, providing exactly the same gamble as before.
At the end of the sequence of 10 choices, the accumulated points were shown on the screen.

Most steps of the task were self-paced such that the participant could read and take as much time as needed for the instruction, the choice and the bonus screens. The experiment was divided in three sessions to avoid fatigue. Participants were given feedback on their total points accumulated at the end of each session. At the end of the experiment, their payoff was computed on the basis of their total points.

Partial feedback means that only the outcome of the chosen option is revealed. Complete feedback means that the outcomes of both the chosen and the unchosen option are revealed.

Decision Problems

By experimental design, the magnitude and the probability of the risky option were determined – and by definition the probability of the sure option was determined too. Given these, we computed the magnitude of the sure options, for Experiments 1–4, so that the EV difference of the risky and the sure option was at 5%. Namely,

$${mag}_{sure} := \left(1 + \left(1 - 2 \cdot \text{riskyBetter}\right) \cdot 0.05\right) {mag}_{risky} \, {prob}_{risky}$$

In the end, we rounded the mag_sure towards the direction that agrees with the riskyBetter factor (if riskyBetter = 1, we rounded mag_sure with the floor and if riskyBetter = 0 with the ceiling function).
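The computation and rounding rule above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `sure_magnitude` and the use of exact fractions (to avoid floating-point artifacts before rounding) are assumptions, and the example inputs simply reuse the magnitude (40, 60) and probability (0.1, 0.5, 0.9) levels mentioned in the Statistical analysis section.

```python
from fractions import Fraction
import math

def sure_magnitude(mag_risky, prob_risky, risky_better, ev_gap=Fraction(5, 100)):
    """Magnitude of the sure option given the risky option.

    prob_risky is passed as a string (e.g. "0.1") or a Fraction so that the
    arithmetic is exact; risky_better is 1 if the risky option has the higher
    expected value and 0 otherwise.
    """
    factor = 1 + (1 - 2 * risky_better) * ev_gap       # 0.95 or 1.05
    raw = factor * mag_risky * Fraction(prob_risky)     # unrounded magnitude
    # Round in the direction that preserves the riskyBetter ordering
    return math.floor(raw) if risky_better == 1 else math.ceil(raw)

examples = {
    (40, "0.5", 1): sure_magnitude(40, "0.5", 1),
    (40, "0.5", 0): sure_magnitude(40, "0.5", 0),
    (60, "0.1", 1): sure_magnitude(60, "0.1", 1),
    (60, "0.1", 0): sure_magnitude(60, "0.1", 0),
}
```

Because of the rounding step, the realized difference in expected value can deviate from exactly 5%, most visibly when the expected value of the risky option is small.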

Safe options (Experiments 5 & 6) had also defined probabilities (50/50) and were set such that ${EV}_{safe}={EV}_{sure}$. Additionally, one of the two magnitudes was set equal to the EV_risky, so that both outcomes of the safe option were above (below) EV_risky when riskyBetter=0 (= 1). So, the two outcomes for the safe option satisfied the following:

${mag}_{safe1}={EV}_{risky}$ and ${mag}_{safe2}=2\,{EV}_{sure}-{mag}_{safe1}$
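The second magnitude follows directly from the equal-EV constraint with 50/50 probabilities. As a short derivation:

$$\tfrac{1}{2}\,{mag}_{safe1} + \tfrac{1}{2}\,{mag}_{safe2} = {EV}_{safe} = {EV}_{sure} \;\Rightarrow\; {mag}_{safe2} = 2\,{EV}_{sure} - {mag}_{safe1}$$

For illustration only, take hypothetical values ${EV}_{risky} = 20$ and ${EV}_{sure} = 19$ (a riskyBetter = 1 case; these numbers are assumed for the example and are not reported in the text). Then ${mag}_{safe1} = 20$ and ${mag}_{safe2} = 2 \cdot 19 - 20 = 18$, so that neither safe outcome exceeds ${EV}_{risky}$ and their average equals ${EV}_{sure}$.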

In the laboratory experiment (Experiment 7), the EV of the two options was set to be equal. Hence, the magnitude of the sure option was determined given the magnitude and the probability of the risky one. Rounding to the closest integer was used in this experiment.

Statistical analysis

We analyzed two dependent variables: risky choice (1 if one chooses the risky option and 0 if one chooses the sure/safe one) and correct choice (1 if one chooses the Expected Value-maximizing choice; 0 otherwise). We ran a regression analysis using a Generalized Linear Mixed-Effects model (GLME). We used the canonical link function for response variables with the binomial distribution, namely the logit function. The method for estimating model parameters was maximum pseudo-likelihood (MPL). The analysis was run with MATLAB using the fitglme function. The predictors were feedback (absent = -1, present = 1), riskyBetter (no = -1, yes = 1), the probability of the risky option (.1 = -1, .5 = 0, .9 = 1), the magnitude of the risky option (40 = -1, 60 = 1) and trial (from 1 to 10). All predictors were treated as continuous variables. We included main effects and pairwise interactions between these predictors. Assuming idiosyncratic behavior at the level of the participants, we followed the standard practice of using the maximal random effects structure (Barr et al., 2013). Thus, we tested all the terms both for fixed and random effects.
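One way to write down the model described above is the following sketch, in which $y_{ij}$ is the binary dependent variable on trial $i$ for participant $j$ and $x_{k}$ denotes the $k$-th of the five coded predictors. The notation, the grouping of random effects by participant and their Gaussian distribution are assumptions made for illustration; the exact specification passed to fitglme is not given in the text.

$$\operatorname{logit} P(y_{ij} = 1) = (\beta_0 + b_{0j}) + \sum_{k} (\beta_k + b_{kj})\, x_{k,ij} + \sum_{k<l} (\beta_{kl} + b_{kl,j})\, x_{k,ij}\, x_{l,ij}, \qquad \mathbf{b}_j \sim \mathcal{N}(\mathbf{0}, \Sigma)$$

Here the $\beta$ terms are fixed effects (intercept, main effects and pairwise interactions) and the $b_{\cdot j}$ terms are the corresponding participant-level random effects, which is what a maximal random effects structure amounts to under these assumptions.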

Modeling

We also submitted our data to a modeling analysis involving a standard descriptive behavioral model, namely a variant of cumulative prospect theory (Tversky & Kahneman, 1992). Details about the model, the fitting procedures (which follow standard guidelines; Palminteri et al., 2017; Wilson & Collins, 2019), and the resulting statistical analyses can be found in the Supplementary Materials.

Funding

SP is supported by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program (RaReMem: 101043804), and by the Agence Nationale de la Recherche (CogFinAgent: ANR-21-CE23-0002-02; RELATIVE: ANR-21-CE37-0008-01; RANGE: ANR-21-CE28-0024-01). ML and AN are supported by an ERC Starting Grant (958671) awarded to ML. AN is additionally supported by the Foundation for Education and European Culture (IPEP).

Acknowledgments

We thank Enrico Diecidue, Florent Meyniel, Marion Rouault and the members of the Human Reinforcement Learning team of ENS for stimulating and helpful discussions, and Ryan Jessup for providing his data.

References

Allais, M. (1953). Le Comportement de l’Homme Rationnel devant le Risque: Critique des Postulats et Axiomes de l’Ecole Americaine. Econometrica, 21(4), 503–546. https://doi.org/10.2307/1907921
Aydogan, I., & Gao, Y. (2020). Experience and rationality under risk: Re-examining the impact of sampling experience. Experimental Economics, 23(4), 1100–1128. https://doi.org/10.1007/s10683-019-09641-y
Ayton, P., & Fischer, I. (2004). The hot hand fallacy and the gambler’s fallacy: Two faces of subjective randomness? Memory & Cognition, 32(8), 1369–1378. https://doi.org/10.3758/BF03206327
Barron, G., & Leider, S. (2010). The role of experience in the Gambler’s Fallacy. Journal of Behavioral Decision Making, 23(1), 117–129. https://doi.org/10.1002/bdm.676
Bell, D. E. (1982). Regret in Decision Making under Uncertainty. Operations Research, 30(5), 961–981.
Bellemare, C., Bissonnette, L., & Kröger, S. (2014). Statistical Power of Within and Between-Subjects Designs in Economic Experiments (IZA Discussion Paper 8583). Institute of Labor Economics (IZA). https://econpapers.repec.org/paper/izaizadps/dp8583.htm
Buyalskaya, A., & Camerer, C. F. (2020). The neuroeconomics of epistemic curiosity. Current Opinion in Behavioral Sciences, 35, 141–149. https://doi.org/10.1016/j.cobeha.2020.09.006
Caldwell, D. F., & Burger, J. M. (2009). Learning about unchosen alternatives: When does curiosity overcome regret avoidance? Cognition and Emotion, 23(8), 1630–1639. https://doi.org/10.1080/02699930802472241
Charpentier, C. J., Bromberg-Martin, E. S., & Sharot, T. (2018). Valuation of knowledge and ignorance in mesolimbic reward circuitry. Proceedings of the National Academy of Sciences, 115(31), E7255–E7264. https://doi.org/10.1073/pnas.1800547115
Clotfelter, C. T., & Cook, P. J. (1993). Notes: The “Gambler’s Fallacy” in Lottery Play. Management Science. https://doi.org/10.1287/mnsc.39.12.1521
Cogliati Dezza, I., Maher, C., & Sharot, T. (2022). People adaptively use information to improve their internal states and external outcomes. Cognition, 228, 105224. https://doi.org/10.1016/j.cognition.2022.105224
Cohen, D., Plonsky, O., & Erev, I. (2020). On the impact of experience on probability weighting in decisions under risk. Decision, 7(2), 153–162. https://doi.org/10.1037/dec0000118
Coricelli, G., Critchley, H. D., Joffily, M., O’Doherty, J. P., Sirigu, A., & Dolan, R. J. (2005). Regret and its avoidance: A neuroimaging study of choice behavior. Nature Neuroscience, 8(9), 1255–1262. https://doi.org/10.1038/nn1514
Couto, J., Maanen, L. van, & Lebreton, M. (2020). Investigating the origin and consequences of endogenous default options in repeated economic choices. PLOS ONE, 15(8), e0232385. https://doi.org/10.1371/journal.pone.0232385
Erev, I., Ert, E., Plonsky, O., Cohen, D., & Cohen, O. (2017). From anomalies to forecasts: Toward a descriptive model of decisions under risk, under ambiguity, and from experience. Psychological Review, 124(4), 369–409. https://doi.org/10.1037/rev0000062
Erev, I., & Haruvy, E. (2015). Learning and the Economics of Small Decisions (Chapter 10). In J. H. Kagel & A. E. Roth (Eds.), The Handbook of Experimental Economics, Volume 2 (Vol. 2, pp. 638–716). Princeton University Press. https://www.jstor.org/stable/j.ctvc77b40
Erev, I., & Roth, A. E. (2014). Maximization, learning, and economic behavior. Proceedings of the National Academy of Sciences, 111(Suppl 3), 10818–10825. https://doi.org/10.1073/pnas.1402846111
Erev, I., Yakobi, O., Ashby, N. J. S., & Chater, N. (2022). The impact of experience on decisions based on pre-choice samples and the face-or-cue hypothesis. Theory and Decision, 92(3), 583–598. https://doi.org/10.1007/s11238-021-09856-7
Esponda, I., Vespa, E., & Yuksel, S. (n.d.). Mental Models and Learning: The Case of Base-Rate Neglect. American Economic Review. https://doi.org/10.1257/aer.20201004
Fantino, E., & Navarro, A. (2012). Description–experience Gaps: Assessments in Other Choice Paradigms. Journal of Behavioral Decision Making, 25(3), 303–314. https://doi.org/10.1002/bdm.737
Garcia, B., Cerrotti, F., & Palminteri, S. (2021). The description–experience gap: A challenge for the neuroeconomics of decision-making under uncertainty. Philosophical Transactions of the Royal Society B: Biological Sciences, 376(1819), 20190665. https://doi.org/10.1098/rstb.2019.0665
Garcia, B., Lebreton, M., Bourgeois-Gironde, S., & Palminteri, S. (2023). Experiential values are underweighted in decisions involving symbolic options. Nature Human Behaviour, 7. https://doi.org/10.1038/s41562-022-01496-3
Gilovich, T., & Medvec, V. H. (1995). The experience of regret: What, when, and why. Psychological Review, 102, 379–395. https://doi.org/10.1037/0033-295X.102.2.379
Gottlieb, J., & Oudeyer, P.-Y. (2018). Towards a neuroscience of active sampling and curiosity. Nature Reviews Neuroscience, 19(12). https://doi.org/10.1038/s41583-018-0078-0
Goyal, S., & Miyapuram, K. P. (2019). Feedback Influences Discriminability and Attractiveness Components of Probability Weighting in Descriptive Choice Under Risk. Frontiers in Psychology, 10. https://doi.org/10.3389/fpsyg.2019.00962
Hertwig, R., & Erev, I. (2009). The description–experience gap in risky choice. Trends in Cognitive Sciences, 13(12), 517–523. https://doi.org/10.1016/j.tics.2009.09.004
Hertwig, R., & Wulff, D. U. (2022). A Description–Experience Framework of the Psychology of Risk. Perspectives on Psychological Science, 17(3), 631–651. https://doi.org/10.1177/17456916211026896
Jessup, R. K., Bishara, A. J., & Busemeyer, J. R. (2008). Feedback produces divergence from prospect theory in descriptive choice. Psychological Science, 19(10), 1015–1022. https://doi.org/10.1111/j.1467-9280.2008.02193.x
Josephs, R. A., Larrick, R. P., Steele, C. M., & Nisbett, R. E. (1992). Protecting the self from the negative consequences of risky decisions. Journal of Personality and Social Psychology, 62(1), 26–37. https://doi.org/10.1037//0022-3514.62.1.26
Kahneman, D., & Tversky, A. (1979). Prospect Theory: An Analysis of Decision under Risk. Econometrica, 47(2), 263–291. https://doi.org/10.2307/1914185
Klein, T. A., Ullsperger, M., & Jocham, G. (2017). Learning relative values in the striatum induces violations of normative decision making. Nature Communications, 8(1), 16033. https://doi.org/10.1038/ncomms16033
Kogler, C., Mittone, L., & Kirchler, E. (2016). Delayed feedback on tax audits affects compliance and fairness perceptions. Journal of Economic Behavior & Organization, 124, 81–87. https://doi.org/10.1016/j.jebo.2015.10.014
Lejarraga, T., & Gonzalez, C. (2011). Effects of feedback and complexity on repeated decisions from description. Organizational Behavior and Human Decision Processes, 116(2), 286–295. https://doi.org/10.1016/j.obhdp.2011.05.001
Lejarraga, T., & Hertwig, R. (2021). How experimental methods shaped views on human competence and rationality. Psychological Bulletin, 147, 535–564. https://doi.org/10.1037/bul0000324
Li, J., & Daw, N. D. (2011). Signals in Human Striatum Are Appropriate for Policy Update Rather than Value Prediction. Journal of Neuroscience, 31(14), 5504–5511. https://doi.org/10.1523/JNEUROSCI.6316-10.2011
Loewenstein, G. (1994). The psychology of curiosity: A review and reinterpretation. Psychological Bulletin, 116, 75–98. https://doi.org/10.1037/0033-2909.116.1.75
Loomes, G., & Sugden, R. (1982). Regret Theory: An Alternative Theory of Rational Choice Under Uncertainty. The Economic Journal, 92(368), 805–824. https://doi.org/10.2307/2232669
Marchiori, D., Di Guida, S., & Erev, I. (2015). Noisy retrieval models of over- and undersensitivity to rare events. Decision, 2(2), 82–106. https://doi.org/10.1037/dec0000023
Newell, B. R., & Rakow, T. (2007). The role of experience in decisions from description. Psychonomic Bulletin & Review, 14(6), 1133–1139. https://doi.org/10.3758/BF03193102
Palminteri, S., Wyart, V., & Koechlin, E. (2017). The Importance of Falsification in Computational Cognitive Modeling. Trends in Cognitive Sciences, 21(6), 425–433. https://doi.org/10.1016/j.tics.2017.03.011
Plonsky, O., & Teodorescu, K. (2020). The influence of biased exposure to forgone outcomes. Journal of Behavioral Decision Making, 33(3), 393–407. https://doi.org/10.1002/bdm.2168
Prelec, D. (1998). The Probability Weighting Function. Econometrica, 66(3), 497–527. https://doi.org/10.2307/2998573
Rigoli, F., Martinelli, C., & Shergill, S. S. (2019). The role of expecting feedback during decision-making under risk. NeuroImage, 202, 116079. https://doi.org/10.1016/j.neuroimage.2019.116079
Ruggeri, A., Stanciu, O., Pelz, M., Gopnik, A., & Schulz, E. (2023). Preschoolers search longer when there is more information to be gained. Developmental Science, e13411. https://doi.org/10.1111/desc.13411
Shani, Y., & Zeelenberg, M. (2007). When and why do we want to know? How experienced regret promotes post-decision information search. Journal of Behavioral Decision Making, 20(3), 207–222. https://doi.org/10.1002/bdm.550
Sharot, T., & Sunstein, C. R. (2020). How people decide what they want to know. Nature Human Behaviour, 4(1). https://doi.org/10.1038/s41562-019-0793-1
Teodorescu, K., Amir, M., & Erev, I. (2013). The experience-description gap and the role of the inter decision interval. Progress in Brain Research, 202, 99–115. https://doi.org/10.1016/B978-0-444-62604-2.00006-X
Tversky, A., & Kahneman, D. (1986). Rational Choice and the Framing of Decisions. The Journal of Business, 59(4), S251–S278.
Tversky, A., & Kahneman, D. (1992). Advances in prospect theory: Cumulative representation of uncertainty. Journal of Risk and Uncertainty, 5(4), 297–323. https://doi.org/10.1007/BF00122574
van Dijk, E., & Zeelenberg, M. (2007). When curiosity killed regret: Avoiding or seeking the unknown in decision-making under uncertainty. Journal of Experimental Social Psychology, 43(4), 656–662. https://doi.org/10.1016/j.jesp.2006.06.004
Wakker, P., & Tversky, A. (1993). An Axiomatization of Cumulative Prospect Theory. Journal of Risk and Uncertainty, 7(2), 147–175.
Weiss-Cohen, L., Konstantinidis, E., & Harvey, N. (2021). Timing of descriptions shapes experience-based risky choice. Journal of Behavioral Decision Making, 34(1), 66–84. https://doi.org/10.1002/bdm.2197
Weiss-Cohen, L., Konstantinidis, E., Speekenbrink, M., & Harvey, N. (2016a). Incorporating conflicting descriptions into decisions from experience. Organizational Behavior and Human Decision Processes, 135, 55–69. https://doi.org/10.1016/j.obhdp.2016.05.005
Weiss-Cohen, L., Konstantinidis, E., Speekenbrink, M., & Harvey, N. (2016b). Incorporating conflicting descriptions into decisions from experience. Organizational Behavior and Human Decision Processes, 135, 55–69. https://doi.org/10.1016/j.obhdp.2016.05.005
Wilson, R. C., & Collins, A. G. (2019). Ten simple rules for the computational modeling of behavioral data. eLife, 8, e49547. https://doi.org/10.7554/eLife.49547
Yechiam, E., & Barron, G. (2005). The Role of Personal Experience in Contributing to Different Patterns of Response to Rare Terrorist Attacks. Journal of Conflict Resolution, 49. https://doi.org/10.1177/0022002704270847
Zeelenberg, M., Beattie, J., van der Pligt, J., & de Vries, N. K. (1996). Consequences of regret aversion: Effects of expected feedback on risky decision making. Organizational Behavior and Human Decision Processes, 65(2), 148–158. https://doi.org/10.1006/obhd.1996.0013

Footnotes

It should not be confused with ambiguity, which concerns the information about the underlying distribution of each option; as mentioned above, this information is fully disclosed to the subjects prior to choice and is independent of the realizations of outcomes/feedback.
One broad study (Erev et al., 2017) also contains other types of decision problems. In our discussion, we refer only to the subset of problems involving a risky versus a sure option.
Many of the reported rates are not presented as such in the relevant publications. We obtained them by combining elements of those publications or by re-analyzing raw data that were either available online or kindly provided to us (the data of Jessup et al., 2008, belong to the latter category, and we thank the authors for sharing them); see the captions of Tables S1 & S2 for more details.
We also note that, in a non-negligible fraction of considered studies, the effects were very small (1%-2%).
Yet, in Erev et al. (2017), only one decision problem in the loss domain was applicable to our analysis (see Supplementary Material, "Erev et al. (2017) re-analysis", for more details about the study and the selection of decision problems).
In most cases, high probabilities are considered those greater than or equal to 0.75.
Among other things, our design improves on that of Erev et al. (2017) by equalizing the number of repetitions in the two conditions (their design featured five repetitions for no-feedback and twenty for feedback) and by counterbalancing the order of the conditions (in their design, no-feedback trials always preceded feedback trials).
While Exp4 qualitatively replicates the pattern of Exp2, Exp3 shows a one-trial delay compared to the analogous Exp1: the difference between feedback and no-feedback risky rates becomes significant in the 3rd trial instead of the 2nd. Yet, numerically, the tendency already appears from the 2nd trial.
Jessup et al. (2008) and Lejarraga & Gonzalez (2011) used asymmetrical probabilities for the risky option giving the best outcome (the high probability was .8, while the low one was the more extreme .05), which, combined with the dependence of the effect on high versus low probabilities (see the Literature Review section), can explain the small effect. Josephs et al. (1992) and Marchiori et al. (2015) also differ from the other designs in important respects: the former was the only study with delayed feedback (given at the end of the experiment), and the latter was the only study in which each decision problem was presented in only one of the two conditions, rather than both with and without feedback within-subjects.
A notable exception is Rigoli et al. (2019). However, their design deviates substantially from the others in that it featured "periods" of feedback or no-feedback trials consisting of single-shot (i.e., not repeated) decisions. This feature prevented the learning hypothesis from being tested in their experiment, which accordingly cannot falsify it.
Critically, this is also true for decision-by-sampling models, such as the BEAST model proposed by Erev et al. (2017), which suppose that, once feedback is available, virtual outcomes are sampled from the empirical distribution. Also of note, while the BEAST model does stress the role of regret in decision-making, to our knowledge it is not equipped to deal with the dispositional, anticipatory effects highlighted in our data.
The total number of trials (270) includes training trials. When a participant had more than one submission (for example, one with only a few trials and one complete), we kept the complete submission.

Tables 1 to 4 are available in the Supplementary Files section.

The authors declare no competing interests.
