Ethics information
The project was approved by Cardiff University School of Psychology’s Ethics Committee (EC.19.07.16.5653GR2A4). Participants gave consent at the beginning of the study and were paid and debriefed after participation.
Preregistration
The study was pre-registered at https://osf.io/huz4q/.
Design and materials
Myth selection: We ran two short surveys to select real-world COVID-19 myths as materials. Together these surveys yielded 11 myths for the main study (Table SI.M.1). The first survey tested a list of 39 myths sourced from the WHO’s COVID-19 myth-busters list (29) and fact-checker websites (7,14). Myths were included if they had the potential to influence readers’ behaviour. Fifty participants recruited from the online participant panel Prolific (50) rated how much they agreed with each myth, alongside four COVID-19 facts, in a random order, by placing a pointer on a visual analogue scale from strongly disagree (0) to strongly agree (100). Myths with average agreement above 20% were selected for inclusion in this study. This process yielded five myths.
We repeated the survey with a new set of 18 behaviourally relevant myths from the WHO (29) and fact-checker websites (7,14,51,52) and an additional 50 participants (Prolific). One participant was removed for giving the same response (50) to all questions. Again, we selected all myths with average agreement above 20%, except for one (the effects of Vitamin D), which was excluded because of subsequent scientific debate about whether it was partially true. This yielded six myths.
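For concreteness, the selection rule can be expressed in a few lines of R. This is a minimal sketch, not the survey-processing script: it assumes a long-format data frame named ratings, with hypothetical columns myth (statement identifier) and agreement (the 0-100 rating).

    library(dplyr)

    # Mean agreement per myth; keep myths rated above 20 on the 0-100 scale.
    selected_myths <- ratings %>%
      group_by(myth) %>%
      summarise(mean_agreement = mean(agreement, na.rm = TRUE)) %>%
      filter(mean_agreement > 20)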
Correction graphics: Graphics were designed to conform to current myth-busting advice (35), aside from the manipulated format. Each graphic (Fig. 1) therefore contained source information (including an NHS logo and a COVID-19 logo), a supporting explanation statement that gave an alternative to the myth (Table SI.M.1), and an image.
Agreement questions: Participants rated their agreement with myths in response to questions that differed in style from the correction graphics, to avoid pattern matching between the two (Table SI.M.1). We also included 4 fact statements, to encourage participants to use the full scale, and 2 catch questions to eliminate participants who were not reading the questions or who used the scale in the wrong direction (Table SI.M.2).
Demographic questions: Participants were asked about age, education, ethnicity, vaccine concern, vaccine intentions and COVID-19 experiences (Table SI.M.3).
Procedure
Baseline: Participants completed a short set of questions measuring demographic information and personal experiences with COVID-19. They then answered the 17 agreement questions (11 myths, 4 facts, 2 catch trials), in a random order, using a six-point Likert scale.
Intervention: Immediately following the agreement questions, participants were randomly assigned to one of three correction formats (question-answer, fact-only or fact-myth). They then viewed the corresponding 11 correction graphics.
Timepoint 1: Immediately following the correction phase, participants again rated agreement with the 17 statements, in a random order.
Timepoint 2 (delay): Participants completed timepoint 2 between 6 and 20 days later (M = 8.9 days), again rating agreement with the 17 statements in a new random order.
Participants
We recruited participants representative of the UK population for age and gender via Qualtrics, an online participant platform. Power calculations are described in the pre-registration. The main dataset consisted of 2215 participants who completed baseline and timepoint 1, of whom 1329 completed timepoint 2 (an attrition rate of 40%). Of these 1329, 38 were excluded for not meeting the minimum age requirement (18 years) or for failing the catch trials.
The n for the main analysis was therefore 1291. Of these, 440 participants were randomly assigned to the question-answer condition, 435 to fact-only and 416 to fact-myth. 47% identified as “man” and 52% as “woman”. Age ranged from 18 to 89 years: 5% were 18-24 years, 16% were 25-34 years, 18% were 35-44 years, 24% were 45-54 years, 19% were 55-64 years, and 18% were above 65 years. 6% identified as Asian, 1.5% as Black, 89.6% as White and 2.9% as Mixed/multiple ethnic groups.
Replication data for timepoint 1: We also collected a partial dataset in which timepoint 2 was not collected (due to an error). These data were collected three weeks before the main dataset, and we use them to test whether the main timepoint 1 results replicate. 2275 participants were recruited and 191 were excluded under the criteria described above. 691 participants were randomly assigned to the question-answer condition, 687 to fact-only and 704 to fact-myth. 48% identified as “man” and 51% as “woman”. Age ranged from 18 to 91 years: 14% were 18-24 years, 21% were 25-34 years, 19% were 35-44 years, 19% were 45-54 years, 15% were 55-64 years, and 13% were above 65 years. 7.7% identified as Asian, 2.2% as Black, 0.3% as Middle Eastern, 86% as White and 2.8% as Mixed/multiple ethnic groups. 24% reported they were in a COVID-19 risk group, 6.6% had had a positive COVID-19 test, and 8.5% reported they were healthcare workers.
Analysis approach
Linear mixed-effects (LME) models were used to analyse the data. Analysis was conducted in R using lme4 (53), lmerTest (54) and lmer_alt() from the afex package (55). Random effects for participants and myths were included in the models, allowing us to generalise across both. Effects are reported as treatment contrasts, with the reference level set according to the reported comparison (e.g. the reported effect of question-answer vs fact-myth takes question-answer as the reference level). p-values were obtained via the Satterthwaite approximation.
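To make this setup concrete, the sketch below shows how such a model can be fitted in R. It is illustrative only: the data frame dat and its columns (agreement, correction, participant, myth) are hypothetical names, not those of our analysis script. With lmerTest attached, summary() reports Satterthwaite-approximated p-values by default, and relevel() sets the reference level of the treatment contrast.

    library(lme4)      # mixed-model fitting
    library(lmerTest)  # Satterthwaite-approximated p-values
    library(afex)      # lmer_alt(), used below to suppress correlations

    # Report effects relative to the question-answer condition.
    dat$correction <- relevel(factor(dat$correction), ref = "question-answer")

    m <- lmer(agreement ~ correction + (1 | participant) + (1 | myth),
              data = dat)
    summary(m)  # fixed-effect t-tests use the Satterthwaite approximation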
We obtained model convergence by starting with a maximal random-effects structure (as per the advice of Barr, Levy, Scheepers, & Tily (56)) and, if that model did not converge, removing the correlations between intercepts and slopes (see (57,58)). Model 1 (see below) converged with the maximal random-effects structure, but Models 2 and 3, which had many more parameters, required suppression of the correlations between intercepts and slopes. This led to successful model convergence in all cases. Thus, all models included slopes and intercepts for all factors where the design allowed, but not necessarily the correlations between intercepts and slopes.
Even with convergence, singularity warnings remained. We therefore tried simplifying the models by removing further random-effects structure. However, this led to models that either failed to converge or were over-simplified (i.e. ignored obvious structure in the data) and consequently risked being anti-conservative (e.g. (59)). Moreover, wherever we obtained a simplified model that both converged and produced no singularity warnings, the significant effects present in the more complex models were also present in the simpler models. We therefore report the results of the most complex models that converged, as described below.
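In lme4 these diagnostics are available directly; a brief sketch, where m is a hypothetical fitted model object:

    library(lme4)

    isSingular(m)      # TRUE if the random-effects covariance matrix is singular
    summary(rePCA(m))  # principal components of the random effects; near-zero
                       # components indicate over-parameterised structure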
Research Question 1: To test whether each correction format lowered agreement scores at each timepoint, we used:
Model 1: Myth_agreement ~ timepoint + (1+timepoint|participant) + (1+timepoint|myth)
Where Myth_agreement is the outcome variable and timepoint is a fixed factor (baseline, timepoint 1, timepoint 2). Random effects (identified to the right of the pipe symbol, |) include intercepts (the 1 to the left of the |) and slopes (the factors named after 1+), together with the correlations between the two. Model 1 was applied to each correction format separately (one model to question-answer, one to fact-only, etc.).
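A minimal sketch of this per-format fitting in R (again using the hypothetical data frame dat; the correction column and its level labels are assumptions):

    library(lmerTest)

    for (fmt in c("question-answer", "fact-only", "fact-myth")) {
      m1 <- lmer(Myth_agreement ~ timepoint +
                   (1 + timepoint | participant) + (1 + timepoint | myth),
                 data = subset(dat, correction == fmt))
      print(summary(m1))
    }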
Research Question 2: To compare the correction formats we used:
Model 2: Myth_agreement ~ correction*baseline*timepoint + (1+timepoint|participant) + (1+correction*baseline*timepoint||myth)
Where correction is a fixed factor with three levels (question-answer, fact-only, fact-myth), baseline is a continuous covariate corresponding to baseline scores for each participant and myth, and timepoint is a fixed factor with two levels (timepoint 1 and timepoint 2). The * operator expands to all main effects and interactions of the listed factors, for both the fixed and the random effects. Correlations between intercepts and slopes were suppressed for the myth random effects (identified by the double pipe, ||) to solve convergence problems (using lmer_alt(); see Analysis approach above).
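A sketch of this fit using afex::lmer_alt(), which expands factors into numeric contrast variables so that the double-pipe syntax genuinely suppresses the intercept-slope correlations (dat is again a hypothetical data frame):

    library(afex)

    m2 <- lmer_alt(Myth_agreement ~ correction * baseline * timepoint +
                     (1 + timepoint | participant) +
                     (1 + correction * baseline * timepoint || myth),
                   data = dat)
    summary(m2)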
To replicate the results for timepoint 1 with the secondary set of participants, we restricted Model 2 to timepoint 1:
Model 2(a): Myth_agreement ~ correction*baseline + (1|participant) + (1+correction*baseline||myth)
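Correspondingly, a sketch of Model 2(a), fitted only to the timepoint 1 data (the subsetting condition assumes a timepoint column coded as in the sketches above):

    library(afex)

    m2a <- lmer_alt(Myth_agreement ~ correction * baseline +
                      (1 | participant) + (1 + correction * baseline || myth),
                    data = subset(dat, timepoint == "timepoint 1"))
    summary(m2a)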