Virtual reality scenarios
Two virtual reality (VR) scenarios were designed in 3ds Max 2015 (Autodesk, Inc.), implemented in XVR 2.0 and displayed through an Oculus Rift DK1 head-mounted display (HMD).
The first scenario (baseline task) was designed to verify the participants’ general ability to estimate distances and to control for any individual differences in sensitivity to depth in a VR environment. In this scenario, a flag was depicted in an open space in front of the participant, placed at various distances (from 0.5 to 8 m, in steps of 0.5 m). The target distances used in the main task (i.e. 1, 2, 3, 4 and 5 m) were each shown five times, while the other 11 distances were shown only once, solely to reduce learning effects and habituation. A total of 36 stimuli were thus presented. The participants were requested to verbally estimate the distance of each flag from the perceived position of their own body. The data related to the target distances were then used to normalise the individual’s perceptual errors in the main experimental task.
In the main experimental scenario, two features of the stimuli previously found to impact the estimation of distances (Proffitt, 2006b; Scandola et al., 2019) were manipulated: i) the distance of the target and ii) the inclination of the surface. The main scenario was therefore identical to the baseline scenario, except that the flag was placed on a ramp shown in front of the participant. The stimuli thus differed in terms of the inclination of the ramp, which could be mild (4% corresponding to 7°) or steep (24%, i.e. 13.5°, Fig. 1), as well as in terms of the distance of the flag from the participant (i.e. 1, 2, 3, 4 or 5 m). Each distance/inclination combination was presented 5 times in random order, for a total of 50 stimuli in each of the two experimental conditions (i.e. with the participant sitting or standing, see below). In both scenarios, each stimulus was shown for 1 s.
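The composition of the stimulus set in the main scenario can be illustrated with a minimal Python sketch. It only reproduces the design described above (5 distances × 2 inclinations × 5 repetitions, shuffled); the actual randomisation routine used in XVR is not described, so the function name, the labels and the seed handling are assumptions made for illustration.

```python
import itertools
import random

DISTANCES_M = [1, 2, 3, 4, 5]        # target distances in metres
INCLINATIONS = ["mild", "steep"]     # the two ramp inclinations (labels are illustrative)
REPETITIONS = 5                      # presentations of each combination


def build_trial_list(seed=None):
    """Return a shuffled list of (distance, inclination) trials for one condition."""
    rng = random.Random(seed)
    combinations = list(itertools.product(DISTANCES_M, INCLINATIONS))
    trials = combinations * REPETITIONS   # 10 combinations x 5 repetitions = 50 trials
    rng.shuffle(trials)                   # random presentation order
    return trials
```

Calling `build_trial_list()` once for the sitting condition and once for the standing condition would yield the two sets of 50 stimuli.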
Questionnaires
In addition to the measures collected during the experiment, data on the types of pain were recorded by means of a questionnaire (Scandola et al., 2017). Furthermore, due to the nature of the experimental task, the potential impact of the participants’ motor imagery abilities was checked (Roberts et al., 2008). Finally, as mood disorders have been reported in fibromyalgia patients (Häuser et al., 2015), depression and anxiety were controlled (Bjelland et al., 2002).
Assessment of Pain.
To obtain a comprehensive evaluation of the different typologies of pain, the participants filled in the Verona Pain Questionnaire (VPQ, Scandola et al., 2017). The VPQ differentiates between musculoskeletal, visceral and neuropathic pain, and its scores correlate highly with both the Brief Pain Inventory (Caraceni et al., 1996) and the Douleur Neuropathique 4 scale (Bouhassira et al., 2005).
In particular, the VPQ requires a self-evaluation of the minimum and maximum level of musculoskeletal, neuropathic, and visceral pain in the last two weeks, on a scale ranging from 0 (no pain at all) to 10 (worst imaginable pain) (Scandola et al., 2017).
Assessment of Motor Imagery.
To estimate Motor Imagery abilities, two subscales of the Vividness of Movement Imagery Questionnaire-2 (VMIQ, Isaac et al., 1986; Roberts et al., 2008) were used. The External Visual Imagery subscale (VMIQ-EVI) asks the participants to imagine themselves performing 12 actions as if they were looking at themselves from a third-person perspective (“as if you were watching yourself from an external position”). The Kinaesthetic Imagery subscale (VMIQ-K) requires participants to imagine the somatosensory feelings associated with the execution of the same actions. In both subscales, the actions are not actually performed but only imagined. Thus, the two subscales involve different cognitive processes, specifically visual imagery in the former and the simulation of bodily sensations in the latter (Ionta et al., 2010; Moro et al., 2021; Scandola et al., 2017). The participants are asked to rate the vividness of each imagined action on a 5-point Likert scale (with 1 = perfectly vivid imagined action and 5 = not imagined at all). The sum of the ratings is taken as the final score for each subscale (score range = 12–60, with lower scores indicating better motor imagery).
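The scoring rule for each subscale amounts to a simple sum, sketched below in Python. The function name and the input validation are illustrative assumptions, not part of the questionnaire manual.

```python
def vmiq_subscale_score(ratings):
    """Sum the 12 vividness ratings (1-5) of one VMIQ subscale.

    Lower totals indicate more vivid (better) motor imagery.
    """
    if len(ratings) != 12:
        raise ValueError("each subscale has exactly 12 items")
    if any(r not in (1, 2, 3, 4, 5) for r in ratings):
        raise ValueError("ratings must be integers from 1 to 5")
    return sum(ratings)
```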
Assessment of Anxiety and Depression.
To control for the potential effects of mood on the two groups’ performances, the Hospital Anxiety and Depression Scale (HADS, Bjelland et al., 2002) was used. The questionnaire, which yields separate scores for anxiety and depression, consists of 14 multiple-choice questions investigating the frequency (1 = never, 5 = always) with which a specific mood occurs (e.g. “I feel agitated and tense”; “I feel in a good mood”).
Procedure
Participants sat in a comfortable chair and were interviewed by the examiner in order to fill in the preliminary questionnaires (VPQ, VMIQ, HADS). After this, they wore the HMD and anti-noise headphones to isolate them from the environment. In a preliminary phase, they freely explored the VR environment.
The procedure was the same for the baseline and the experimental tasks, except that the baseline task was executed only with the participant sitting in a comfortable chair, while the experimental task comprised two conditions: a Standing condition (in which the participants completed the task while standing) and a Sitting condition (in which the task was executed with the participant sitting in a comfortable chair). These two conditions served to control for the effects of any feelings of fatigue associated with the execution of the task and were presented in counterbalanced order across participants. The stimuli in the baseline and experimental tasks were shown in random order for 1 s each. For each stimulus, the participants were requested to estimate the distance of the flag from themselves in centimetres and to respond verbally. There were no time constraints.
After the baseline task and the main task in each of the two conditions (i.e. sitting and standing), the participants were asked to evaluate their current level of fatigue and pain on a visual analogue scale (10cm long VAS, from “no pain” or “no fatigue” to “maximum pain” or “maximum fatigue”).
Data Handling and Statistical Analyses
Preliminary analyses: Localised v. Widespread Pain and differences between Musculoskeletal, Visceral and Neuropathic Pain
One of the key features of pain in FM (which also represents a diagnostic criterion) is that it is not localised but widespread, involving different body parts (Wolfe et al., 2018). Thus, in order to confirm a difference in the degree of pain spread between the two groups, the Widespread Pain Index (WPI) scores of FM and HC participants were compared by means of a Bayesian Linear Model with Group as the independent variable.
A second preliminary analysis concerned the typologies of pain. Although the diagnosis of FM is based on musculoskeletal pain, other typologies of pain (i.e. neuropathic and visceral pain) are often reported by FM patients (Costantini et al., 2017; Gauffin et al., 2013; Rehm et al., 2010). Since the study focuses on the potential effects of pain on distance estimation, this analysis checked whether the two groups differed with respect to the typology of pain. An index of pain was calculated for each type of pain, accounting for both the minimum and maximum pain intensity: log[(maximum + 1) * (minimum + 1)], where log is the natural logarithm, ranging between 0 (no pain) and 4.79 (worst minimum and maximum pain, both scored 10). A Bayesian Linear Model was used, with Group and Type of Pain (musculoskeletal, visceral and neuropathic) as independent variables and the slope of Type of Pain grouped by participant as a random effect, in order to control for within-subjects variability. The prior distributions were two Gaussian distributions (mean = 0, sd = 1 for the regressors and mean = 0, sd = 5 for the intercept). These were chosen because they give good sensitivity to differences between the types of pain and between the groups (the regressors), while allowing a wide range for the overall mean (the intercept). A series of five models was fitted, ranging from a null model (i.e. only intercept, no regressors) to the saturated model (i.e. all regressors and interactions). The model that best represented the data was chosen by means of posterior probabilities based on marginal likelihoods (Gronau et al., 2020). The results of these preliminary statistical analyses suggested an absence of differences between Visceral and Neuropathic pain, which were therefore averaged in the subsequent analyses (see Results section).
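The pain index can be written as a short Python function. This is a sketch of the formula only: the function name and the range checks are assumptions, and the natural logarithm is used because it reproduces the stated upper bound (ln 121 ≈ 4.796, reported as 4.79 in the text).

```python
import math


def pain_index(minimum, maximum):
    """Index of pain: log[(maximum + 1) * (minimum + 1)] on 0-10 ratings."""
    # Assumption: ratings lie on the 0-10 scale and minimum <= maximum.
    if not (0 <= minimum <= maximum <= 10):
        raise ValueError("expected 0 <= minimum <= maximum <= 10")
    return math.log((maximum + 1) * (minimum + 1))
```

For example, a participant reporting no pain at all (minimum = 0, maximum = 0) obtains an index of 0, while minimum = 10 and maximum = 10 gives the upper bound of the index.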
The experimental Virtual Reality task: distance estimation
An analysis of the baseline task was carried out to confirm that the distances used in the task were perceived by the participants as being progressively farther away (see Supplementary Materials, SM1). Bayesian analyses with non-informative priors were used to analyse the errors in estimating the distances and the VASs of current fatigue and pain. The inference was based on the 89% Highest Posterior Density Intervals (89% HPDI) of the posterior distributions (McElreath, 2016) and the Region of Practical Equivalence (ROPE, Kruschke & Liddell, 2018). ROPEs were computed as the range for a negligible effect size (Cohen, 1988), namely the interval [−0.1 × SD, +0.1 × SD] (the intervals of the computed ROPEs are reported at the beginning of each analysis). When the 89% HPDI lies completely outside the ROPE, the null hypothesis can be rejected.
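The decision rule based on the HPDI and the ROPE can be sketched in Python as follows. This is a minimal illustration that operates on a list of posterior draws; it is not the code used for the analyses, and the function names and the narrowest-window HPDI estimator are assumptions.

```python
def hpdi(samples, prob=0.89):
    """Narrowest interval containing a proportion `prob` of the posterior draws."""
    ordered = sorted(samples)
    n = len(ordered)
    if n == 0:
        raise ValueError("no samples")
    k = max(1, min(n, round(prob * n)))   # number of draws inside the interval
    # Slide a window of k consecutive draws and keep the narrowest one.
    start = min(range(n - k + 1), key=lambda i: ordered[i + k - 1] - ordered[i])
    return ordered[start], ordered[start + k - 1]


def outside_rope(samples, sd, prob=0.89, half_width=0.1):
    """True when the HPDI lies completely outside the ROPE [-0.1 * SD, +0.1 * SD]."""
    rope_low, rope_high = -half_width * sd, half_width * sd
    low, high = hpdi(samples, prob)
    return low > rope_high or high < rope_low
```

Here `sd` stands for the standard deviation used to scale the ROPE; `outside_rope` returning `True` corresponds to rejecting the null hypothesis for that effect.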
In the analysis of the distance estimations, the dependent variable was the Error in estimation, calculated as the difference between the actual and estimated distances of the flag and converted into a z-score by means of an established procedure (Scandola, Togni, et al., 2019). For each participant and each distance, the mean and standard deviation of the errors in the baseline task were used as the reference (see Eq. 1 for the formula used to compute the z-scores; see SM1 for the analysis of the raw baseline estimations, which confirms that participants were able to discriminate different levels of depth). The use of this index allowed us to limit potential biases within the sample (i.e. heteroskedasticity, extreme values) and to standardise depth perception, which can vary considerably in a virtual reality environment (Lampton et al., 1995).
$$z_{exp,\,distance=d_{i},\,subj=s_{j}}=\frac{err_{exp,\,distance=d_{i},\,subj=s_{j}}-M\left(err_{baseline,\,distance=d_{i},\,subj=s_{j}}\right)}{S\left(err_{baseline,\,distance=d_{i},\,subj=s_{j}}\right)}$$
Eq. 1: Computation of z-scores of the main experiment. The subscript “exp” indicates the main experiment and “baseline” the baseline task; “distance” refers to a particular distance \({d}_{i}\) (the computation was executed for all distances one by one); “subj” represents a specific participant \({s}_{j}\) (the computation was executed for all participants one by one); “M” stands for the mean function and “S” for the standard deviation function.
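Eq. 1 can be illustrated with a short Python sketch for one participant and one distance. The function name is hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not specify which estimator was used.

```python
from statistics import mean, stdev


def z_scores(exp_errors, baseline_errors):
    """Standardise main-task errors against the same participant's baseline errors.

    Both lists refer to one participant and one target distance.
    """
    m = mean(baseline_errors)   # M(err_baseline)
    s = stdev(baseline_errors)  # S(err_baseline); sample SD is an assumption
    if s == 0:
        raise ValueError("baseline errors have zero variance")
    return [(err - m) / s for err in exp_errors]
```

The function would be applied separately to each participant and to each of the five target distances, using the five baseline repetitions of that distance as the reference.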
The fixed effects of the Bayesian Linear Model were the Condition (Sitting, Standing), the inclination of the Ramp (4%, 24%), the Group (FM, HC), and the Distance of the flag (1, 2, 3, 4 or 5 m). Since the distance of the flag might impact the estimations linearly or non-linearly, polynomial contrasts (linear, quadratic, cubic and fourth order) were used for Distance, so that both linear and non-linear relations could be captured.
Moreover, the index of musculoskeletal pain and the average index of visceral and neuropathic pain were used as covariates in interaction with the other fixed effects. As random effects, we used the intercept of the individual participant and the intercept of the interaction between the participant and all of the within-subject factors (Condition, Ramp and Distance) in order to avoid pseudo-replication biases (Scandola & Tidoni, 2023).
Furthermore, to determine whether the hypothesised effects on the estimation of distances were specifically related to pain or were influenced by other factors, such as motor imagery abilities and mood variables (Häuser et al., 2015), two further models were fitted, using as covariates the scores in the VMIQ subscales and the Anxiety and Depression subscales of the HADS, respectively.
To analyse any changes in fatigue and pain (i.e. the VAS estimations) after the Baseline task, and in the Standing and Sitting conditions of the experimental paradigm, a Bayesian Linear Model was used with the Group and the Condition as fixed effects, and the intercept of the participants and the intercept of the interaction between the participants and the within-subjects factor Condition as random effects.
Only the effects whose 89% HPDI lies completely outside the ROPE (thus making it possible to reject the null hypothesis) are reported. The complete list of results is shown in the Supplementary Materials (Table SM2 for the analysis with musculoskeletal and visceral-neuropathic pain as covariates, Table SM3 for the analysis with motor imagery scores as covariates, and Table SM4 for the analysis with depression and anxiety scores as covariates), together with the Gelman-Rubin convergence diagnostic (R̂, Vehtari et al., 2021), the Bulk and Tail Effective Sample Sizes (ESS, Vehtari et al., 2021) and the Posterior Predictive Checks (Gelman, 2013). In all cases, R̂ ≤ 1.01, ESS > 500, and the Posterior Predictive Checks show that the fitted models are compatible with the data.
All the Bayesian models were fitted with 4 chains, each with 1000 warmup iterations and 1000 sampling iterations, for a total of 4000 post-warmup draws.
The statistical analyses were conducted in R 4.2.2 (R Core Team, 2022), using brms 2.18.0 (Bürkner, 2018) for model fitting and emmeans (Lenth, 2022) for post-hoc testing.