Trial design and settings
To identify the MIDs for the PROs, we used the data from all participants in a registered cluster-randomised controlled trial that examined the effect of the e-health version of the Illness Management and Recovery (e-IMR) intervention compared to the standard IMR [21,22]. The trial was performed in mental health institutions that were members of the Dutch IMR network. As no relevant differences between the e-IMR and IMR were found, we pooled the data from the control and experimental groups and performed pre-post tests on different PROs to examine which PRO best captured IMR’s potential benefit.
Data collection and outcome measures
In this paper, we used two time points: before starting the IMR programme (baseline) and one year later, on finishing the IMR programme (endpoint). To describe the study population, we used the participant characteristics collected at trial baseline, i.e. age, gender, psychiatric diagnoses, psychiatric and somatic comorbidities, treatment history, cultural background, housing, socioeconomic status, education level, and diagnosis according to the Diagnostic and Statistical Manual of Mental Disorders, 4th edition (see Table 1).
To address the study aim, we used validated PROs that measured illness management, recovery, self-management, symptom severity, quality of life, and general health.
Illness management was measured with the complementary Illness Management and Recovery Scale (IMRS), consisting of 15 items. The response levels, on a five-point scale (1–5), vary depending on the item. The IMRS’ test-retest coefficient (rxx) varies between 0.79 and 0.84 [23–26].
Recovery was measured with the Mental Health Recovery Measure (MHRM), consisting of 30 items referring to self-empowerment, learning and new potentials, and spirituality [27]. The response levels, on a five-point scale, vary from ‘strongly disagree’ (0) to ‘strongly agree’ (4), with ‘neutral’ in between (2). The MHRM’s rxx is 0.92 [28].
Self-management was measured with the Patient Activation Measure (PAM-13), consisting of 13 items referring to the individual’s knowledge, skill, and confidence for managing his/her own health and health care [29]. The response levels, on a four-point scale, vary from ‘strongly disagree’ to ‘strongly agree’, with ‘not applicable’ as a fifth option. The PAM-13’s rxx is 0.76 [30].
Symptom severity was measured with the Brief Symptom Inventory (BSI), consisting of 53 items referring to the burden of physical and psychological symptoms during the past month. The response levels, on a five-point scale, vary from ‘not at all’ (0) to ‘extremely’ (4) [31]. The BSI’s rxx is 0.90 [32].
Quality of life was measured with the Manchester Short Assessment of Quality of Life (MANSA) [33], consisting of 12 items on which participants rate their satisfaction with their life as a whole and with 11 other social, physical, and mental health domains, on a seven-point scale, varying from ‘couldn’t be worse’ (1) to ‘couldn’t be better’ (7). The MANSA’s rxx is 0.82 [34].
The participants’ general health status was measured with the Rand 36-item Health Survey, consisting of 36 items assembled into nine concepts. In this study, we only used the concepts of general health perception (Rand-GHP) and health change (Rand-HC) [35]. The Rand-GHP estimates the participant’s current perception of their general health (bio-psycho-social) by scoring five items. On a five-point scale, participants score 1) how many times their health status hindered them in social activities, and whether they agree with the statements that 2) they become ill more easily than other people, 3) their health status is just like that of other people they know, 4) they expect their health status to decline, and 5) their health status is excellent. The Rand-GHP’s rxx is 0.80. With the Rand-HC, participants rate their health compared to a year ago on a five-point scale: much better, somewhat better, the same, somewhat worse, or much worse. The rxx of the Rand-HC is 0.40 [36].
Statistical methods
Analyses were conducted using the Statistical Package for the Social Sciences® (SPSS) 23 [37]. Mixed model multilevel regression analyses were used to examine the pre-post change in the outcome measures, taking into account clustering of participants. This method handles missing data under the ‘missing at random’ assumption. Random effects for cluster, trainer, and individual participants nested within the cluster, and fixed main effects for time trend, were included in the model. The analyses were executed according to the intention-to-treat principle: non-completers, defined as participants who attended fewer than 50% of the IMR sessions, were included in the analyses.
Methods for investigating minimal important differences
To assess on which PROs participants improved to a meaningful degree, we estimated the minimal difference that would likely be important for the participants. Four methods to estimate the MID are recommended [17,38–40]. Two are distribution-based, using the effect size (ES) and the standard error of measurement (SEM), and two are anchor-based, using a global transition question and a clinical criterion. It is recommended to estimate the MID primarily by anchor-based methods [17,41] and to use distribution-based methods as supportive information [17]. We considered the PRO with the highest effect/MID ratio to represent the most relevant change and to be the most capable of capturing the potential benefits of the IMR programme.
Anchor-based method
For the ‘global transition question’ anchor-based method, we used the Rand-HC. In our study, the ‘year ago’ was the start of the IMR programme. The other anchor-based method uses a criterion, which is a measure that health professionals are familiar with and that is widely used in assessing patients’ health status [39], such as clinical endpoints or person-based global improvement in PROs [17]. Since there is no such widely used criterion in mental health, we searched our own data for a criterion that captures the richness and variation of the construct of a quality of life (QoL) measure [17]. Besides the Rand-HC, we examined a number of QoL anchor candidates using the change scores of 1) the first item of the MANSA, estimating satisfaction with life as a whole (MANSA-1), 2) the total MANSA, and 3) the Rand-GHP. The strength of the association between the anchor and the PRO needs to be determined because low or no correlation can provide misleading information [17,42]. A correlation of at least 0.30–0.35 is recommended [17]. Therefore, correlations between the anchor candidates and the PROs were analysed. Outliers should not drive a correlation to a significant level. In SPSS, scores that lie more than 2.58 standard deviations from the mean are flagged as probable outliers [43]. Probable outliers were assessed on their appropriateness and impact on the correlation, and a decision was made about removing them or recoding them to a reasonable level [44,45]. The anchor candidate with the highest correlation with the change scores in most PROs was considered to be the right anchor. Estimation of the MID based on an anchor proceeds as follows: the scores on the anchor were used to form five groups of participants reflecting large negative, small negative, no, small positive, and large positive change. For each PRO, the MID-anchor is the mean of the four differences in effect between successive change groups [39].
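The anchor-based steps described above can be sketched in a few lines of Python. This is only an illustration, not the SPSS procedure used in the trial: the function names (`flag_outliers`, `pearson_r`, `anchor_mid`), the coding of the five anchor groups as -2 to 2, the use of the group mean change as the "effect" in each group, and the example data are all assumptions made for the sketch.

```python
from math import sqrt
from statistics import mean, stdev


def flag_outliers(values, z_cut=2.58):
    """Return indices of values lying more than z_cut SDs from the mean."""
    m, sd = mean(values), stdev(values)
    if sd == 0:
        return []
    return [i for i, v in enumerate(values) if abs(v - m) / sd > z_cut]


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of change scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def anchor_mid(anchor_groups, pro_changes):
    """Mean of the four differences in mean PRO change between
    successive anchor groups.

    anchor_groups: hypothetical coding, -2 (large negative change)
                   through 2 (large positive change)
    pro_changes:   PRO change scores (endpoint minus baseline)
    """
    group_means = []
    for level in (-2, -1, 0, 1, 2):
        scores = [c for g, c in zip(anchor_groups, pro_changes) if g == level]
        if not scores:
            raise ValueError(f"no participants in anchor group {level}")
        group_means.append(mean(scores))
    steps = [hi - lo for lo, hi in zip(group_means, group_means[1:])]
    return mean(steps)


# Made-up data: two participants per anchor group.
groups = [-2, -2, -1, -1, 0, 0, 1, 1, 2, 2]
changes = [-4, -2, -2, 0, 0, 0, 1, 3, 4, 6]
# Group means are -3, -1, 0, 2, 5, so the steps are 2, 1, 2, 3.
print(anchor_mid(groups, changes))
```

Because the four successive differences telescope, their mean equals the difference between the two extreme group means divided by four, so the estimate is sensitive to how many participants fall in the two outer groups.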
Distribution-based methods
To support the anchor-based MID method, we examined the two statistical-distribution-based MID methods: the ES and the SEM [17]. The ES of change scores on the PROs expresses the effect of the intervention relative to the standard deviation of the change scores (SDc), where a change score is a participant’s endpoint score minus their baseline score. This relates to between-patient variation in change scores. To estimate the ES (effect/SDc), we used the estimated effects from the mixed model analyses. The one-half ES (½-ES) is the standard for estimating the PRO’s MID based on the ES (MID-ES) [17,38–40]. The SEM is the measurement error of the outcome. The SEM is computed from the standard deviation (SD) and the test-retest coefficient as SD × √(1 − rxx) [39,44,46,47]. To estimate the PRO’s SEM, we used the SD and rxx that were reported in psychometric studies of the PROs in populations as comparable to ours as possible (see Table 2). A change smaller than the SEM is likely a result of the measure’s unreliability rather than a true change. Therefore, one times the SEM (1-SEM) is the PRO’s MID based on the SEM (MID-SEM).
After these calculations, for each PRO we estimated the percentage of participants who had improved by more than the MID-anchor, MID-ES, and MID-SEM.
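The two distribution-based formulas above can be written out as a minimal Python sketch. The function names are hypothetical, and expressing the MID-ES in score units as half of SDc is our reading of the ½-ES rule rather than something the text states explicitly. The SD of 10 in the example is made up; only the rxx of 0.92 (MHRM) comes from the text.

```python
from math import sqrt


def effect_size(effect, sd_change):
    """ES = estimated effect divided by the SD of the change scores (SDc)."""
    return effect / sd_change


def mid_es(sd_change):
    """MID-ES in PRO score units, assuming the 1/2-ES rule means 0.5 * SDc."""
    return 0.5 * sd_change


def mid_sem(sd, rxx):
    """MID-SEM = 1 * SEM, with SEM = SD * sqrt(1 - rxx)."""
    if not 0.0 <= rxx <= 1.0:
        raise ValueError("rxx must lie between 0 and 1")
    return sd * sqrt(1.0 - rxx)


# Hypothetical SD of 10 combined with the MHRM's reported rxx of 0.92.
print(round(mid_sem(10.0, 0.92), 2))
# An effect of 3 points against an SDc of 6 points gives ES = 0.5.
print(effect_size(3.0, 6.0))
```

The SEM shrinks as the test-retest coefficient approaches 1, so a highly reliable PRO yields a small MID-SEM, while a measure such as the Rand-HC (rxx 0.40) yields one close to its full SD.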
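As a rough illustration of this last step, the share of participants exceeding a given MID could be computed as below. The function name `percent_improved` is hypothetical, and the sketch assumes that change scores are oriented so that positive values mean improvement (for a symptom measure such as the BSI, where lower scores are better, the sign would need to be reversed first) and that participants without a change score are simply left out.

```python
def percent_improved(change_scores, mid):
    """Percentage of participants whose change score exceeds the MID.

    change_scores: endpoint minus baseline, positive = improvement
                   (assumption); None marks a missing change score.
    mid:           the MID threshold in the same score units.
    """
    valid = [c for c in change_scores if c is not None]
    if not valid:
        raise ValueError("no non-missing change scores")
    improved = sum(1 for c in valid if c > mid)
    return 100.0 * improved / len(valid)


# Made-up change scores for five participants, one of them missing.
# Two of the four valid scores (3 and 5) exceed the threshold of 2.
print(percent_improved([1, 3, 5, None, 2], mid=2.0))
```

Running the same function with each of the three thresholds for a PRO gives three percentages that can be compared side by side.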