We demonstrated how a distribution-based approach using systematic review methods can estimate MCIDs for scales reporting an outcome of interest. We found that this approach derived MCIDs similar to accepted MCIDs for measuring changes in cognitive function in persons with Alzheimer disease [6, 20]. However, MCIDs derived from baseline scale score SDs were more precise than MCIDs derived from mean change scale score SDs, perhaps because mean change scale score SDs depend on baseline values and because scales have potential ceiling and floor effects [21]. Furthermore, the least precise MCIDs were derived from a pooled estimate based on only three RCTs, suggesting that MCIDs derived from few studies may be less precise. We demonstrated how the pooled SD based on only three RCTs was influenced by one study; when this study was removed, the MCIDs were similar to those in our primary analysis. The distribution-based method could be used where MCIDs for an outcome measure are not available. Our approach could enhance knowledge users' understanding of study results and facilitate the planning of future studies by assisting with sample size calculation.
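The calculation underlying this approach can be sketched briefly. The snippet below uses entirely hypothetical baseline SDs and sample sizes (not data from our review) to show how a pooled SD is computed across studies and then multiplied by 0.4 to 0.5 to obtain a distribution-based MCID range.

```python
import math

# Hypothetical (SD, n) pairs for baseline scale scores from three studies;
# these values are illustrative only and are not from the review.
studies = [(6.2, 120), (7.1, 95), (6.8, 150)]

# Pooled SD, weighting each study's variance by its degrees of freedom
numerator = sum((n - 1) * sd ** 2 for sd, n in studies)
denominator = sum(n - 1 for _, n in studies)
sd_pooled = math.sqrt(numerator / denominator)

# Distribution-based MCID range at 0.4 to 0.5 SDs
mcid_low = 0.4 * sd_pooled
mcid_high = 0.5 * sd_pooled

print(f"Pooled SD: {sd_pooled:.2f}")
print(f"MCID range: {mcid_low:.2f} to {mcid_high:.2f} points")
```

With these invented inputs, the MCID range spans roughly 2.7 to 3.3 points; in practice, the pooled SD would come from the baseline scale scores of the RCTs included in the systematic review.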
Our derived ADAS-Cog and MMSE MCIDs are similar to published MCIDs [6, 20, 22]. Using an anchor-based method, Schrag et al. found that persons with Alzheimer disease who had clinically important worsening on any of four anchor questions (memory, non-memory cognitive function, Functional Activities Questionnaire, and Clinical Dementia Rating Scale) had a change in ADAS-Cog score of 2.7 to 3.8 points [22]. When Schrag et al. implemented a distribution-based method to estimate MCIDs at 0.5 SDs (using baseline ADAS-Cog score SDs), MCIDs ranged from 3.3 to 4.9 points for participants with a clinically meaningful decline on anchor questions [22]. Using an anchor-based approach, Rockwood et al. compared changes on the ADAS-Cog to clinician's interview-based impression of change plus caregiver input scores, patient/carer goal attainment scaling, and clinician goal attainment scaling; they found that a change of 4 points on the ADAS-Cog was clinically important for persons with Alzheimer disease [20]. Our derived range of MCIDs for the ADAS-Cog encompasses these published MCIDs. Similarly, investigators from the DOMINO trial agreed that the MCID for a change in MMSE was 1 to 2 points among persons with Alzheimer disease [6]. Using a distribution-based approach, they estimated similar MCIDs for changes in MMSE scores, which ranged from 1.4 (assuming a distribution of 0.4 SDs) to 1.7 (assuming a distribution of 0.5 SDs) points [6]. Our derived range of MCIDs for the MMSE also encompasses these published MCIDs. In contrast, using a survey of clinicians' opinions, Burback et al. found a MMSE MCID of 3.72 (95% confidence interval 3.50 to 3.95) points [12]. However, pooled SDs estimated from baseline and mean change MMSE scores in our meta-analysis were 4 and 3.6 points, respectively (Table 1) [14]; a MCID of 3.72 points therefore represents a very large effect size [9, 14].
There are advantages to deriving MCIDs using systematic review methods and a distribution-based approach. Systematic reviews use explicit methods to synthesize evidence, which minimizes bias in the derivation of effect estimates and their associated measures of uncertainty [23]. Systematic reviews also facilitate the generalization of results beyond any one study [23]. This is particularly important in the estimation of a MCID using our proposed distribution-based approach, because a MCID is meant to be applied across a broad range of clinical scenarios. As demonstrated in our results, there is substantial variability in the distribution of uncertainty across individual studies. In general, systematic reviews also improve the accuracy of conclusions about the efficacy or safety of an intervention across study settings; MCIDs derived with similar methods could likewise be more accurate. Our proposed distribution-based approach could help knowledge users assess whether an intervention has an effect on the outcome of interest over a range of clinically meaningful values (0.4 to 0.5 SDs) [11].
If an outcome in a meta-analysis is reported with more than one scale, the pooled standard deviation (SDpooled) estimated from systematic review data can also facilitate back-transformation of standardized mean differences derived from meta-analyses to mean differences. To derive a mean difference (MDj) from a standardized mean difference (SMDj), multiply each SMDj by SDpooled: MDj = SMDj x SDpooled. Researchers often either interpret a standardized mean difference with respect to thresholds first proposed by Cohen (i.e. 0.2 SDs represented a small difference and 0.8 SDs represented a large difference) or they back-transform standardized mean differences to mean differences, as described in the Cochrane Handbook for Systematic Reviews of Interventions [9, 17]. However, the Cochrane Handbook suggests using SDs derived from an observational study related to the systematic review topic [17]. While these observational data may reflect a real-world distribution of effect sizes, there are various biases that systematic reviewers must consider when deciding which observational study to use, notably indication bias associated with comparing an intervention group to a non-intervention group in observational studies of interventions [24]. The influence of biases on a pooled SD derived from RCTs included in a systematic review (and their impact on derived mean differences) can be tested in sensitivity analyses, which can increase confidence in findings.
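The back-transformation above is a single multiplication per effect estimate. A minimal sketch, using made-up SMDs and an assumed pooled SD (neither taken from our review):

```python
# Hypothetical standardized mean differences from a meta-analysis
# (illustrative values only, not from the review)
smds = [-0.30, -0.45, -0.21]

# Assumed pooled SD on the original scale, e.g. from baseline scale scores
sd_pooled = 6.5

# Back-transform each SMD to a mean difference: MD_j = SMD_j * SD_pooled
mds = [smd * sd_pooled for smd in smds]

print(mds)
```

The resulting mean differences are on the original scale of the instrument, which lets knowledge users compare them directly against a MCID expressed in scale points rather than interpret an abstract number of SDs.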
There are limitations to using our proposed distribution-based approach. It is unclear whether MCIDs generated by this approach are generalizable to all situations in which a scale is used. For example, MCIDs derived from the systematic review and network meta-analysis of the comparative effectiveness and safety of cognitive enhancers (cholinesterase inhibitors and memantine) for treating Alzheimer disease might not generalize to these scales when used with a nonpharmacologic intervention (e.g. exercise, cognitive training); however, MCIDs for determining meaningful changes in pain scores for patients with osteoarthritis did not vary across pharmacologic (nonsteroidal anti-inflammatories), nonpharmacologic (i.e. rehabilitation), or surgical (i.e. total hip replacement, total knee replacement) interventions [25]. In addition, as with other distribution-based approaches, the anticipated distribution of uncertainty may vary based on effect modifiers; therefore, it will be important to consider a plausible distribution of values for the MCID (i.e. 0.4 to 0.5 SDs) when interpreting results [6, 9, 10]. These limitations will need to be explored in future studies.