Protocol
This systematic review is reported in accordance with the statement for Preferred Reporting Items for Systematic Reviews and Meta-Analysis - Protocols (PRISMA-P),20 and the CHecklist for critical Appraisal and data extraction for systematic Reviews of prediction Modelling Studies (CHARMS).21 The protocol of this systematic review was registered on the PROSPERO International prospective register of systematic reviews (CRD42020160988).
Eligibility criteria
We will include studies that meet all the following criteria:
- Study type: prognostic prediction model development and/or external validation study, with or without model updating. A clinical prediction model is defined as an instrument that (1) is a self-report questionnaire and (2) assesses multiple factors or constructs related to the probability of, or risk for, the future occurrence (prognosis) of a particular outcome.7
- Participants: (1) adults aged 18 years or over; (2) with ‘recent onset’ LBP (i.e. less than 3 months' duration); (3) with or without leg pain.
- Outcomes: the model predicts any of the following: pain; disability; sick leave, days absent from work or return-to-work status; and self-reported recovery.
- Time period of prediction: follow-up of at least 12 weeks' duration.
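As an illustration only, the four criteria above could be encoded as a simple screening check. This is a minimal sketch and not part of the registered protocol: the `Study` record, its field names and the outcome labels are hypothetical, and real screening will rely on reviewer judgement of the full text.

```python
from dataclasses import dataclass

# Hypothetical labels for the four outcome families named in the criteria.
ELIGIBLE_OUTCOMES = {"pain", "disability", "work", "recovery"}


@dataclass
class Study:
    """Hypothetical summary of one candidate study (field names are assumptions)."""
    is_prognostic_model_study: bool    # development and/or external validation
    min_age_years: int                 # youngest age allowed by the study
    max_lbp_duration_months: float     # longest LBP duration allowed at inclusion
    predicted_outcomes: set            # subset of ELIGIBLE_OUTCOMES, or other labels
    follow_up_weeks: float             # longest follow-up at which outcome is predicted


def is_eligible(study: Study) -> bool:
    """Return True only if the study meets all four eligibility criteria."""
    return (
        study.is_prognostic_model_study
        and study.min_age_years >= 18
        and study.max_lbp_duration_months <= 3
        and bool(study.predicted_outcomes & ELIGIBLE_OUTCOMES)
        and study.follow_up_weeks >= 12
    )


example = Study(True, 18, 3, {"pain", "disability"}, 26)
short_follow_up = Study(True, 18, 3, {"pain"}, 6)
print(is_eligible(example), is_eligible(short_follow_up))
```

Leg pain does not appear in the check because participants are eligible with or without it.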
Information sources
Systematic searches will be conducted of MEDLINE, Embase and CINAHL, from the inception of these databases until January 2020. Additional strategies to ensure all eligible studies are identified will include examination of reference lists from all included studies and citation tracking of included studies.
Search
The search strategy will include LBP terms suggested by the Cochrane Back and Neck Review Group22 and terms related to clinical prediction model studies as suggested by Ingui.23,24 The full search strategy is in Additional file 1. No search limits will be applied.
Study selection
Two reviewers (T.S. and F.S.) will independently screen all studies by title and abstract and exclude clearly irrelevant studies. For each potentially eligible study, the same two reviewers will independently screen the full-text article and assess whether the study fulfils the inclusion criteria. In cases of disagreement, a decision will be made by consensus or, if needed, by a third reviewer (L.P.C.).
Data extraction
Data will be extracted by two independent reviewers. Disagreements will be resolved by discussion between the reviewers or, failing that, by arbitration by a third reviewer. Authors will be contacted by email to obtain any additional information not reported in the original articles.
Data items
Where available, the following summary data will be extracted from each study: type of study, source of data, participants, outcome predicted, candidate predictors, sample size, missing data, model development, model performance, model evaluation, results and authors' interpretation. Conclusions drawn from calibration graphs will also be described. Where possible, measures of discrimination will be extracted for the following outcomes: pain intensity as measured using a visual analogue scale, numeric rating scale (NRS), verbal rating scale or Likert scale; disability as measured by validated self-report questionnaires; sick leave, days absent from work or return-to-work status; and self-reported recovery as measured using a global perceived effect scale, a verbal rating scale or a Likert (recovery) scale.
Risk of Bias of individual studies
The risk of bias of the included studies will be assessed using PROBAST (Prediction model Risk Of Bias ASsessment Tool),25,26 recently developed through a consensus process involving a group of experts in the field. PROBAST includes 20 signalling questions across 4 domains: (1) participants, (2) predictors, (3) outcome, and (4) analysis. The questions are answered as yes (Y), probably yes (PY), no (N), probably no (PN), or no information (NI). The answers to these signalling questions assist reviewers in judging the overall risk of bias for each domain. A domain where all signalling questions are answered as Y or PY should be judged as “low risk of bias.” An answer of N or PN on 1 or more questions flags the potential for bias, whereas NI indicates insufficient information. Information and methodological comments that support each item assessment will be recorded. The studies will be rated as having low risk of bias, potential for bias, or insufficient information based on the 4 domains. Two independent reviewers will assess the risk of bias of the studies; discrepancies will be resolved by consensus or, if necessary, by a third author.
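The domain-level rule described above can be sketched as a small function. This is an illustration under stated assumptions, not the PROBAST tool itself: the function name `judge_domain` is hypothetical, and the precedence used when a domain contains both N/PN and NI answers (N/PN takes priority) is an assumption, since the text does not specify it. In practice the signalling questions only assist the reviewers' judgement.

```python
VALID_ANSWERS = {"Y", "PY", "N", "PN", "NI"}


def judge_domain(answers):
    """Map the signalling-question answers of one domain to a judgement.

    answers: list of strings drawn from Y, PY, N, PN, NI.
    """
    if not answers:
        raise ValueError("a domain needs at least one answered question")
    invalid = [a for a in answers if a not in VALID_ANSWERS]
    if invalid:
        raise ValueError(f"unknown answers: {invalid}")

    # Assumption: any N or PN outranks NI when both occur in the same domain.
    if any(a in {"N", "PN"} for a in answers):
        return "potential for bias"
    if any(a == "NI" for a in answers):
        return "insufficient information"
    # Only Y and PY remain at this point.
    return "low risk of bias"


# Hypothetical answers for one study, keyed by the four PROBAST domains.
study_answers = {
    "participants": ["Y", "PY"],
    "predictors": ["Y", "Y", "NI"],
    "outcome": ["Y", "PN", "Y", "Y", "Y", "Y"],
    "analysis": ["PY", "Y", "Y", "Y", "Y", "Y", "Y", "Y", "Y"],
}
for domain, answers in study_answers.items():
    print(domain, "->", judge_domain(answers))
```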
Overall quality of evidence
The overall quality of evidence will be rated using the Hierarchy of Evidence for Clinical Prediction Rules designed by Jull, DiCenso, and Guyatt.27 The hierarchy of evidence can guide clinicians and researchers in assessing the full range of evidence supporting the use of a clinical prediction rule in their practice. The strength of recommendation is determined based upon the stage of the clinical prediction model regarding development, validation (and the quality of validation) and impact. Table 1 describes the Hierarchy of Evidence for Clinical Prediction Rules.
Table 1
Hierarchy of Evidence for Clinical Prediction Rules.
| Level | Description | Evidence required |
|---|---|---|
| Level IV | Rules that need further evaluation before they can be applied clinically | These rules have been derived but not validated or have been validated only in split samples, large retrospective databases, or by means of statistical techniques. |
| Level III | Rules that clinicians may consider using with caution and only if patients in the study are similar to those in their clinical setting | These rules have been validated in only one narrow prospective sample. |
| Level II | Rules that can be used in various settings with confidence in their accuracy | At this level, rules must have demonstrated accuracy either by one large prospective study including a broad spectrum of patients and clinicians or by validation in several smaller settings that differ from one another. |
| Level I | Rules that can be used in a wide variety of settings with confidence that they can change clinician behaviour and improve patient outcomes | At this level, rules must have at least one prospective validation in a different population plus one impact analysis, along with a demonstration of change in clinician behaviour with beneficial consequences. |
Summary measures
Predictive validity is usually assessed by measures of discrimination and calibration. Discrimination indicates how well the model differentiates between those who recover and those who do not.7 Calibration refers to how closely the predicted risk agrees with the observed risk.7
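To make the two concepts concrete, the sketch below computes one common measure of each for a binary outcome: the c-statistic (equivalent to the area under the ROC curve) for discrimination, and the difference between mean observed and mean predicted risk as a crude calibration summary. The function names and the toy data are assumptions for illustration; included studies may report other measures.

```python
def c_statistic(y_true, y_prob):
    """Proportion of (event, non-event) pairs in which the event has the
    higher predicted risk; tied predictions count as half."""
    events = [p for y, p in zip(y_true, y_prob) if y == 1]
    non_events = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for p_event in events:
        for p_non in non_events:
            if p_event > p_non:
                concordant += 1.0
            elif p_event == p_non:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


def observed_minus_predicted(y_true, y_prob):
    """Mean observed risk minus mean predicted risk (0 means agreement on
    average; positive values mean the model under-predicts risk)."""
    return sum(y_true) / len(y_true) - sum(y_prob) / len(y_prob)


# Toy data: 1 = outcome occurred (e.g. not recovered), 0 = did not occur.
outcome = [1, 1, 0, 0]
predicted_risk = [0.9, 0.4, 0.5, 0.2]
print(c_statistic(outcome, predicted_risk))               # 3 of 4 pairs concordant
print(observed_minus_predicted(outcome, predicted_risk))  # approximately 0
```

A c-statistic of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation. The average-risk difference is only a coarse check, which is why calibration graphs are also examined.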
Synthesis of results
Meta-analysis may be conducted if adequate data on discrimination exist for a single clinical prediction model and a specific outcome, considering only the results of validation studies. For data pooling to be appropriate, we will also require that (1) the outcome measure is defined consistently, (2) the clinical settings are similar (e.g. all primary care), and (3) uniform statistical analyses have been applied. Calibration findings will be descriptively synthesised.
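The protocol does not name a pooling method. Purely as an illustration of how discrimination estimates from several validation studies might be combined, the sketch below pools c-statistics on the logit scale with a DerSimonian-Laird random-effects model. The choice of method, the function name `pool_c_statistics` and the input values are all assumptions, not the review's specified analysis.

```python
import math


def pool_c_statistics(c_values, se_values):
    """Random-effects (DerSimonian-Laird) pooling of c-statistics on the
    logit scale. Returns (pooled c, lower 95% limit, upper 95% limit, tau^2)."""
    if len(c_values) != len(se_values) or len(c_values) < 2:
        raise ValueError("need matching lists with at least two studies")

    # Logit transform; the variance follows from the delta method.
    y = [math.log(c / (1.0 - c)) for c in c_values]
    v = [(se / (c * (1.0 - c))) ** 2 for c, se in zip(c_values, se_values)]

    # Fixed-effect estimate and Cochran's Q, used to estimate tau^2.
    w = [1.0 / vi for vi in v]
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    scale = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(y) - 1)) / scale) if scale > 0 else 0.0

    # Random-effects weights add the between-study variance.
    w_re = [1.0 / (vi + tau2) for vi in v]
    pooled = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))

    def inv_logit(x):
        return 1.0 / (1.0 + math.exp(-x))

    return (
        inv_logit(pooled),
        inv_logit(pooled - 1.96 * se_pooled),
        inv_logit(pooled + 1.96 * se_pooled),
        tau2,
    )


# Hypothetical c-statistics and standard errors from three validation studies.
pooled_c, lower, upper, tau2 = pool_c_statistics([0.66, 0.71, 0.74], [0.03, 0.02, 0.04])
print(round(pooled_c, 3), round(lower, 3), round(upper, 3), round(tau2, 4))
```

Working on the logit scale keeps the pooled estimate and its interval inside the 0 to 1 range; the three pooling conditions listed above would still need to be met before any such analysis is run.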