Although frailty is not necessarily synonymous with ageing, its prevalence is increasing in the surgical spine population 1, 70. This is concerning as frail patients undergoing spine surgery are at an increased risk of adverse postoperative outcomes 8. Accordingly, the assessment of frailty is an important factor in the surgical decision-making process regarding surgical risk, invasiveness, and timing. However, the applicability of existing frailty instruments as risk stratification or frailty trajectory tools is unknown. This is due to the heterogeneity of, and lack of consensus on, the frailty tools currently reported, and to the effect of underlying spine disease on frailty.
Similar reviews assessing the clinimetric properties and applicability of frailty tools have been completed in different contexts 16, 18, 71, 72. To our knowledge, this review is the first to evaluate the objectivity, feasibility, applicability, and sensitivity of frailty tools reported in the surgical spine literature. Additionally, this systematic review is the first that has rigorously evaluated the clinimetric properties of frailty tools reported in the surgical spine literature using a validated set of qualitative criteria and definitions. One of the most important outcomes identified in our review is that although most tools were predictive of postoperative outcomes, many lacked formal evaluation of important clinimetric properties. Additionally, several frailty measures were not objective or clinically feasible. This was due to items (subjective questions) or techniques (lengthy questionnaires) common to these measures that cannot be reliably or reasonably completed in clinical practice.
Risk Stratification Tools
The mFI, developed and validated by Velanovich et al, was constructed by matching 11 variables found within the National Surgical Quality Improvement Program (NSQIP) database to those within the 70-item Canadian Study of Health and Aging frailty index (CSHA-FI) 73. Since its development, the mFI has been extensively validated as a risk stratification tool for predicting postoperative AEs across the surgical literature 74. In recent years, an increase in the missing proportion of variables required to calculate the mFI has raised concern about its validity as a risk stratification tool 75. To overcome this, Chimukangara et al identified the top five most reported mFI variables within the NSQIP database, condensing the mFI into the 5-item mFI 76. Across the surgical literature, the 5-item mFI is recognized as a valid risk stratification tool for predicting postoperative AEs 76–78.
Within the degenerative and deformity populations undergoing complex spine surgery, the mFI and 5-item mFI are sensitive risk stratification tools for predicting postoperative AEs. These tools have been validated using a robust study methodology in large cohorts with accurate, precise and reproducible risk estimates. Additionally, the mFI and 5-item mFI are reliable tools given the high degree of concordance between their respective frailty tiers. Lastly, since few deficits are required to assess frailty, both tools are easily applicable without the need for an extensive chart review, special tests or training.
The mFI is not a sensitive risk stratification tool in the non-complex degenerative, tumor, or trauma spine populations due to conflicting evidence, poor study methodology, and construct limitations of the mFI. Since the mFI is mainly composed of deficits that assess comorbidity status, it is not sensitive for assessing the multiple systems affected by frailty. Consequently, in healthy patients with few or no comorbidities undergoing spine surgery, the mFI is significantly underpowered as a risk stratification tool 24, 27. In the tumor population, the construct does not account for the physiological effects of metastatic disease, such as tumor burden and adjunctive therapy. These factors influence underlying physiological reserve and confound the relationship between frailty and postoperative AEs 35, 36, 79. Within the thoracolumbar trauma population, poor study design and insufficient evidence limit the validity of the mFI as a risk stratification tool. Finally, in the tSCI population, the magnitude of the injury, patient age, and total motor score on admission overpower any association between the mFI and postoperative AEs 33.
The constructs of the mFI and 5-item mFI significantly deviate from the general multisystem concept of frailty. A valid frailty index must contain 30–40 deficits in which each deficit covers a range of systems, is associated with overall health status, increases in prevalence with age, and cannot saturate early 80. Frailty indices containing few deficits, such as the mFI and 5-item mFI, are prone to instability and imprecise index estimates 80. Furthermore, during the design of the mFI and 5-item mFI, the reduction of frailty deficits from the 70-item CSHA-FI was performed without analysis of convergent validity 81. This raises concern as to whether the mFI and 5-item mFI measure the same construct as the CSHA-FI. Lastly, the non-modifiable constructs of the mFI and 5-item mFI limit the sensitivity of these frailty tools to capture clinical changes. Yagi et al identified that despite optimization of each mFI factor, no significant reduction in postoperative AEs was observed when compared against the non-frail cohort 30. Therefore, the mFI and 5-item mFI are applicable as risk stratification tools only.
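The instability of short indices can be illustrated with a minimal sketch of deficit-accumulation scoring. It assumes the conventional definition of a frailty index as the proportion of assessed deficits that are present. The function names and the toy inputs are hypothetical and are not drawn from any of the cited studies.

```python
from fractions import Fraction


def frailty_index(deficits):
    """Proportion of assessed deficits that are present (0 = none, 1 = all).

    `deficits` is a list of booleans, one per item in the index.
    """
    if not deficits:
        raise ValueError("at least one deficit must be assessed")
    return sum(deficits) / len(deficits)


def index_step(n_items):
    """Smallest possible change in the index when one deficit is gained."""
    return Fraction(1, n_items)


# Hypothetical patient with a single deficit, scored on indices of different length.
five_item = [True] + [False] * 4
forty_item = [True] + [False] * 39

print(frailty_index(five_item))        # 0.2
print(frailty_index(forty_item))       # 0.025
print(index_step(5), index_step(40))   # 1/5 1/40
```

Under these assumptions, a single additional deficit moves a 5-item index by 0.2 but a 40-item index by only 0.025, which is one way to see why indices with few deficits yield coarse and unstable estimates.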
The ASD-FI, developed by Miller et al, was constructed using variables within the International Spine Study Group (ISSG) database that met the frailty index inclusion criteria 48. Cutoff values were then applied to stratify the population into robust, frail, and severely frail cohorts. Since its development, the ASD-FI has been shown to be a valid risk stratification tool for predicting postoperative AEs within the complex adult spinal deformity population. The ASD-FI has several strengths as a risk stratification tool compared to the mFI and 5-item mFI. It was developed using a standard methodology for creating accurate and precise frailty indexes 50. It is also a more sensitive frailty tool as it evaluates a greater number of health domains within the frailty syndrome. Finally, it has been extensively validated within the complex adult spinal deformity population: in a series of studies by Miller et al, the ASD-FI reliably predicted 2-year postoperative AEs in external and internal validation cohorts 48–50. In contrast, the mFI and 5-item mFI were validated either in a large national cohort with limited follow-up periods, underestimated complication rates and missing patient variables, or in small cohorts where patient age, lifestyle, and ethnicity impact surgical outcomes 28, 29. However, the number of deficits required to calculate the ASD-FI makes it clinically unfeasible. Given this, the mFI and 5-item mFI are more appropriate risk stratification frailty tools in the adult spinal deformity population.
The CD-FI was developed in the same fashion as the ASD-FI for use in the cervical deformity population as a risk stratification tool 80. Passias et al further condensed the CD-FI to a 15-item mCD-FI by identifying the health deficits most predictive of the overall CD-FI score 56. The CD-FI and mCD-FI were internally validated as risk stratification tools in the cervical deformity population 53, 55, 56. However, it is unknown whether these measures are valid or sensitive risk stratification tools for predicting postoperative AEs or functional outcomes. This is due to the lack of external validation studies, conflicting evidence, and poor methodological design of the current validation study 55.
As the ASD-FI and CD-FI contain several modifiable frailty deficits that overlap with clinical features of spinal disease, these measures are sensitive to capturing the effect of spine surgery on postoperative frailty trajectory. Segreto et al identified a significant reduction in 1-year postoperative CD-FI scores following spine surgery for cervical deformity 54. However, responsiveness was evaluated by a t-test that only compares differences in the score. This methodology does not assess the validity of the score change in relation to the CD-FI construct to capture responsiveness. Accordingly, the ASD-FI and CD-FI are more appropriate risk stratification tools given the lack of literature assessing the responsiveness of these measures.
Although the ASD-FI, CD-FI, and mCD-FI are promising frailty tools, some concerns may limit the applicability of these tools. Firstly, the cutoff values chosen to stratify frailty severity were determined without any formal analysis. The health deficits included within these tools were also derived from questionnaires commonly utilized in spine practice. Consequently, the ASD-FI, CD-FI, and mCD-FI may overestimate frailty and the associated predicted risk. Additionally, no formal sensitivity analysis has been performed assessing the performance of these measures against other frailty tools. Lastly, the need to acquire all 42 deficits to calculate the ASD-FI and CD-FI significantly hinders the clinical applicability of these tools.
The MSTFI and PSTFI were constructed as risk stratification tools for the metastatic and primary spine tumor populations 62, 64. De la Garza Ramos et al constructed the MSTFI by identifying patient recorded variables within a national multicenter database that had the greatest independent effect for predicting postoperative AEs 62. Nine variables were identified to construct the MSTFI, and cutoff scores were applied to stratify patients into robust, mild, moderate, and severe frail cohorts. The PSTFI was developed using items within the MSTFI, except those pertaining to surgical approach 64. Cutoff values were similarly applied to stratify patients according to frailty severity.
Within the metastatic spine tumor population, both the mFI and MSTFI demonstrated significant heterogeneity and difficulty in predicting postoperative AEs 34, 62, 63. Initial validation by De la Garza Ramos et al suggested the MSTFI was an appropriate risk stratification tool 62. However, external validation by Massaad et al identified that the predicted outcomes stratified by MSTFI severity were not consistent with those reported in the initial validation study 63. The authors observed that the MSTFI overestimated the risk of postoperative AEs for severely frail patients while underestimating the risk for mildly frail patients 63. Bourassa-Moreau et al observed that neither the mFI nor the MSTFI was associated with or predictive of postoperative AEs 36. Consequently, given the heterogeneity and inconsistency, no recommendation can be made as to whether the mFI or MSTFI are appropriate risk stratification tools for this spine population. This highlights the challenge of defining and quantifying frailty in the metastatic spine tumor population. Further efforts are required to improve the determination of frailty in this specific surgical cohort.
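The pattern of overestimation in one tier and underestimation in another can be checked with a simple per-tier calibration comparison. The following is a minimal sketch of that idea, not the method used in the cited validation studies. The function name is hypothetical and the records are made-up toy values, not data from any cited cohort.

```python
def calibration_by_tier(records):
    """Compare mean predicted risk with the observed event rate within each tier.

    `records` is an iterable of (tier, predicted_risk, had_event) tuples.
    Returns {tier: (mean_predicted, observed_rate, difference)}, where a
    positive difference means risk was overestimated in that tier.
    """
    grouped = {}
    for tier, predicted, had_event in records:
        grouped.setdefault(tier, []).append((predicted, had_event))

    summary = {}
    for tier, rows in grouped.items():
        mean_predicted = sum(p for p, _ in rows) / len(rows)
        observed_rate = sum(1 for _, event in rows if event) / len(rows)
        summary[tier] = (mean_predicted, observed_rate, mean_predicted - observed_rate)
    return summary


# Toy records (hypothetical): four patients per tier, two events in each.
toy_records = (
    [("mild", 0.25, True), ("mild", 0.25, True), ("mild", 0.25, False), ("mild", 0.25, False)]
    + [("severe", 0.75, True), ("severe", 0.75, True), ("severe", 0.75, False), ("severe", 0.75, False)]
)

for tier, (predicted, observed, diff) in calibration_by_tier(toy_records).items():
    print(tier, predicted, observed, diff)
```

In this toy example the mild tier has a negative difference (risk underestimated) and the severe tier a positive one (risk overestimated), which mirrors the direction of miscalibration described above.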
Similarly, determining the most sensitive frailty tool for the primary spine tumor population is difficult. Our review observed that the mFI and PSTFI weakly predict postoperative AEs with wide confidence intervals and relatively poor sensitivity. Additionally, patients with primary spine tumors are often younger and less likely to have comorbidities or present with clinical features of frailty. Consequently, comorbidity-based frailty tools such as the mFI or PSTFI are not sensitive for evaluating frailty within this population. Furthermore, since the PSTFI is derived from the MSTFI, it is poorly sensitive for assessing frailty in the primary spinal tumor population.
The constructs of the MSTFI and PSTFI are not designed to evaluate frailty. The MSTFI and PSTFI contain surgical, radiographic, and laboratory items that are not sensitive or specific to frailty. The limited number of deficits within these frailty tools is also problematic. It increases the potential for imprecise index estimates, and when applied to small healthy cohorts, the lack of deficits significantly reduces the ability to detect a relationship with adverse outcomes 36. The cutoff values applied to stratify frailty severity were also chosen without any formal assessment. Finally, given the non-modifiable constructs of these measures, the MSTFI and PSTFI are only applicable as risk stratification tools. The need for medical imaging or extensive chart review may also hinder the feasibility of these measures due to extensive time requirements.
Similar to the mFI and the 5-item mFI, the FBS was constructed using commonly reported variables within the NSQIP database 82. The FBS was initially validated as a risk stratification tool for the vascular surgery population 82. Medvedev et al further validated its use as a risk-stratification tool in the surgical spine population to predict postoperative AEs 65. However, the clinical applicability of the FBS and its most sensitive surgical spine population cannot be determined for several reasons. The FBS was validated in a heterogeneous cohort without any formal analysis adjusted for cervical pathology. Consequently, it is unknown whether the FBS is more sensitive to a subtype of cervical spine pathology. The FBS has also not been externally validated, raising concern about its validity as a risk stratification tool. Finally, due to its non-modifiable construct, the FBS is only applicable as a risk stratification tool.
The modified frailty score (MFS) is a 19-item frailty index validated by Patel et al for predicting mortality in the orthogeriatric population 69, 83. It was constructed by including 19 of the 70 deficits within the CSHA-FI 83. The MFS is associated with higher rates of 30-day postoperative mortality following spine surgery for tuberculous spondylodiscitis 69. However, no formal analysis was performed to evaluate its predictive validity, limiting its applicability as a risk stratification tool. Many clinimetric properties of the MFS have also not been assessed. The 19 deficits included from the 70-item CSHA-FI were arbitrarily chosen without any formal analysis of convergent validity. Despite these limitations, the MFS assesses a greater number of frailty domains than other deficit accumulation measures reported in the surgical spine literature. Accordingly, the MFS is a more sensitive frailty tool in healthy populations and is less prone to instability and poor index estimates.
The Hospital Frailty Risk Score (HFRS) is a validated risk stratification tool that incorporates administrative coding into the assessment of frailty. Initially constructed by Gilbert et al 84, the HFRS contains 109 health deficits derived from International Classification of Diseases, 10th Revision (ICD-10) codes collected upon admission to hospital. The HFRS can be calculated from routinely collected data within electronic medical records without the need for extensive chart review. The HFRS has been shown to be a valid risk stratification tool for predicting postoperative AEs following spine surgery for degenerative spine conditions. Similar studies validating the HFRS in non-spine surgical populations have demonstrated equivalent or superior ability of the HFRS to predict postoperative AEs 85. Given this, the HFRS is a sensitive risk stratification tool in the degenerative spine population. However, the technological requirements needed to use the HFRS may limit its applicability.
As a frailty tool, the HFRS differs from traditional deficit accumulation tools reported in the literature. The HFRS is calculated from ICD-10 codes, which are individually scored based on the prevalence of the health deficit and individual association with adverse health outcomes. Accordingly, the HFRS is a more reliable and accurate tool as the estimated risk is adjusted for the health deficits that contribute to frailty. However, many of its clinimetric properties have not been formally assessed. Gilbert et al acknowledged difficulties designing the HFRS from ICD-10 coded data as these health-deficits do not capture the multisystem and dynamic progression of frailty 84. Consequently, the predictive abilities of the HFRS may be overstated compared to other frailty tools that capture the dynamic features of frailty such as functional states, phenotypic characteristics, caregiver support and fluctuations influenced by acute illnesses. Additionally, given its design and primary application as a risk stratification tool, its role as a frailty trajectory tool is significantly limited.
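The difference between a weighted, code-based score and a simple deficit count can be sketched as follows. This is a minimal illustration only: the function name, the placeholder codes, and the weights are hypothetical and are not the published HFRS codes or weights. The sketch also assumes that each code is counted once per patient.

```python
def weighted_frailty_score(patient_codes, code_weights):
    """Sum the weights of the distinct frailty-related codes recorded for a patient.

    Codes absent from `code_weights` contribute nothing to the score.
    """
    return sum(code_weights.get(code, 0.0) for code in set(patient_codes))


# HYPOTHETICAL placeholder codes and weights, for illustration only.
# These are not the published HFRS codes or weights.
example_weights = {"CODE_A": 2.0, "CODE_B": 0.5, "CODE_C": 1.5}

# CODE_A is recorded twice but counted once; CODE_X carries no weight.
patient = ["CODE_A", "CODE_B", "CODE_A", "CODE_X"]
print(weighted_frailty_score(patient, example_weights))  # 2.5
```

Unlike an unweighted index, two patients with the same number of coded deficits can receive different scores here, because each code contributes according to its own weight.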
The Risk Analysis Index (RAI), constructed by Hall et al, is a 14-item questionnaire designed for assessing frailty in surgical patients 86. It is recognized as a valid risk stratification tool for predicting postoperative AEs and identifying patients requiring preoperative optimization within the elderly surgical population 87. Within the surgical spine population, pre-frail and frail RAI scores were associated with adverse postoperative outcomes. However, multiple limitations are present within the validation study. Many of the postoperative outcomes studied occurred at an exceedingly low frequency, leaving the analysis underpowered and prone to type II error when assessing the predictive validity of the RAI. Selection bias further compromises the validity of the RAI as Agarwal et al failed to report the number of patients with complete or missing RAI and outcome data 67. Additionally, the statistical analysis did not adjust for confounding patient and operative variables. Given these limitations, no recommendation can be made regarding whether the RAI is a sensitive risk stratification tool within the surgical spine population, and further validation studies are needed.
Similar to the HFRS, the RAI differs from traditional frailty tools. Using predefined criteria, the RAI assesses multiple frailty domains to create a weighted score representative of the patient’s frailty state. The content of the RAI is more sensitive for assessing frailty as it is adapted from the previously validated Minimum Data Set (MDS) Mortality Risk Index-Revised (MMRI-R) 86. Additionally, the RAI uses a defined set of items and a standardized scoring system to eliminate potential inter-rater bias or error amongst users. As the RAI has only been recently investigated in the surgical spine population, many of its clinimetric properties remain unknown. Further investigation is ultimately warranted to determine its validity and reliability in the surgical spine population. Lastly, given that the RAI is validated as a perioperative risk stratification tool, its role as a frailty trajectory tool is limited despite a modifiable construct.
Lastly, the Comprehensive Geriatric Assessment (CGA) tool assesses frailty based on a multidisciplinary approach for optimizing, coordinating and integrating geriatric care. The CGA evaluates the frailty domains of function, cognition, mood and mental health, nutrition, comorbidity status, polypharmacy, and social health using validated subscales. The CGA is validated as both a risk stratification tool and an instrument for guiding preoperative optimization of frail patients 88. Within the spine population, Chang et al recently validated the CGA as a risk stratification tool for predicting postoperative AEs in elderly patients after lumbar spine surgery for degenerative disease 68. Despite a relatively small population, the study had a robust study methodology with strict inclusion criteria to assess the predictive validity of the CGA. The components of the CGA also had defined values for each frailty component evaluated from either the original articles or subsequent validation study 68. However, the criterion to define frailty was chosen arbitrarily without formal sensitivity or construct validation. The sample population was also relatively heterogeneous, raising concern for type II error and a lack of statistical power. Despite these limitations, the CGA is a valid and sensitive risk stratification tool for predicting postoperative AEs within the degenerative lumbar spine population.
As a frailty tool, the CGA is highly sensitive for assessing and quantifying frailty. Given its construct, the CGA differs from previously discussed frailty tools that contain non-validated or arbitrary content to evaluate and define frailty. The CGA may be a valuable screening tool to help guide perioperative optimization of frail patients undergoing surgical intervention. CGA-targeted optimization has improved functional outcomes and reduced mortality in the community and hospital-dwelling frail population 89, 90. Furthermore, the CGA may be sensitive to capturing the relationship between spinal disease and frailty as it contains components susceptible to improvement following spine surgery. Despite these strengths, the CGA lacks standardized content, delivery, and interpretation, potentially limiting cross-population validity and reliability 91. Further studies are warranted to establish its clinimetric properties and determine its validity, reliability, and responsiveness in the surgical spine population.
Frailty Trajectory Tools
The FRAIL (fatigue, resistance, ambulation, illness, and weight loss) Scale is a validated five-item frailty tool developed by the International Academy on Nutrition and Ageing Task Force 92, 93. The conceptual foundations are heavily rooted in the phenotypic frailty model as four of the items (fatigue, resistance, ambulation, and weight loss) are derived from it. Validated cutoff values are used to stratify patients into robust, pre-frail, and frail categories. Since its conceptualization, the FRAIL Scale has proven to be a reliable and valid frailty tool for identifying elderly patients at an increased risk of adverse health outcomes 94. Based on our review, the FRAIL Scale predicted a lower likelihood of postoperative functional return and a higher risk of postoperative delirium in patients undergoing elective spine surgery for degenerative disease. These findings are important considering spine surgery aims to restore function to baseline or beyond. Failure to return to or surpass baseline function is concerning as spine surgery is associated with significant risks. Given this, the FRAIL Scale may be a valuable tool in the decision-making process to identify patients requiring timely surgical intervention or preoperative optimization.
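The scoring logic of a five-item scale of this kind can be sketched briefly. The sketch assumes one point per item present and the commonly reported cutoffs of 0 (robust), 1 to 2 (pre-frail), and 3 to 5 (frail). These cutoffs are not stated in the text above and should be confirmed against the original instrument. The function and variable names are hypothetical.

```python
FRAIL_ITEMS = ("fatigue", "resistance", "ambulation", "illness", "weight_loss")


def frail_scale(responses):
    """Return (score, category) from five yes/no item responses.

    `responses` maps each item name to True when the deficit is present.
    """
    missing = [item for item in FRAIL_ITEMS if item not in responses]
    if missing:
        raise ValueError(f"missing items: {missing}")

    score = sum(bool(responses[item]) for item in FRAIL_ITEMS)

    # Assumed cutoffs: 0 robust, 1-2 pre-frail, 3-5 frail.
    if score == 0:
        category = "robust"
    elif score <= 2:
        category = "pre-frail"
    else:
        category = "frail"
    return score, category


# Hypothetical patient reporting fatigue and difficulty with ambulation.
example = {
    "fatigue": True,
    "resistance": False,
    "ambulation": True,
    "illness": False,
    "weight_loss": False,
}
print(frail_scale(example))  # (2, 'pre-frail')
```

Because every item is a short yes/no response, the whole assessment reduces to a count and a lookup, which is consistent with the feasibility of the scale in routine clinical practice.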
The Fried Phenotype is a five-item frailty tool constructed and validated by Fried et al 6. It assesses weight loss, weakness (strength), exhaustion (endurance), slowness (gait speed), and low physical activity (kilocalories) 6. Validated cutoff values are used to stratify patients into robust, pre-frail, and frail categories 6. Since its initial validation, the Fried Phenotype has been recognized as a reliable, valid, diagnostic, and sensitive assessment tool for identifying frail patients at an increased risk of early disability, morbidity, and mortality 70, 95. Interestingly, our review identified that the Fried Phenotype did not predict postoperative AEs within the thoracolumbar population undergoing elective spine surgery for degenerative or deformity spine conditions. This may have been due to several factors. Firstly, the cohort size of the validation population was relatively small, increasing the risk of bias and reducing the statistical power of the risk estimates. The relationship between the Fried Phenotype and postoperative AEs may have also been confounded by the Timed Up and Go (TUG) test. As a test of physical impairment, the TUG inherently captures phenotypic elements of frailty, therefore confounding the relationship between the Fried Phenotype and postoperative AEs.
Of the frailty tools identified in our review, the FRAIL Scale and Fried Phenotype are the most sensitive for capturing the impact of spinal pathology and surgical intervention on frailty trajectory. Their underpinning phenotypic construct overlaps with the clinical features of disability and weakness associated with spinal disease 13. Given this, if spine surgery aims to improve functional outcomes, the modifiable constructs of the Fried Phenotype and FRAIL Scale are sensitive to capturing changes in frailty trajectory. Although this relationship has not been studied in the spine literature, both the Fried Phenotype and FRAIL Scale have been observed to be responsive tools for capturing changes in frailty trajectory 72.
The FRAIL Scale and Fried Phenotype are also potentially useful assessment tools for screening and tracking responsiveness to frailty-targeted preoperative rehabilitation 96. Over the past several years, prehabilitation has gained popularity in the literature as a means of preoperatively optimizing patients’ health to improve postoperative outcomes 97. Although these programs remain rudimentary in their composition, mode of administration, and outcome measures, preliminary evidence suggests they may reduce the risk of postoperative AEs 97. Although no program has been described in the spine literature, tailored preoperative physiotherapy improves and maintains postoperative functional outcomes in patients with degenerative lumbar spine disorders 98. Considering the relationship between degenerative lumbar disease and frailty, preoperative optimization of frailty may be critical in improving outcomes following spine surgery.
However, developing a frailty-targeted prehabilitation program is challenging because health deficits are unique to each patient. The CGA may overcome this challenge as it is a powerful screening tool for identifying health deficits susceptible to optimization and tailoring multidisciplinary interventions. Initial studies investigating CGA and frailty-targeted prehabilitation with nutrition and exercise interventions have found mild phenotypic and functional improvements in hospitalized and community-dwelling geriatric patients 88, 90, 99, 100. However, it is unknown whether these improvements significantly reduce adverse outcomes, especially in the surgical context. Studies are ultimately needed to determine the most effective method of identifying susceptible health deficits and to clarify the composition, mode of administration, and clinical efficacy of prehabilitation programs.