To the authors' knowledge, this is the first article describing a reliability study of a US examination to determine TLFD. For this reason, a clinical non-laboratory method was chosen and additionally tested for its ability to discriminate between aLBP patients and healthy individuals.
Intrarater reliability was excellent (ICC(2,2) = .92) and interrater reliability was good (ICC(2,2) = .79) with unbiased repeated measures. These results are comparable to other US assessments in LBP patients, e.g. of the multifidus muscle or of motor control, which are recommended for diagnosis, demonstrating the suitability of the method as a direct examination technique for the clinical appraisal of TLFD30,39,40. While most US studies that quantified the gliding or deformation of the TLF did not test the method for reliability8,22,26,41,42, Langevin et al.12 examined the intrarater reliability of their calculation method and found an ICC of .98, which was slightly higher than our results.
Given the current lack of appropriate assessment tools for monitoring treatment effects and the socioeconomic burden of LBP, the US method could provide a way to obtain diagnostic information for an additional parameter, TLFD43. The MDC of 5.54 mm for a single examiner could support the suitability of the US method to a certain extent. However, the MDC of 8.7 mm determined with two examiners, which is 57% higher, could represent a limit for high-precision measurements in experimental environments 43.
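The relationship between the reliability coefficients and the MDC values discussed above can be sketched numerically. This is a minimal sketch, assuming the widely used definitions SEM = SD × √(1 − ICC) and MDC95 = 1.96 × √2 × SEM; this section does not state which formula was applied in the study. The standard deviation of 7.0 mm and the function names are illustrative assumptions, not study results.

```python
import math


def sem(sd_mm: float, icc: float) -> float:
    """Standard error of measurement: SD * sqrt(1 - ICC)."""
    return sd_mm * math.sqrt(1.0 - icc)


def mdc95(sd_mm: float, icc: float) -> float:
    """Minimal detectable change at the 95% level: 1.96 * sqrt(2) * SEM."""
    return 1.96 * math.sqrt(2.0) * sem(sd_mm, icc)


# Hypothetical between-subject SD in mm (assumption, not taken from the study).
ILLUSTRATIVE_SD_MM = 7.0

# ICC values as reported in the text for one and two examiners.
mdc_intrarater = mdc95(ILLUSTRATIVE_SD_MM, 0.92)
mdc_interrater = mdc95(ILLUSTRATIVE_SD_MM, 0.79)

print(f"MDC95 (ICC = .92): {mdc_intrarater:.2f} mm")
print(f"MDC95 (ICC = .79): {mdc_interrater:.2f} mm")
```

With the SD held fixed, the lower interrater ICC alone produces a larger MDC, which is the pattern seen between the one-examiner and two-examiner values.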
Analysis of the ROC curve yielded a cut-off point of 6 mm to discriminate aLBP patients from healthy individuals, meaning that an individual with aLBP has a TLFD of less than 6 mm. This value identified all aLBP patients in the study population (sensitivity of 100%) and also separated 93.75% (specificity) of healthy individuals from them. Agreement between the two independent US raters was moderate (κ = 0.74; Gwet’s AC1 = 0.75). Although 6 of 63 assessments were incorrect, this detection rate is considerably more favorable than that of other paraspinal US assessments in LBP 44–46 and superior to most manual palpation-based approaches 44. In the past, US has been reported to be incapable of detecting abnormal echogenicity of paraspinal tissue in LBP 45,47. Nazarian et al. 45 reported in a systematic review that the majority of ROC curve analyses were below chance value, with an interrater agreement of κ = -0.06, indicating that the overall discriminatory ability of US was worse than chance.
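The cut-off rule described above (a TLFD below 6 mm is classified as aLBP) and the resulting sensitivity and specificity can be illustrated with a short sketch. The TLFD values below are invented for illustration only and are not study data; the function names are likewise hypothetical.

```python
def classify_albp(tlfd_mm: float, cutoff_mm: float = 6.0) -> bool:
    """Return True if a TLFD value falls below the cut-off (classified as aLBP)."""
    return tlfd_mm < cutoff_mm


def sensitivity_specificity(patients_mm, healthy_mm, cutoff_mm=6.0):
    """Compute sensitivity and specificity of the cut-off rule.

    Sensitivity: share of patients correctly classified as aLBP.
    Specificity: share of healthy individuals correctly classified as not aLBP.
    """
    true_pos = sum(classify_albp(v, cutoff_mm) for v in patients_mm)
    true_neg = sum(not classify_albp(v, cutoff_mm) for v in healthy_mm)
    return true_pos / len(patients_mm), true_neg / len(healthy_mm)


# Invented example measurements in mm (assumption, not study data).
example_patients = [2.1, 3.5, 4.8, 5.9]
example_healthy = [5.5, 7.2, 8.0, 9.4]

sens, spec = sensitivity_specificity(example_patients, example_healthy)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

In this toy example every patient lies below the cut-off, while one healthy value (5.5 mm) also does and counts as a false positive, which lowers specificity but not sensitivity.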
It is likely that these results hindered the further development of US technologies and methods in the field of LBP diagnosis in the following years. However, recent research emphasizes the potential of US diagnosis, especially in dynamic examinations of myofascial structures. Cuesta-Vargas et al. 48 studied LBP patients and healthy controls during a TET and classified them based on muscle activation measured with electromyography, pennation angle and erector spinae muscle thickness. Using this method, they were able to identify 48.5% of LBP patients (sensitivity) and 84.8% of healthy individuals (specificity). The US method of determining the TLFD in this study clearly exceeded these results.
Taken together, the results of this study suggest that the US measurement of TLFD presented here is a promising tool for clinical purposes, firstly for identifying patients with acute LBP and secondly for monitoring treatment progress in the context of an intervention. In addition, it could be helpful to identify TLFD restrictions as a possible risk factor for the development of chronic LBP, as only one third of these patients recover within three months4. Swelling and thickening of the TLF, e.g. due to micro-injuries or hypoxia, could lead to densification, inflammation, and a more adhesive behavior of the gliding surfaces between the fascial layers20,49,50. This is probably an additional parameter in the complex multifactorial cascade of LBP development and is most likely already present in the early stages15. Brandl et al.14,51 demonstrated that artificially induced aLBP is associated with swelling and typical pain patterns that can be localized in the fascial tissue. Monitoring this process is therefore a promising method in the treatment of LBP. This should be seen in light of the absence of alternative imaging techniques, as the suitability of computed tomography or magnetic resonance imaging for the dynamic assessment of pathologically altered anisotropic tissue material may be limited43. Here, even the relatively high MDC, which was determined on the basis of interrater reliability, exceeds the accuracy of most other conventional diagnostic procedures44–46.
Study Limitations
The study has some limitations. Firstly, the study groups were not statistically matched, which would have led to a higher internal validity. However, the Student's t-test of baseline characteristics showed no significant differences between the samples. The use of an independent second group could therefore have the advantage of higher external validity in real clinical settings52. Secondly, the position of the transducer was not marked for subsequent intra- and interrater measurements and the exact degree of trunk flexion was not measured goniometrically. Furthermore, it is possible that aLBP patients use different movement and muscle recruitment strategies to reach starting and ending positions. An analysis of surface electromyographic data of the erector spinae muscle at L1 from a previous study showed muscle activity during US measurement in 2 of 20 participants, which did not influence group differences in TLFD22. Thirdly, the raters had varying levels of experience with US examination, particularly in testing the TLF, and were only familiarized with the method for half an hour at the beginning of the study. Therefore, this reliability study may be of limited use for experimental purposes. The design aimed to mimic, and provide data for, an easy-to-perform examination in daily clinical practice. It is likely that the ICCs and MDCs determined in this trial would be substantially improved by taking the aforementioned points into account, but this was not the focus of this study and is a task for future work.
The US method for assessing TLFD in this study has excellent intrarater and good interrater reliability. It was shown to discriminate aLBP patients from healthy individuals better than most other assessment methods. The intrarater MDC was below the cut-off point for discrimination. Therefore, this method could be useful in monitoring treatment progress after an intervention. Future work should explore whether TLFD restrictions are a potential risk factor for the development of chronic LBP and could thus complement diagnostics in the complex multifactorial etiology of this disease.