3.1. Study selection and characteristics
After the full-text assessment, 45 publications were included in the systematic review (see Figure 1 for the PRISMA flowchart). We contacted the authors when we could not retrieve or interpret the data; only two of ten responded to the request and sent the raw data. Table S4 presents the characteristics of the included studies (the complete list is available in Supplementary File 1). Overall, study characteristics varied considerably. Most studies were performed in rats (31 studies; 69%), nine in rabbits (20%), four in mice (8%), and one in dogs. Eight studies did not report the sex of the animals, 24 (53%) used only male animals, 13 (28.8%) used only female animals, and none used both sexes. Eleven studies (24.4%) used a selective COX-2 NSAID as the experimental intervention, 22 (48.8%) used a non-selective NSAID, and 12 (26.6%) used both NSAID types. The primary outcomes for biomechanical characteristics varied widely: 35 (77.7%) studies reported one or more biomechanical characteristics, and ten (22.2%) reported none.
3.2. Risk of bias and quality of the studies
The assessment results for risk of bias and quality of reporting related to randomization, blinding, sample size calculation, and time of day of NSAID administration or surgery are summarized in Figures 2 and 3. Scores for each study are presented in Supplementary File 2. Among the 45 included studies, 28 (62.2%) mentioned the term “randomization” at some step of the study, but none provided details on the method used. Only 12 (26.6%) studies reported blinding, which in most cases applied to the histological outcome assessment. Only five (13.3%) studies reported a sample size calculation, and no article specified the time of day at which the NSAID was administered or surgery was performed (day or night). Because of poor reporting, many risk-of-bias items on the assessment tool were scored as unclear. For example, “selective outcome reporting bias” was assessed as unclear for all studies because none reported using a research protocol defining primary and secondary outcomes.
3.3. Meta-analysis of NSAID administration during fractured bone healing
Nine studies were excluded from the meta-analysis because outcome data were missing, the outcome measurement was unsuitable, or the intervention was not a fracture (6, 42-49); the remaining thirty-six studies were included. Thirty-two studies compared the effect of administration of one or more NSAIDs on biomechanical characteristics (e.g., maximum force (MF) to fracture, stiffness, and work-to-failure) with a control group. For three-point mechanical bending properties, the analysis included 186 experiments covering different animal models, NSAID types, and measurement time points. Four and seven studies were included in the analysis of the effect of NSAID administration on the µ-CT and histological healing outcomes, respectively. For the mechanical bending maximum force of healing bones, data were collected on average 29.6 days after bone fracture (minimum, 5 days; maximum, 84 days).
3.4. Biomechanical assessment
Results from thirty studies including 94 comparisons showed that the maximum force to fracture was significantly decreased, indicating delayed bone healing, in animals that received an NSAID after bone fracture compared to the control group (SMD -0.58, 95% CI [-0.74, -0.42]; Table 1). Heterogeneity was moderate (I2, 55.04%). Similarly, animals that received an NSAID had an overall decrease in bone stiffness (SMD -0.56 [-0.76, -0.37]) and work-to-failure (SMD -0.58 [-0.95, -0.20]) compared to controls (Figure 4; Table 1). Between-study heterogeneity was moderate for both the stiffness (I2, 60.41%) and work-to-failure (I2, 56.29%) outcomes.
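To make the pooled quantities above concrete, the following is a minimal sketch of how an SMD, its 95% CI, and I2 can be obtained from group-level summary data. It assumes Hedges' g as the SMD and a DerSimonian-Laird random-effects model; the text does not state which estimator or software was used, so both choices, the function names, and the input values are illustrative assumptions rather than the analysis actually performed.

```python
import math


def hedges_g(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Bias-corrected SMD (Hedges' g) and its approximate sampling variance."""
    pooled_sd = math.sqrt(
        ((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / (n_t + n_c - 2)
    )
    d = (mean_t - mean_c) / pooled_sd
    j = 1 - 3 / (4 * (n_t + n_c) - 9)  # small-sample correction factor
    var_d = (n_t + n_c) / (n_t * n_c) + d**2 / (2 * (n_t + n_c))
    return j * d, j**2 * var_d


def pool_random_effects(effects, variances):
    """DerSimonian-Laird pooled effect, 95% CI, and I2 in percent.

    Expects at least two comparisons.
    """
    w = [1 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effect mean
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)  # between-study variance
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), i2


# Hypothetical comparisons (NOT data from the included studies):
# (mean, SD, n) of maximum force in the NSAID group, then in the control group.
comparisons = [
    (82.0, 15.0, 8, 95.0, 14.0, 8),
    (40.0, 9.0, 10, 44.0, 10.0, 10),
    (120.0, 30.0, 6, 150.0, 28.0, 6),
]
results = [hedges_g(*row) for row in comparisons]
effects = [g for g, _ in results]
variances = [v for _, v in results]
pooled, ci, i2 = pool_random_effects(effects, variances)
print(f"SMD {pooled:.2f} [{ci[0]:.2f}, {ci[1]:.2f}], I2 {i2:.1f}%")
```

With this sign convention (NSAID minus control), a negative SMD corresponds to lower maximum force, stiffness, or work-to-failure in the NSAID group, which is how the estimates above are read.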
We explored the sources of heterogeneity by examining the effect sizes in predefined subgroups: animal sex, age, and species, time of bone collection, and type of fractured bone. Animal age and type of bone were sources of heterogeneity for the maximum force to fracture, whereas time of sample collection and animal sex, age, and species were sources of heterogeneity for the stiffness analysis. Moreover, sex and time of sample collection were sources of heterogeneity in the work-to-failure analysis (Table 1).
Table 1 shows the subgroup analysis for three-point mechanical bending measurements. For the maximum force measurement, NSAID administration did not delay bone healing in mice (SMD -0.28 [-0.68, 0.10]) but did so in the other species. In addition, this measurement differed by bone model: the femur (SMD -0.68 [-0.88, -0.48]) showed a significant difference between NSAID and control groups, whereas the tibia did not (SMD -0.19 [-0.49, 0.10]). Moreover, when comparing SMD across bone models, the effect of NSAID administration was significantly larger in the femur than in the tibia (P=0.007; Figure 5b).
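As a rough plausibility check on the femur versus tibia comparison, the two subgroup estimates can be compared with a simple z-test, recovering each standard error from its reported 95% CI. This is only a sketch under the assumptions that the intervals are symmetric normal-based CIs and that the subgroups are independent; the between-subgroup test actually used for Figure 5b is not described here and may differ. The function names are illustrative.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% CI


def se_from_ci(lower, upper):
    """Standard error implied by a symmetric normal-based 95% CI."""
    return (upper - lower) / (2 * Z_95)


def subgroup_difference(est_a, ci_a, est_b, ci_b):
    """z statistic and two-sided p-value for the difference of two estimates."""
    se_a = se_from_ci(*ci_a)
    se_b = se_from_ci(*ci_b)
    z = (est_a - est_b) / math.sqrt(se_a**2 + se_b**2)
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p


# Subgroup estimates for maximum force as reported in the text above
z, p = subgroup_difference(-0.68, (-0.88, -0.48), -0.19, (-0.49, 0.10))
print(f"z = {z:.2f}, two-sided p = {p:.3f}")
```

Under these assumptions the check gives a z of about -2.7 and a two-sided p of about 0.007, in line with the reported P=0.007.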
Bone stiffness in mice (SMD -0.07 [-0.55, 0.40]) and in animals older than 16 weeks (SMD -0.31 [-0.82, 0.18]) did not differ between NSAID and control groups (Table 1). However, relative to their respective controls, bone healing was better in mice receiving NSAIDs than in rabbits (P=0.01; Figure 6a). The effect of NSAID administration was significantly different when the bone samples were harvested before 21 days compared to time points between 21 and 48 days after surgery (P=0.03; Figure 6c).
Regarding work-to-failure, there was no significant effect of NSAID administration on bone healing in female animals (SMD -0.09 [-0.55, 0.36]), in animals that received a selective cyclooxygenase-2 (COX-2) NSAID (SMD -0.58 [-1.26, 0.09]), or in the femur fracture model (SMD -0.38 [-0.83, 0.05]) (Table 1; Figure 7).
3.5. Micro-computed tomography assessment (bone assessment)
We included in the meta-analysis five comparisons from four studies that measured healing bone using a µ-CT scan. The average time of bone collection after animal euthanasia was 19.5 days (range, 17–21 days). Figure 8 and Table 2 show the distribution of the data. Subgroup analyses were not performed because the number of comparisons was small. The overall analysis showed a significant difference in bone volume measurements for animals that received an NSAID compared to controls (SMD -1.63 [-2.87, -0.39]), but this was associated with high heterogeneity among the studies (I2, 83.32%; p < 0.001).
3.6. Histomorphometric assessment
Seven studies including 33 experimental comparisons between NSAID administration and a control group showed no significant difference across the three histomorphometric measurements (callus size, cartilage, and bone tissue) (SMD -0.16 [-0.49, 0.17]; I2, 54.64%). Animal models and types of fractured bones were sources of heterogeneity for histomorphometric measurements of healing bones among studies (Figure 9; Table 3). Interestingly, no mouse model was used to study the histomorphometric measurements related to bone, cartilage, or callus size. In rat models, and in histomorphometric evaluations performed less than 21 days after surgery, bone healing was delayed in the NSAID group compared to controls (Table 3).
Moreover, when comparing SMD across animal species, bone models, and time of collection, the effect of NSAID administration was significantly larger in rats than in rabbits (P=0.01; Figure 10a), in the femur than in the fibula (P=0.02; Figure 10b), and in bone samples harvested less than 21 days after surgery (P=0.03; Figure 10c).
3.7. Publication bias
Possible publication bias was observed for the histomorphometric outcome measurements. Inspection of the funnel plot suggested asymmetry resulting from the underrepresentation of studies showing a negative effect of NSAIDs (Figure 11). Trim-and-fill analysis added five data points, indicating the presence of publication bias.