1. Literature retrieval, literature characteristics and quality assessment
The literature search yielded 3956 records. After processing in EndNote X9, duplicate records (n = 295) and inappropriate article types (n = 636) were excluded. During preliminary screening (title and abstract), we eliminated unrelated articles (n = 988) and experimental studies (n = 1967). During secondary screening (full text), we included the articles (n = 9) that met the eligibility criteria [20–28]. A flow diagram of the literature selection is presented in Fig. 1.
In total, 9 articles containing 27 trials were included in the present meta-analysis. We recorded the characteristics of each trial (trial number, first author, year, region, target miRNA, regulation mode, disease, sample size in the control and case groups, BMI, male proportion, and diagnostic SEN and SPE) and present them in Additional file 1: Tab S1. The results are summarized as follows: Asian trials (n = 11), non-Asian trials (n = 16); upregulation trials (n = 21), downregulation trials (n = 6); NASH trials (n = 13), NAFLD trials (not distinguishing between NAFL and NASH) (n = 14); BMI ≥ 30 kg/m2 (n = 15), < 30 kg/m2 (n = 9); male proportion ≥ 50% (n = 12), < 50% (n = 8).
The total sample size (control and case groups combined) of the 27 trials was 4036, of which 2764 were in NAFLD trials and 1272 were in NASH trials (Fig. 2A). We summarized the target miRNAs involved in the 27 trials; the top four by total sample size were miRNA-122 (n = 1104), miRNA-99a (n = 792), miRNA-34a (n = 642) and miRNA panels (more than one miRNA) (n = 368) (Fig. 2B). We constructed a Cochrane risk-of-bias graph to assess the quality of each included article according to the QUADAS-2 tool (Additional file 1: Fig S1).
2. Diagnostic Value of Serum miRNAs in NAFLD
2.1 Pooling all trials (No. 1–27) to evaluate the efficacy of miRNA in the diagnosis of total NAFLD
Significant heterogeneity existed among the trials, as shown by the I2 values for sensitivity and specificity (pooled I2 of SEN and SPE were 94.82% and 88.37%, respectively). Thus, we chose the random-effects model in our study. The pooled values were as follows: SEN 0.72 (95% CI: 0.64–0.79), SPE 0.81 (95% CI: 0.75–0.86), PLR 3.77 (95% CI: 2.86–4.96), NLR 0.34 (95% CI: 0.27–0.44), DOR 10.95 (95% CI: 7.05–17.01), AUROC 0.84 (95% CI: 0.80–0.87) (Fig. 3). These results indicated that serum miRNA had moderate diagnostic accuracy for total NAFLD.
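The relationship between the pooled indices can be illustrated with a minimal Python sketch. It assumes the textbook definitions PLR = SEN / (1 − SPE), NLR = (1 − SEN) / SPE and DOR = PLR / NLR; the function name `likelihood_ratios` is hypothetical and is not part of the analysis software used in this study. Because the pooled estimates reported above come from a random-effects model and are rounded, the values computed this way are expected to agree only approximately.

```python
def likelihood_ratios(sen, spe):
    """Return (PLR, NLR, DOR) implied by one sensitivity/specificity pair.

    Illustrative helper only (hypothetical name); it applies the textbook
    definitions and does not reproduce the random-effects pooling itself.
    """
    plr = sen / (1.0 - spe)    # positive likelihood ratio
    nlr = (1.0 - sen) / spe    # negative likelihood ratio
    dor = plr / nlr            # diagnostic odds ratio
    return plr, nlr, dor


# Pooled SEN and SPE for total NAFLD, as reported in the text.
plr, nlr, dor = likelihood_ratios(0.72, 0.81)
print(f"PLR={plr:.2f} NLR={nlr:.2f} DOR={dor:.2f}")
```

With the rounded pooled SEN and SPE as input, this gives roughly 3.79, 0.35 and 10.96, close to the reported PLR 3.77, NLR 0.34 and DOR 10.95.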
Next, we evaluated the efficacy of the most studied serum miRNAs (miRNA-122, miRNA-99a and miRNA-34a) in the diagnosis of total NAFLD. (1) The pooled values of miRNA-122: SEN 0.84 (95% CI: 0.77–0.90; I2 = 80.62%), SPE 0.72 (95% CI: 0.61–0.81; I2 = 85.44%), PLR 3.01 (95% CI: 2.12–4.27), NLR 0.22 (95% CI: 0.14–0.33), DOR 13.79 (95% CI: 7.29–26.06), AUROC 0.86 (95% CI: 0.82–0.89) (Additional file 1: Fig S2). (2) The pooled values of miRNA-99a: SEN 0.82 (95% CI: 0.71–0.89; I2 = 93.46%), SPE 0.82 (95% CI: 0.53–0.95; I2 = 96.90%), PLR 4.58 (95% CI: 1.30–16.12), NLR 0.22 (95% CI: 0.11–0.47), DOR 20.42 (95% CI: 2.86–146.00), AUROC 0.87 (95% CI: 0.84–0.90) (Additional file 1: Fig S3). (3) The pooled values of miRNA-34a: SEN 0.81 (95% CI: 0.76–0.85; I2 = 5.73%), SPE 0.83 (95% CI: 0.77–0.87; I2 = 33.16%), PLR 4.70 (95% CI: 3.51–6.30), NLR 0.23 (95% CI: 0.18–0.29), DOR 20.34 (95% CI: 13.08–31.60), AUROC 0.85 (95% CI: 0.82–0.88) (Additional file 1: Fig S4). These results indicated that all three miRNAs had similar, moderate diagnostic accuracy for total NAFLD. Notably, miRNA-34a showed the lowest heterogeneity and might therefore be more suitable for diagnosing NAFLD.
2.2 Pooling trials (No. 1–13) and trials (No. 14–27) separately to evaluate the efficacy of miRNA in the diagnosis of NASH and NAFLD
(1) The pooled values of NASH trials: SEN 0.74 (95% CI: 0.66–0.81; I2 = 74.97%), SPE 0.85 (95% CI: 0.77–0.91; I2 = 79.60%), PLR 5.01 (95% CI: 3.11–8.05), NLR 0.31 (95% CI: 0.23–0.42), DOR 16.24 (95% CI: 8.17–32.28), AUROC 0.86 (95% CI: 0.83–0.89) (Additional file 1: Fig S5).
(2) The pooled values of NAFLD trials: SEN 0.71 (95% CI: 0.58–0.81; I2 = 96.88%), SPE 0.76 (95% CI: 0.68–0.83; I2 = 90.57%), PLR 2.99 (95% CI: 2.24–3.99), NLR 0.38 (95% CI: 0.26–0.55), DOR 7.93 (95% CI: 4.66–13.49), AUROC 0.80 (95% CI: 0.77–0.84) (Additional file 1: Fig S6).
These results indicated that serum miRNA had moderate diagnostic accuracy in both the NASH trials and the NAFLD trials. Importantly, serum miRNA showed better diagnostic efficacy in NASH than in NAFLD, with higher DOR and AUROC and lower heterogeneity.
2.3 Pooling trials (No. 9–13) to evaluate the diagnostic efficacy in distinguishing between NAFL and NASH
The pooled values were: SEN 0.83 (95% CI: 0.70–0.91; I2 = 86.22%), SPE 0.85 (95% CI: 0.74–0.92; I2 = 85.11%), AUROC 0.91 (95% CI: 0.88–0.93) (Additional file 1: Fig S7). These results indicated that serum miRNA had high accuracy in discriminating NASH from NAFL.
3. Subgroup Analysis
We divided the trials (No. 1–27) into subgroups according to several categories and calculated the pooled values of each subgroup. The categories were region, type of disease, regulation mode, male proportion and BMI (Table 1). The results were as follows:
(1) Region. Compared with Asian trials, non-Asian trials had higher SEN (0.77 vs. 0.64) and SPE (0.83 vs. 0.76) and markedly lower heterogeneity (SEN I2 81.33% vs. 96.08%, SPE I2 76.03% vs. 92.08%). Non-Asian trials also had higher DOR (17 vs. 6) and AUROC (0.87 vs. 0.77);
(2) Type of disease. The results have already been presented in Sect. 2.2;
(3) Regulation mode. Compared with upregulated-mode trials, downregulated-mode trials had lower SEN (0.66 vs. 0.74) but higher SPE (0.89 vs. 0.78), and showed lower heterogeneity (SEN I2 59.82% vs. 95.92%, SPE I2 58.57% vs. 90.07%). Downregulated-mode trials had higher DOR (16 vs. 10) but the same AUROC as upregulated-mode trials (0.83 vs. 0.83);
(4) Male proportion. Compared with trials with male proportion ≥ 50%, trials with male proportion < 50% had higher SEN (0.77 vs. 0.64) and SPE (0.87 vs. 0.74) and markedly lower heterogeneity (SEN I2 81.17% vs. 95.58%, SPE I2 58.46% vs. 90.78%). Trials with male proportion < 50% also had higher DOR (21 vs. 5) and AUROC (0.89 vs. 0.75);
(5) BMI. Compared with trials with BMI < 30 kg/m2, trials with BMI ≥ 30 kg/m2 had higher SEN (0.77 vs. 0.63) and SPE (0.84 vs. 0.75) and markedly lower heterogeneity (SEN I2 82.34% vs. 96.69%, SPE I2 76.67% vs. 93.11%). Trials with BMI ≥ 30 kg/m2 also had higher DOR (17 vs. 5) and AUROC (0.87 vs. 0.76).
In summary, serum miRNA showed higher diagnostic accuracy for total NAFLD in trials that were non-Asian, NASH-related, female-predominant and with BMI ≥ 30 kg/m2. From the perspective of heterogeneity, all five factors above should be considered potential sources of heterogeneity.
4. Meta-regression
To identify the significant sources of heterogeneity, we performed meta-regression. Region, disease type, miRNA regulation mode and miRNA profiling were set as covariates. Because some data were missing, male proportion and BMI were not included in the meta-regression. The covariates were coded as follows: region (Yes = Asian, No = non-Asian), disease (Yes = NASH, No = NAFLD), regulation (Yes = upregulation, No = downregulation) and miRNA profiling (Yes = single miRNA, No = miRNA panel). As shown in Fig. 4, only the region factor (Asian trials) showed a statistically significant effect (SEN P < 0.01, SPE P < 0.001), suggesting that region (Asian trials) was a significant source of heterogeneity.
In fact, the heterogeneity attributed to region was likely to be mainly due to differences in BMI: (1) of the 16 non-Asian trials, 15 had BMI ≥ 30 kg/m2 and 1 did not report BMI; of the 11 Asian trials, 9 had BMI < 30 kg/m2 and 2 did not report BMI; (2) Asian and non-Asian trials were essentially the same in terms of diagnostic criteria, measurement methods, etc.; (3) the statistical values of the subgroups categorized by region were very close to those of the subgroups categorized by BMI.
Therefore, although BMI was missing in some trials, we speculated that BMI (< 30 kg/m2) was likely to be the true source of heterogeneity. When we successively removed trials with BMI < 30 kg/m2, trials with male proportion ≥ 50% and NAFLD trials with upregulated miRNA, the heterogeneity of SEN and SPE showed an overall downward trend compared with the original values: SEN I2 94.82% vs. 81.28% vs. 80.39% vs. 0%, SPE I2 88.37% vs. 72.92% vs. 76.92% vs. 25.99% (Additional file 1: Tab S2).
5. Clinical Utility
We drew Fagan's nomograms for the NASH trials (No. 1–13) and the NAFLD trials (No. 14–27), as shown in Fig. 5A-B. The pre-test probability was set at 50%. The results were as follows: (1) In NASH trials, if a patient had a positive serum miRNA test, the probability of having NASH was 83%; if the result was negative, the probability of having NASH was 24%. (2) In NAFLD trials, if a patient had a positive serum miRNA test, the probability of having NAFLD was 75%; if the result was negative, the probability of having NAFLD was 27%. This indicated that serum miRNA had a higher positive diagnostic value for NASH than for NAFLD. Furthermore, a likelihood ratio scattergram was constructed for the NASH trials (Fig. 5C). Nine of the 13 trials were located in the right lower quadrant, which represents neither exclusion nor confirmation, indicating that the serum miRNA test had limited clinical utility for NASH. Thus, further optimization of serum miRNA testing methods is needed.
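The post-test probabilities read from a Fagan nomogram follow from Bayes' theorem in odds form: post-test odds = pre-test odds × likelihood ratio. The short Python sketch below applies this to the pooled PLR and NLR reported in Sect. 2.2; the function name `post_test_probability` is hypothetical and the code is only an illustration of the arithmetic, not the plotting routine used for Fig. 5.

```python
def post_test_probability(pre_test, likelihood_ratio):
    """Convert a pre-test probability to a post-test probability.

    Uses the odds form of Bayes' theorem, which is what a Fagan
    nomogram represents graphically. Illustrative helper only.
    """
    pre_odds = pre_test / (1.0 - pre_test)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Pooled likelihood ratios from Sect. 2.2; pre-test probability of 50%.
for label, plr, nlr in [("NASH", 5.01, 0.31), ("NAFLD", 2.99, 0.38)]:
    after_positive = post_test_probability(0.50, plr)
    after_negative = post_test_probability(0.50, nlr)
    print(f"{label}: positive test {after_positive:.1%}, "
          f"negative test {after_negative:.1%}")
```

With these rounded inputs the sketch gives about 83% and 24% for NASH and about 75% and 27.5% for NAFLD, in line with the nomogram values up to rounding. In every case the number is the probability that the disease is present after the test result.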
6. Publication Bias
Based on Deeks' funnel plots (Fig. 6A-C), no publication bias was detected in the studies in which serum miRNA was used to detect total NAFLD (P = 0.77), NAFLD (P = 0.84) and NASH (P = 0.29). In addition, as shown in Fig. 6D, serum miRNA-34a did not show publication bias when used to detect total NAFLD (P = 0.46).
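For readers unfamiliar with Deeks' funnel plot asymmetry test, the following Python sketch outlines the underlying regression: ln(DOR) of each trial is regressed on 1/√ESS, weighted by ESS, where ESS = 4·n1·n2/(n1 + n2) is the effective sample size and n1, n2 are the numbers of diseased and non-diseased subjects. The function name `deeks_regression`, the 0.5 continuity correction applied to every cell, and the 2×2 counts in the example are all assumptions made for illustration; the counts are invented and are not data from the included trials.

```python
import math


def deeks_regression(trials):
    """Weighted regression behind Deeks' funnel plot asymmetry test.

    trials: list of (TP, FP, FN, TN) tuples, one per trial.
    Returns (slope, t_statistic) for ln(DOR) regressed on 1/sqrt(ESS).
    Illustrative sketch only; the p-value is not computed here.
    """
    xs, ys, ws = [], [], []
    for tp, fp, fn, tn in trials:
        n_dis, n_non = tp + fn, fp + tn
        ess = 4.0 * n_dis * n_non / (n_dis + n_non)  # effective sample size
        # Assumption: add 0.5 to every cell so ln(DOR) is always defined.
        ln_dor = math.log(((tp + 0.5) * (tn + 0.5)) / ((fp + 0.5) * (fn + 0.5)))
        xs.append(1.0 / math.sqrt(ess))
        ys.append(ln_dor)
        ws.append(ess)

    # Weighted least squares with weights equal to ESS.
    sw = sum(ws)
    x_bar = sum(w * x for w, x in zip(ws, xs)) / sw
    y_bar = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    rss = sum(w * (y - intercept - slope * x) ** 2
              for w, x, y in zip(ws, xs, ys))
    se_slope = math.sqrt(rss / (len(xs) - 2) / sxx)
    return slope, slope / se_slope


# Hypothetical 2x2 counts (TP, FP, FN, TN); not from the included trials.
example = [(45, 10, 15, 50), (30, 8, 10, 40), (80, 20, 25, 90),
           (20, 5, 12, 30), (60, 12, 18, 70)]
slope, t_stat = deeks_regression(example)
print(f"slope={slope:.3f}, t={t_stat:.3f}")
```

The t statistic would be compared with a t distribution on (number of trials − 2) degrees of freedom to obtain the P value; a non-significant slope, as reported above, is interpreted as no evidence of funnel plot asymmetry.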