The circRNA expression profile in the LSCC and control groups
A circRNA microarray was used to profile expression, with three samples in each group. Hierarchical clustering analysis and volcano plot distribution were used to identify the aberrantly expressed circRNAs between groups. As presented in Fig. 1a, circRNA expression levels differed between the groups. We used the following criteria for further screening: (i) P value <0.05; (ii) CT value <35; (iii) detection rate >75%. A total of 122 circRNA transcripts were upregulated in the LSCC group compared with the control group, whereas 98 circRNAs were downregulated (Fig. 1b). To identify potential biomarkers for LSCC, we selected the top 20 upregulated circRNAs in the LSCC group as candidate diagnostic markers (Fig. 1c).
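The screening criteria above can be sketched in a few lines of Python. This is a minimal illustration only: the record fields, the demo values, and the use of log2 fold change to rank the "top" upregulated transcripts are assumptions made for the example and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class CircRNA:
    # All fields are hypothetical names chosen for this sketch.
    name: str
    p_value: float
    ct_value: float
    detection_rate: float    # fraction of samples in which the transcript was detected
    log2_fold_change: float  # LSCC vs control; assumed ranking metric


def passes_screen(c, max_p=0.05, max_ct=35.0, min_detection=0.75):
    """Apply the three criteria: P < 0.05, CT < 35, detection rate > 75%."""
    return (c.p_value < max_p
            and c.ct_value < max_ct
            and c.detection_rate > min_detection)


def top_upregulated(candidates, n=20):
    """Keep screened, upregulated circRNAs and return the n with the largest fold change."""
    kept = [c for c in candidates if passes_screen(c) and c.log2_fold_change > 0]
    return sorted(kept, key=lambda c: c.log2_fold_change, reverse=True)[:n]


# Invented demo records, not study data.
demo = [
    CircRNA("circ_A", 0.01, 28.0, 0.90, 2.1),
    CircRNA("circ_B", 0.20, 27.0, 0.95, 3.0),   # fails the P value criterion
    CircRNA("circ_C", 0.03, 36.5, 0.80, 1.5),   # fails the CT criterion
    CircRNA("circ_D", 0.04, 30.0, 0.60, 1.2),   # fails the detection rate criterion
    CircRNA("circ_E", 0.02, 31.0, 0.85, -1.4),  # passes the screen but is downregulated
    CircRNA("circ_F", 0.001, 25.0, 1.00, 3.4),
]

selected = [c.name for c in top_upregulated(demo, n=20)]
print(selected)
```

With real data, n=20 would return the 20 highest-ranked upregulated transcripts; here only two demo records survive the screen.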
Validation of significantly dysregulated circRNAs in plasma
A two-stage validation of candidate circRNAs was performed, with 20 paired samples as the training set and 100 paired samples as the validation set. Twenty paired LSCC and control samples were randomly selected, and the top 20 circRNAs were validated first. Among the 20 candidate circRNAs, 10 were confirmed to be significantly differentially expressed; however, only 5 were consistent with the microarray results, whereas the remaining circRNAs showed decreased levels in the LSCC group.
Based on the results in the training set, the remaining 100 paired samples were enrolled as the validation set. We next examined the expression of the five candidate circRNAs in the validation set. We found that hsa_circ_0055202, hsa_circ_0074920 and hsa_circ_0043722 were confirmed to have higher expression levels in LSCC, whereas hsa_circ_0010178 showed no significant difference. hsa_circ_0009760 showed higher expression in LSCC; however, the P value was <0.05.
Two-stage validation of candidate circRNAs was performed, with 20 paired samples as the training set and 100 paired samples as the validation set. As shown in Fig. 2a, 20 pairs of LSCC and control samples were randomly selected, and the top 20 circRNAs were validated first. Among the 20 candidate circRNAs, three circRNAs, named circ_0019201, circ_0011773 and circ_0122790, were confirmed to be differentially expressed and consistent with the microarray results. However, another four circRNAs were inconsistent with the microarray results, and the remaining circRNAs showed no difference between the LSCC group and the control group (Fig. 2b).
Based on the results of the training set, the remaining 100 paired samples were enrolled in the validation set. We then examined the expression of the three candidate circRNAs in the validation set. As shown in Fig. 3, circ_0019201, circ_0011773 and circ_0122790 were confirmed to have higher expression levels in LSCC. Furthermore, the expression of the three circRNAs decreased markedly after surgical excision.
Risk score analysis
To further explore the accuracy and specificity of these three circRNAs as a potential signature, risk score formulas were used to evaluate their diagnostic value. First, we divided the control and case groups in the training set according to the upper limit of the 95% confidence interval (95% CI) of the control group. Logistic regression analysis was used to calculate the risk score. All plasma samples were then divided into a high-risk group (predicted LSCC) and a low-risk group (predicted controls). We defined the cutoff as the value that maximized the sum of sensitivity and specificity. The positive predictive value (PPV) and negative predictive value (NPV) calculated in the training set were 90% and 85%, respectively. We then applied the same values to calculate the risk score for the validation set samples, yielding a PPV and NPV of 90% and 89%, respectively (Table 2). In addition, we used ROC curve analysis to evaluate the diagnostic value of the circRNAs for LSCC. In the training set, the areas under the ROC curve of the three circRNAs were 0.933, 0.908 and 0.965, respectively, and the combination of the three distinguished LSCC patients from the control group well. In the larger validation set, the areas under the ROC curve of the three circRNAs were 0.766, 0.864 and 0.908, respectively. For the three circRNAs combined, the area under the ROC curve for distinguishing LSCC patients from the control group was 0.951 (Fig. 4).
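As an illustration of the evaluation steps described above, the following Python sketch selects the cutoff that maximizes sensitivity + specificity, computes PPV and NPV at that cutoff, and computes the area under the ROC curve through its rank-based (Mann-Whitney) equivalent. It assumes that a risk score per sample is already available (for example, from a fitted logistic regression). The function names and all scores and labels are hypothetical and are not study data.

```python
def confusion(scores, labels, cutoff):
    """Count TP, FP, FN, TN, calling a sample high-risk when score >= cutoff."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 1)
    tn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 0)
    return tp, fp, fn, tn


def best_cutoff(scores, labels):
    """Return the observed score that maximizes sensitivity + specificity.

    Assumes both classes are present; on ties the lowest cutoff is kept.
    """
    best_sum, best_cut = None, None
    for cutoff in sorted(set(scores)):
        tp, fp, fn, tn = confusion(scores, labels, cutoff)
        total = tp / (tp + fn) + tn / (tn + fp)
        if best_sum is None or total > best_sum:
            best_sum, best_cut = total, cutoff
    return best_cut


def ppv_npv(scores, labels, cutoff):
    """Positive and negative predictive values at a fixed cutoff."""
    tp, fp, fn, tn = confusion(scores, labels, cutoff)
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return ppv, npv


def auc(scores, labels):
    """AUC as the probability that a case outscores a control (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented risk scores; label 1 = LSCC, 0 = control.
train_scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
train_labels = [1, 1, 1, 1, 0, 0, 0, 0]
val_scores = [0.85, 0.5, 0.35, 0.45, 0.25, 0.15]
val_labels = [1, 1, 1, 0, 0, 0]

cutoff = best_cutoff(train_scores, train_labels)          # chosen on the training set only
train_ppv, train_npv = ppv_npv(train_scores, train_labels, cutoff)
val_ppv, val_npv = ppv_npv(val_scores, val_labels, cutoff)  # same cutoff reused
train_auc = auc(train_scores, train_labels)

print(cutoff, train_ppv, train_npv, val_ppv, val_npv, train_auc)
```

Choosing the cutoff on the training set and reusing it unchanged on the validation set keeps the validation PPV and NPV free of any tuning on the validation samples.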
Stability detection of circRNAs in plasma samples
We next amplified the three circRNAs in five healthy controls. We incubated human plasma obtained from three healthy controls at room temperature for 0 h, 12 h and 24 h, subjected it to 5 freeze-thaw cycles, stored it at -80 °C for about 7 days, and digested it with RNase. The expression levels of the three circRNAs were not altered under any of these conditions, indicating that circ_0019201, circ_0011773 and circ_0122790 were stably expressed and detectable in human plasma (Fig. 5).