Baseline characteristics. The clinical baseline data are presented in Table 1, which includes age in years, number of pregnancies, and number of deliveries, all expressed as the mean ± standard deviation. There were no significant differences in age (P = 0.293), number of pregnancies (P = 0.452), or number of deliveries (P = 0.266) among the three groups of women.
Table 1
Basic information of the participants
Characteristic | NILM (N = 78) | LSIL (N = 52) | HSIL (N = 26) | P-value |
Age (years) | 35.60 ± 7.843 | 37.50 ± 9.648 | 37.91 ± 3.22 | 0.293 |
Pregnancies (times) | 2.40 ± 1.761 | 2.04 ± 1.414 | 2.27 ± 1.343 | 0.452 |
Deliveries (times) | 1.12 ± 0.581 | 0.94 ± 0.639 | 1.12 ± 0.711 | 0.266 |
Leucorrhea cleanliness, n (%) | | | | |
I-II° | 47 (60.3%) | 25 (48.1%) | 8 (30.8%) | - |
III° | 26 (33.3%) | 16 (30.8%) | 9 (34.6%) | - |
IV° | 5 (6.4%) | 11 (21.2%) | 9 (34.6%) | - |
H2O2 positive, n (%) | 69 (88.5%) | 50 (89.3%) | 25 (96.2%) | - |
Sialidase positive, n (%) | 19 (24.4%) | 14 (8.7%) | 9 (34.6%) | - |
Leukocyte esterase positive, n (%) | 18 (23.1%) | 10 (17.9%) | 7 (26.9%) | - |
Note: Age (F = 1.237, P = 0.293), number of pregnancies (F = 0.798, P = 0.452), and number of deliveries (F = 1.336, P = 0.266) did not differ significantly among the three groups of women (P > 0.05). Leucorrhea cleanliness grades III° and IV° indicate poor cleanliness. Leucorrhea cleanliness differed significantly among the three groups (chi-square test, P = 0.029).
Among the patients with HR-HPV infection under study, 93 (59.6%) had single-type infections. The most common type was HPV 16, found in 28 cases (17.9%), followed by HPV 52 with 12 cases (7.7%), HPV 18 with 10 cases (6.4%), and HPV 58 with 10 cases (6.4%). There were 43 cases (27.6%) with double-type infections and 20 cases (12.8%) with multiple-type infections. Clinical information regarding leucorrhea for each group is presented in Table 1. A chi-square test across the three groups indicated statistically significant differences in leucorrhea cleanliness (P = 0.029). The progression of cervical lesions in HR-HPV-infected patients was associated with worsening leucorrhea cleanliness, which was further accompanied by an increase in positive results for hydrogen peroxide, sialidase, and leukocyte esterase.
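The reported P = 0.029 can be checked against the counts in Table 1. The following is a minimal Python sketch, assuming (the text does not state this explicitly) that grades III° and IV° were pooled as "poor cleanliness" and compared with grades I-II° in a 2 × 3 Pearson chi-square test. The function name `pearson_chi_square` is illustrative and not part of the original analysis.

```python
import math

def pearson_chi_square(table):
    """Return the Pearson chi-square statistic and degrees of freedom."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of grade and group.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return stat, dof

# Counts from Table 1; columns are NILM, LSIL, HSIL.
good = [47, 25, 8]                 # grades I-II
poor = [26 + 5, 16 + 11, 9 + 9]    # grades III and IV pooled (assumption)

stat, dof = pearson_chi_square([good, poor])
# With 2 degrees of freedom the chi-square survival function is exp(-x / 2).
p_value = math.exp(-stat / 2)
print(f"chi2 = {stat:.3f}, dof = {dof}, P = {p_value:.3f}")
```

Under this pooling the statistic is about 7.11 with 2 degrees of freedom, which corresponds to P ≈ 0.029 and agrees with the value reported in Table 1.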
Analysis of CVF sample metabolites. A total of 164 metabolites were identified in the CVF samples. The total ion chromatograms (TICs) of the CVF samples obtained by GC-MS are shown in Fig. 1. The relative standard deviation (RSD) of the internal standard was less than 12%, and the RSD of the quality control (QC) samples was less than 17.2%. Mass accuracy was controlled within 10 ppm, and the variation in retention time was less than 0.1 minute. OPLS-DA models clearly separated the CVF samples in both comparisons: NILM versus SIL (LSIL + HSIL) and LSIL versus HSIL. The fit and predictive capability of each OPLS-DA model were assessed using R2Y and Q2, respectively. A permutation P-value of less than 0.01 supported the models' reliability and indicated the absence of overfitting, as shown in Figs. 2B and D. Figure 2 indicates that the metabolic patterns in HSIL patients were significantly different from those in NILM and LSIL patients.
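For readers unfamiliar with the RSD criterion used above, the sketch below shows how it is typically computed: the sample standard deviation divided by the mean, expressed as a percentage. The peak areas are hypothetical numbers chosen for illustration, not data from this study, and the use of the sample (rather than population) standard deviation is an assumption.

```python
import statistics

def relative_standard_deviation(values):
    """Return the RSD (coefficient of variation) in percent."""
    mean = statistics.mean(values)
    # statistics.stdev is the sample standard deviation (n - 1 denominator).
    return 100.0 * statistics.stdev(values) / mean

# Hypothetical internal-standard peak areas across injections
# (illustrative values only, not measurements from this study).
internal_standard_areas = [100.0, 110.0, 90.0]
rsd = relative_standard_deviation(internal_standard_areas)
print(f"RSD = {rsd:.1f}%")
```

An RSD computed this way would be compared against the acceptance limit, for example the 12% bound reported for the internal standard.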
Differential metabolites and pathway analysis in cervical lesions. A total of 108 differential metabolites were identified between the NILM group and the SIL group (LSIL + HSIL), with the highest-ranking metabolites including O-phosphothreonine, N-methylglutamic acid, tryptophan, melezitose, glutamine, gamma-aminobutyric acid, alpha-aminoadipic acid, sedoheptulose, N-methylalanine, and spermidine. Detailed information was provided in Table S1.
A total of 31 differential metabolites were identified between the LSIL group and the HSIL group, with the highest-ranking metabolites including sucrose, erythrose, iminodiacetic acid, glucose-6-phosphate, pyrogallol, 2,3-dihydroxypyridine, N-methylalanine, tyrosine minor, and erythronic acid lactone. Detailed information is provided in Table S2.
The heatmap showed that most differential metabolites were less abundant in the LSIL group than in the NILM and HSIL groups (Fig. 3A). Ten differential metabolites were shared between the two comparisons, NILM versus SIL and LSIL versus HSIL (Fig. 3B). These 10 potential markers were analyzed with MetaboAnalyst, and pathway topology analysis was used to identify relevant pathways with an impact value > 0.1. This analysis identified 5 metabolic pathways that may be related to the markers: (A) starch and sucrose metabolism, (B) phenylalanine metabolism, (C) neomycin, kanamycin, and gentamicin biosynthesis, (D) butanoate metabolism, and (E) the TCA cycle (Fig. 3C).
Diagnostic performance of a 10-metabolite panel in cervical lesions. The 10 differential metabolites differed significantly between the LSIL group and the HSIL group (Fig. 4). Eight of them, 2,3-dihydroxypyridine, DL-p-hydroxyphenyllactic acid, erythrose, glucose-6-phosphate, gluconic acid lactone, guanine, N-methylalanine, and phenylacetaldehyde, were increased in the HSIL group, while succinic acid and sucrose were decreased. The same metabolites also differed significantly between the NILM group and the SIL group (Figure S1).
The performance of the 10-metabolite panel for HR-HPV-infected cervical lesions was assessed using the area under the curve (AUC) calculated by the random forest algorithm. The metabolite panel in CVF demonstrated good accuracy in distinguishing between the NILM group and the SIL group (LSIL + HSIL), with an AUC of 0.811 (Fig. 5A), and between the LSIL group and the HSIL group, with an AUC of 0.928 (Fig. 5B).
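As an illustration of how an AUC such as those above is obtained from classifier output, the sketch below computes it from per-sample scores, for example the predicted class probabilities of a random forest. It uses the rank-based (Mann-Whitney) definition of the AUC: the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The labels and scores are hypothetical, and the function name `roc_auc` is illustrative; this is not the authors' pipeline.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical labels (1 = HSIL, 0 = LSIL) and classifier scores;
# illustrative values only, not data from this study.
labels = [0, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80]
auc = roc_auc(labels, scores)
print(auc)
```

In this toy example three of the four positive-negative pairs are ranked correctly, so the AUC is 0.75; a value of 0.5 corresponds to chance and 1.0 to perfect separation.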
Phenylacetaldehyde is an aromatic fatty aldehyde primarily derived from the metabolism of fatty acids in animals. It has been shown to inhibit cancer cell proliferation and promote apoptosis in these cells. Furthermore, phenylacetaldehyde reduces the expression of cancer stem cell (CSC) markers, including CD44+/CD24- and ALDH1. It also preferentially induces the production of reactive oxygen species (ROS), decreases the phosphorylation of nuclear Stat3, and lowers IL-6 secretion, thereby inhibiting activation of the Stat3 signaling pathway. Because of these properties, phenylacetaldehyde is considered a potential therapeutic agent against cancer cells and cancer stem cells [21]. In our study, phenylacetaldehyde levels were increased in the HSIL group, which may reflect a protective compensatory mechanism in cervical lesions.
Glucose-6-phosphate is formed when glucose, including glucose derived from other sugars, is phosphorylated after entering cells [25]. In our study, we observed an upregulation of glucose-6-phosphate and a downregulation of sucrose, indicating that glucose metabolism is enhanced in cervical lesions. This increased metabolism may contribute to viral replication and exacerbate the severity of the lesions [26, 27].
Succinic acid is an intermediate of the tricarboxylic acid (TCA) cycle within host cells. It can be metabolized by somatic cells, and current research indicates that succinic acid and its derivatives exhibit anticancer activity by inducing cell apoptosis [22-24]. In our study, succinic acid levels were decreased, which suggests that high-risk HPV infection may be accompanied by increased cellular proliferation. This proliferation may exacerbate disease progression and raise the potential for further advancement toward cervical cancer.
Analysis of the bacterial community in cervical lesions. We analyzed 156 CVF samples to assess the bacterial composition in the vagina. Alpha diversity was compared using rarefaction curves. The curve was steeper in the HSIL group than in the other groups, while the LSIL group had a gentler curve than the NILM group, indicating that vaginal microbiota diversity may initially decrease before significantly increasing (Fig. 6A). Using the Sobs index, we found significant differences in alpha diversity between the HSIL group and both the NILM and LSIL groups (P < 0.05). Microbial diversity decreased slightly from NILM to LSIL (P > 0.05) but increased significantly from LSIL to HSIL (P = 0.0042) and from NILM to HSIL (P = 0.0182) (Fig. 6B). Beta diversity analysis via principal coordinate analysis (PCoA) based on binary Jaccard distance showed distinct separation of the HSIL group from the others (ANOSIM test, P < 0.05), with no significant difference between the NILM and LSIL groups (P > 0.05) (Fig. 6C, D).
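The two diversity measures named above have simple definitions, sketched below under stated assumptions. The Sobs index is the number of taxa observed in a sample, and the binary Jaccard distance between two samples is one minus the fraction of shared taxa among all taxa present in either sample, ignoring abundance. The read counts are hypothetical and serve only as an illustration; the function names are not taken from the authors' software.

```python
def sobs(counts):
    """Observed richness: number of taxa with a nonzero count."""
    return sum(1 for c in counts.values() if c > 0)

def binary_jaccard_distance(counts_a, counts_b):
    """1 - |A ∩ B| / |A ∪ B| on presence/absence of taxa."""
    present_a = {taxon for taxon, c in counts_a.items() if c > 0}
    present_b = {taxon for taxon, c in counts_b.items() if c > 0}
    union = present_a | present_b
    if not union:
        return 0.0  # two empty samples are treated as identical
    return 1.0 - len(present_a & present_b) / len(union)

# Hypothetical genus-level read counts for two samples
# (illustrative values only, not data from this study).
sample_a = {"Lactobacillus": 950, "Gardnerella": 40,
            "Streptococcus": 10, "Sneathia": 0}
sample_b = {"Lactobacillus": 300, "Gardnerella": 400,
            "Atopobium": 200, "Sneathia": 100}

richness_a = sobs(sample_a)
richness_b = sobs(sample_b)
distance = binary_jaccard_distance(sample_a, sample_b)
print(richness_a, richness_b, round(distance, 2))
```

Here the two samples share 2 of the 5 genera present in either, so the distance is 0.6. Because the metric uses presence and absence only, a shift in dominance from Lactobacillus to Gardnerella does not change it unless taxa appear or disappear.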
At the genus level, we identified 152 genera that were common to the three groups. Lactobacillus was the most abundant genus, followed by Gardnerella. As the severity of cervical lesions progressed, the abundance of Atopobium and Sneathia increased, while that of Streptococcus decreased (Fig. 7A, C). The NILM group exhibited 149 unique species, the LSIL group had 86, and the HSIL group had 73 (Fig. 7B). The Vaginal Microbial Health Index (VMHI) is a measure of health status based on species-level classification features of vaginal microbiome samples. Our analysis revealed that the HSIL group had significantly lower VMHI scores than the LSIL group, and VMHI scores were also notably lower in the SIL group than in the NILM group (Figure S2).
This study found a close association between cervicovaginal microbiota and cervical cancer in patients with persistent HR-HPV infection. The cervicovaginal environment is home to various bacteria, with Lactobacillus species frequently recognized as biomarkers of health [28]. These bacteria play a crucial role in maintaining a balanced micro-ecosystem, which can be disrupted by changes in the host, leading to inflammation and viral infections. HPV infection and clearance rates are correlated with the presence of Langerhans cells and a microbiome dominated by Lactobacillus gasseri [13, 29]. Additionally, Lactobacillus enhances the immune response by stimulating phagocytes and promoting the production of cytokines, including IL-10, IL-12, IFN-γ, and TNF-α [30]. A 16-week cohort study revealed that HPV-positive women had a vaginal microbiota predominantly composed of Lactobacillus iners and anaerobic bacteria [13]. Notably, Lactobacillus iners may hinder HPV clearance, as women with this species experience lower clearance rates than those with Lactobacillus crispatus [20].
A study has shown that in women lacking Lactobacillus crispatus, the competitive proliferation of three bacterial groups (Lactobacillus iners, Gardnerella vaginalis, and Anaerococcus vaginalis) can lead to the development of squamous intraepithelial lesions (SIL) [31]. Additionally, some research indicates that Lactobacillus may have a plasmid-like function that allows it to integrate the HPV16 gene, thereby preventing the virus from integrating with host cells [32]. The loss of Lactobacillus dominance promotes the colonization of anaerobic bacterial species, increasing microbial diversity. This shift often results in changes to immunity and epithelial homeostasis through various mechanisms, which can, in turn, facilitate HPV infection [18].
While the vaginal microbiota of most HPV-infected women primarily consists of Lactobacillus and Gardnerella, the alpha diversity of vaginal microbiota tends to decrease initially with increasing cervical lesion grade before significantly rising. The biological diversity of vaginal microecology in patients with high-grade squamous intraepithelial lesions (HSIL) is markedly different from that in patients with low-grade squamous intraepithelial lesions (LSIL). As the grade of cervical lesions increases, the abundance of Atopobium and Sneathia rises, while Streptococcus abundance decreases. These changes in microbial composition may affect the metabolic capacity of the cervical and vaginal microenvironment.
Correlation between vaginal microbiota and metabolites. We analyzed the correlation between the 10 differential metabolites and the 50 most abundant bacterial genera. Metabolites were first screened using variance inflation factor (VIF) analysis, and those with a VIF greater than 10, which indicates strong multicollinearity, were excluded. Correlation coefficients between the selected metabolites and the top 50 bacterial genera were then computed and displayed as a heatmap, revealing significant associations (Fig. 8). Sucrose was positively correlated with Flavobacterium, a Gram-negative bacterium that can produce acid from sugar and may cause inflammation when the immune system is weak [33]. Phenylacetaldehyde was positively correlated with Pseudomonas, in which phenylacetaldehyde dehydrogenase converts it to phenylacetic acid as part of a detoxification pathway [34]. Succinic acid showed positive correlations with Klebsiella, DNF00809, and Sneathia, but negative correlations with Gardnerella and Veillonella. Gardnerella, a key pathogen in bacterial vaginosis (BV), increases succinic acid concentrations, contributing to inflammation [35]. Glucose-6-phosphate was positively correlated with Shuttleworthia and Dialister, and negatively correlated with Aerococcus and Peptoniphilus.
The consumption of key amino acid metabolites indicates disrupted cell metabolism in HPV-positive and SIL patients, suggesting potential HPV-induced cervical-vaginal malnutrition [36]. This study observed an increase in Lactobacillus-related metabolites, such as DL-p-hydroxyphenyllactic acid, as well as in sugar metabolism-related metabolites such as erythritol and glucose-6-phosphate, in SIL. This suggests that glycolysis is already enhanced at the SIL stage, potentially contributing to the development of cervical cancer. Additionally, increased carbohydrate consumption in cervical cancer tissue can lead to elevated lactate production.
Even when oxygen levels are sufficient, cancer cells tend to generate energy through glycolysis rather than oxidative phosphorylation. This mechanism allows cancer cells to rapidly produce adenosine triphosphate (ATP), providing a quick energy source. Furthermore, the intermediate products of glycolysis support the biosynthetic needs of rapidly proliferating tumor cells [26].