3.1 Comparison of General Characteristics
The primary sources of pulmonary metastatic carcinoma were rectal cancer (n=7), colon cancer (n=4), lower limb osteosarcoma (n=1), nasal cancer (n=1), breast cancer (n=2) and liver cancer (n=1). Postoperative pathological types in the benign pulmonary nodule group included lymphadenopathy (n=11), tuberculous granulomatosis (n=4), sclerosing alveolar cell tumor (n=2), organizing pneumonia (n=5), hamartoma (n=3), leiomyoma (n=2), solitary fibroma (n=1), fungal infection (n=1), atypical adenomatous hyperplasia (n=2) and epithelioid hemangioendothelioma (n=1).
The general characteristics, including smoking history, comorbid disease, history of malignant tumors and neoadjuvant therapy, were compared between the Discovery Set and the Validation Set of primary lung cancer. None of the comparisons showed a significant difference (p>0.05). The primary lung cancers in this study comprised six pathological types: adenocarcinoma (n=131, 81.9%), squamous cell carcinoma (n=20, 12.5%), small cell lung cancer (n=4, 2.5%), carcinoid (n=3, 1.9%), carcinosarcoma (n=1, 0.6%) and large cell lung cancer (n=1, 0.6%). The distribution of the six pathological types did not differ significantly between the Discovery Set and the Validation Set (p=0.669). The pathological stages of primary lung cancer were stage 0 (n=21, 13.1%), stage I (n=103, 64.4%), stage II (n=10, 6.3%), stage III (n=23, 14.4%) and stage IV (n=3, 1.9%). There was no significant difference in pathological stage between the Discovery Set and the Validation Set (p=0.526; Table S2).
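As a consistency check, the percentages quoted for pathological type and stage can be recomputed from the reported counts, which sum to 160 primary lung cancer cases in both breakdowns. The sketch below does only that arithmetic; the variable and function names are illustrative and are not part of the study.

```python
# Counts as reported in the text for primary lung cancer.
pathology = {
    "adenocarcinoma": 131,
    "squamous cell carcinoma": 20,
    "small cell lung cancer": 4,
    "carcinoid": 3,
    "carcinosarcoma": 1,
    "large cell lung cancer": 1,
}
stage = {"stage 0": 21, "stage I": 103, "stage II": 10, "stage III": 23, "stage IV": 3}


def percentages(counts):
    """Return each count as an unrounded percentage of the group total."""
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


pathology_pct = percentages(pathology)
stage_pct = percentages(stage)

# Print unrounded values so that any rounding choice stays visible.
for name, pct in {**pathology_pct, **stage_pct}.items():
    print(f"{name}: {pct:.3f}%")
```

Both dictionaries sum to the same total, so the two breakdowns describe the same set of cases.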
3.2 Lung Cancer Screening in Healthy People
First, the four groups of the Discovery Set were compared as a whole to obtain the overall difference among them. The healthy population group was then compared with each of the other three groups. The differences in plasma metabolic profiles between the groups were analyzed, and the low-molecular metabolites responsible for these differences were selected.
3.2.1 Overall comparison of the four groups
The data of the four groups were normalized with MetaboAnalyst 4.0. The data before and after normalization are compared in Figure S2. The levels of some low-molecular metabolites differed markedly among the four groups (Figure 3A and Figure S3). Five low-molecular metabolites were selected according to their VIP values and p values. Their intensity values in each group are shown in Figure 2. In non-parametric tests, the p values were all less than 0.001, indicating that the levels of the five metabolites differed significantly among the four groups.
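The text does not name the non-parametric test used for the four-group comparison. A minimal sketch, assuming a Kruskal-Wallis test on the intensity of a single metabolite, is given below. The intensity values are hypothetical and serve only to make the example runnable; the tie correction is omitted for brevity, and the p value uses the chi-square approximation with 3 degrees of freedom (four groups).

```python
import math


def average_ranks(values):
    """Rank values from 1 to N, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (without tie correction)."""
    pooled = [x for g in groups for x in g]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    acc, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        acc += rank_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n_total * (n_total + 1)) * acc - 3.0 * (n_total + 1)


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square distribution with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)


# Hypothetical intensities of one metabolite in the four groups (not study data).
hpg = [5.1, 5.3, 5.0, 5.4, 5.2]
bpn = [6.1, 6.3, 6.0, 6.4, 6.2]
pmc = [7.1, 7.3, 7.0, 7.4, 7.2]
plc = [8.1, 8.3, 8.0, 8.4, 8.2]

h_stat = kruskal_wallis_h([hpg, bpn, pmc, plc])
p_value = chi2_sf_df3(h_stat)
print(f"H = {h_stat:.3f}, p = {p_value:.2e}")
```

With small groups or many tied values, an exact or tie-corrected implementation from a statistics library would be the safer choice.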
3.2.2 Comparison of the healthy group with each of the three pulmonary nodule groups
The three pulmonary nodule groups (PMC, BPN and PLC) were each compared with the HPG. The two groups were clearly separated in each comparison (Figure 3B). The major low-molecular metabolites were selected in each comparison (Table 1).
To test the ability of these metabolites to discriminate between healthy people and patients with pulmonary nodule lesions, ROC curves were plotted for the major low-molecular metabolites (Figure 4). The area under the curve (AUC) of every metabolite was greater than 0.9, indicating that all of these low-molecular metabolites have a high discriminating ability. The optimal critical point of each ROC curve and its corresponding sensitivity and specificity are shown in Table 1.
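The AUC and the optimal critical point of a single metabolite can be illustrated with a short sketch. The text does not state how the optimal critical point was chosen; the example assumes Youden's index (sensitivity + specificity - 1) and assumes that a higher intensity indicates a pulmonary nodule. The intensity values and function names are hypothetical.

```python
def roc_points(scores, labels):
    """Return (threshold, sensitivity, specificity) for every candidate cutoff.

    A sample is called positive when its score is >= the threshold.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = []
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < t)
        points.append((t, tp / n_pos, tn / n_neg))
    return points


def auc(scores, labels):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_cutoff(scores, labels):
    """Cutoff that maximizes Youden's J = sensitivity + specificity - 1."""
    return max(roc_points(scores, labels), key=lambda p: p[1] + p[2] - 1.0)


# Hypothetical intensities (not study data): label 0 = healthy, 1 = nodule.
healthy = [4.8, 5.1, 5.3, 5.6, 6.0, 7.1]
nodule = [6.5, 6.9, 7.2, 7.5, 7.8, 8.1]
scores = healthy + nodule
labels = [0] * len(healthy) + [1] * len(nodule)

area = auc(scores, labels)
cutoff, sensitivity, specificity = youden_cutoff(scores, labels)
print(f"AUC = {area:.3f}, cutoff = {cutoff}, "
      f"sensitivity = {sensitivity:.3f}, specificity = {specificity:.3f}")
```

For a metabolite that is lower in the nodule group, the scores would be negated (or the labels swapped) before the same functions are applied.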
3.3 Differential Diagnosis of the Pulmonary Nodules
According to pathology, pulmonary nodules were divided into three main types: primary lung cancer, benign pulmonary nodules, and pulmonary metastatic carcinoma. To help determine the nature of lung nodules before surgery, the three types of pulmonary nodules were compared by means of plasma metabolomics.
3.3.1 Overall comparison of the three pulmonary nodule groups
The PMC, BPN and PLC groups were compared as a whole (Figure 3C). The validity of the model was further assessed with a permutation test using 200 permutations, which showed that the model was valid (Figure S4). The major low-molecular metabolites were selected from 27 metabolites (Figure S5): anabasine, octanoylcarnitine, 2-methoxyestrone, retinol, decanoylcarnitine, calcitroic acid, glycogen and austalide L.
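The idea behind a permutation test with 200 permutations can be sketched as follows. This is not the multivariate model validation reported in Figure S4: for simplicity the example swaps in a univariate statistic, the fraction of variance in one metabolite explained by group membership, and asks how often randomly shuffled group labels reach the observed value. The data, the statistic and the function names are assumptions made for illustration.

```python
import random


def between_group_fraction(values, labels):
    """Fraction of the total sum of squares explained by group membership."""
    grand = sum(values) / len(values)
    total_ss = sum((v - grand) ** 2 for v in values)
    between_ss = 0.0
    for g in set(labels):
        members = [v for v, lab in zip(values, labels) if lab == g]
        mean_g = sum(members) / len(members)
        between_ss += len(members) * (mean_g - grand) ** 2
    return between_ss / total_ss


def permutation_test(values, labels, n_permutations=200, seed=0):
    """Return the observed statistic and its permutation p value."""
    rng = random.Random(seed)
    observed = between_group_fraction(values, labels)
    shuffled = list(labels)
    at_least = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break the link between samples and groups
        if between_group_fraction(values, shuffled) >= observed:
            at_least += 1
    # Add 1 to numerator and denominator so the p value is never exactly zero.
    return observed, (at_least + 1) / (n_permutations + 1)


# Hypothetical intensities of one metabolite (not study data).
pmc = [5.0, 5.2, 4.9, 5.1, 5.3, 4.8]
bpn = [7.0, 7.2, 6.9, 7.1, 7.3, 6.8]
plc = [9.0, 9.2, 8.9, 9.1, 9.3, 8.8]
values = pmc + bpn + plc
labels = ["PMC"] * 6 + ["BPN"] * 6 + ["PLC"] * 6

observed, p_value = permutation_test(values, labels, n_permutations=200)
print(f"observed fraction = {observed:.3f}, permutation p = {p_value:.4f}")
```

A model is regarded as valid in this sense when the statistic obtained with the true labels clearly exceeds the values obtained with permuted labels.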
3.3.2 Pairwise comparisons of the three pulmonary nodule groups
The PMC, BPN and PLC groups were compared in pairs. First, BPN was compared with PLC (Figure 3D), and the two groups were clearly different. The main low-molecular metabolites selected from 26 metabolites were octanoylcarnitine, decanoylcarnitine and PGF2a ethanolamide. The differences in these three metabolites between the two groups are shown in Figure 5A. ROC curves were plotted for these metabolites, and the AUCs were 0.974, 0.965, and 0.881, respectively. The optimal critical points giving the best sensitivity and specificity were 6.85, 3.50, and 6.89, respectively (Table 1). Second, PMC was compared with BPN. Tyrosine, indoleacrylic acid and LysoPC(16:0) were selected. Their ROC curves and box plots are shown in Figure 5B. Finally, PMC was compared with PLC. The metabolites selected were octanoylcarnitine, retinol, and decanoylcarnitine. Figure 5C shows the clear differences in these three metabolites between the two groups. Detailed information is given in Table 1.
3.4 Validation of Metabolites in Primary Lung Cancer and Healthy People
A total of six major differential low-molecular metabolites were found in the Discovery Set (Table 1). ROC curves of these metabolites were plotted for the Validation Set, and their AUCs were all greater than 0.95, indicating that these six metabolites have a strong ability to differentiate patients with primary lung cancer from healthy people (Figure 4D).
3.5 Further Analysis of Primary Lung Cancer
In this study, the pathological types of primary lung cancer included adenocarcinoma, squamous cell carcinoma, small cell lung cancer, carcinoid, carcinosarcoma, and large cell lung cancer. All samples were grouped by pathological type. Multivariate statistical analysis of each group showed no significant difference in the spatial distribution of the six groups of samples in the score scatter plots. This indicates that the levels of low-molecular metabolites do not differ significantly among the pathological types of lung cancer. The same analysis was performed separately on the primary lung cancer samples of the Discovery Set and of the Validation Set, with the same results.
All primary lung cancer samples were also grouped by postoperative pathological stage (stage 0, I, II, III and IV). Multivariate statistical analysis of each group showed no significant difference in the spatial distribution of the five groups of samples in the score scatter plots. This indicates that the levels of low-molecular metabolites do not differ significantly among the pathological stages. The same analysis was performed separately on the primary lung cancer samples of the Discovery Set and of the Validation Set, with the same results.