The research protocol was approved by the Institutional Review Board of Hunan Xiangya Stomatological Hospital, Central South University (20240030). The Gene Expression Omnibus (GEO; https://www.ncbi.nlm.nih.gov/geo/info/datasets.html) is a gene expression database created and maintained by the National Center for Biotechnology Information (NCBI). Established in 2000, it collects high-throughput gene expression data submitted by research institutions around the world. The Series Matrix File of GSE215403 was downloaded from the NCBI GEO public database, and expression profile data from a total of 12 cases of oral squamous cell carcinoma (gingival and buccal tissue) were included.
2. Single Cell Analysis
First, the expression profile was read with the Seurat package, and low-quality cells were filtered out by retaining only cells with nFeature_RNA > 200, percent.mt < 10, and nCount_RNA < 200000. The data were then subjected to standardization, normalization, PCA, Harmony, and UMAP analysis. ElbowPlot was used to determine the optimal number of principal components (PCs), the positional relationships among the clusters were obtained through UMAP analysis, and the clusters were annotated to cell types that are important in the occurrence of the disease. Finally, we extracted marker genes for each cell subtype from the single-cell expression profile by setting the logfc.threshold parameter of FindAllMarkers to 1.
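The filtering thresholds above can be illustrated with a minimal Python sketch. It is not the Seurat call used in the study; the cell records and barcodes are hypothetical, and only the three thresholds are taken from the text.

```python
# Hypothetical per-cell QC metrics; the keys mirror the Seurat metadata
# columns named in the text (percent.mt is written percent_mt here).
cells = [
    {"barcode": "cell_a", "nFeature_RNA": 1500, "nCount_RNA": 6000, "percent_mt": 3.2},
    {"barcode": "cell_b", "nFeature_RNA": 150, "nCount_RNA": 400, "percent_mt": 1.0},
    {"barcode": "cell_c", "nFeature_RNA": 2500, "nCount_RNA": 9000, "percent_mt": 18.5},
    {"barcode": "cell_d", "nFeature_RNA": 8000, "nCount_RNA": 250000, "percent_mt": 2.0},
]


def passes_qc(cell, min_features=200, max_percent_mt=10.0, max_counts=200000):
    """Return True when a cell satisfies all three thresholds from the text."""
    return (
        cell["nFeature_RNA"] > min_features
        and cell["percent_mt"] < max_percent_mt
        and cell["nCount_RNA"] < max_counts
    )


# Keep only the barcodes of cells that pass every criterion.
kept = [cell["barcode"] for cell in cells if passes_qc(cell)]
print(kept)
```

In this toy input, cell_b fails the feature threshold, cell_c fails the mitochondrial threshold, and cell_d fails the count threshold, so only cell_a is retained.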
3. Cell communication
CellCall is a toolkit for inferring intercellular communication networks and internal regulatory signals by integrating intracellular and intercellular signals. It collects ligand-receptor-transcription factor (L-R-TF) axis datasets on the basis of KEGG pathways. Using this prior knowledge of L-R-TF interactions, CellCall combines the expression of ligands/receptors with the activity of the TFs downstream of a given L-R pair to infer intercellular communication.
4. GSVA (gene set variation analysis)
Gene set variation analysis (GSVA) is a nonparametric, unsupervised method for assessing gene set enrichment. GSVA converts gene-level changes into pathway-level changes by comprehensively scoring the gene set of interest and then determining the biological function of the sample. In this study, gene sets were downloaded from the Molecular Signatures Database, and the GSVA algorithm was used to score each gene set comprehensively to evaluate potential biological function changes in different samples.
5. GSEA
Patients were divided into high- and low-expression groups according to their SPINK1 expression, and GSEA was used to further analyse the differences in signalling pathways between the two groups. The background gene set was the version 7.0 annotated gene set downloaded from the MSigDB database, which served as the annotated gene set for subtype pathways. Differential expression analysis of pathways between the groups was performed, and significantly enriched gene sets (adjusted p value less than 0.05) were sorted on the basis of the consistency score. GSEA is often used in research that closely combines disease classification with biological significance.
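As an illustration of the statistic at the core of GSEA, the following Python sketch computes the weighted running-sum enrichment score for one gene set over a ranked gene list. It is a simplified sketch, not the software used in this study: it omits the permutation-based significance testing and multiple-testing adjustment that full GSEA implementations perform, and the gene names and ranking scores are hypothetical.

```python
def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Running-sum enrichment score for one gene set.

    ranked_genes: genes sorted from most up- to most down-regulated.
    scores: ranking metric for each gene, in the same order.
    gene_set: collection of genes whose enrichment is tested.
    """
    in_set = [gene in gene_set for gene in ranked_genes]
    n_miss = len(ranked_genes) - sum(in_set)
    hit_weights = [abs(s) ** weight if hit else 0.0 for s, hit in zip(scores, in_set)]
    total_hit_weight = sum(hit_weights)
    if total_hit_weight == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the ranked list without covering it")

    running, best = 0.0, 0.0
    for w, hit in zip(hit_weights, in_set):
        # Step up at genes in the set, step down at genes outside it.
        running += w / total_hit_weight if hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running  # maximum deviation from zero
    return best


# Hypothetical ranked list (high-expression group vs. low-expression group).
genes = ["g1", "g2", "g3", "g4", "g5", "g6"]
metric = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]
es_top = enrichment_score(genes, metric, {"g1", "g2"})
es_bottom = enrichment_score(genes, metric, {"g5", "g6"})
print(es_top, es_bottom)
```

A gene set concentrated at the top of the ranking yields a positive score, and one concentrated at the bottom yields a negative score.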
6. Cell Lines, Cell Culture Conditions, and Transfection
Human oral keratinocytes (HOKs) and the head and neck squamous cell carcinoma (HNSC) cell lines SCC9, SCC25, HN4, HN30, and CAL27 were procured from the China Center for Type Culture Collection. These cells were cultured in Dulbecco's modified Eagle's medium (DMEM; Biological Industries, Israel, C3120-0500) supplemented with 10% heat-inactivated fetal bovine serum (Biological Industries, Israel, 04-001-1C), 100 units/mL penicillin, and 100 µg/mL streptomycin, and were maintained at 37°C in a humidified 5% CO2 atmosphere. The shSPINK1 plasmid (GeneCopoeia, China, XM_017009906) was constructed using the GV248 vector. A SPINK1 overexpression vector (GV658, CMV enhancer-MCS-polyA-EF1A-zsGreen-sv40-puromycin) was created based on the SPINK1 reference sequence (NM_001379610.1) from the National Center for Biotechnology Information (NCBI) database. Transient transfection was performed using Lipofectamine 3000 (Invitrogen, USA, L3000015) following the manufacturer's instructions. The transfected cells were cultured and harvested for subsequent assays.
7. RNA extraction and real-time polymerase chain reaction (PCR) analysis
Total RNA was extracted using TRIzol Reagent, and cDNA was synthesized with the HiScript II Q RT SuperMix for qPCR Reagent Kit (Vazyme, China, R123-01). Real-time PCR was conducted with ChamQ Universal SYBR qPCR Master Mix (Vazyme, China, Q711-02/03). GAPDH served as the control, and the PCR products were quantified and normalized. The PCR primers for the SPINK1 gene were as follows:
forward: 5'-ATAGGATCCGCCATGAAGGTAACAGGCATC-3',
reverse: 5'-GGCGAATTCGCAAGGCCCAGATTTTTGAAT-3'.
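The text states only that the PCR products were normalized with GAPDH as the control. One widely used way to do this is the 2^(-ΔΔCt) method, sketched below in Python. The choice of this method is an assumption, as are the Ct values and the function name.

```python
def relative_expression(ct_target_sample, ct_ref_sample, ct_target_control, ct_ref_control):
    """Fold change of the target gene by the 2^(-ddCt) method.

    The reference gene (GAPDH in the text) corrects for input differences;
    the control condition serves as the calibrator.
    """
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: SPINK1 and GAPDH in a tumor line vs. a control line.
fold_change = relative_expression(
    ct_target_sample=24.0, ct_ref_sample=18.0,
    ct_target_control=26.0, ct_ref_control=18.0,
)
print(fold_change)  # 4.0
```

Here the sample ΔCt is 6 and the control ΔCt is 8, so ΔΔCt is -2 and the target is four-fold higher in the sample.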
8. Western blot analysis
Cells were collected when they reached 80%-90% confluence. Protein samples (30–40 µg) were separated using 6% or 8% sodium dodecyl sulfate-polyacrylamide gel electrophoresis and transferred to 0.45 µm PVDF membranes (MERCK, Germany, IPVH00010). After the membranes were blocked with 4% BSA at room temperature for 1 hour, the appropriate primary antibodies were added, and the membranes were incubated overnight at 4°C. The primary antibodies used for Western blot analysis were anti-SPINK1 (Bioss, China, bs-1385R, 1:1000) and anti-GAPDH (Bioss, China, bs-13282R, 1:1000). IRDye-labelled secondary antibodies (Bioss, China, bs-40295G-IRDye8, 1:10000) were used. The signals were detected via an Odyssey Infrared Imaging System (Tanon, China).
9. Cell Proliferation Analysis
Following transient transfection with SPINK1 expression vectors, tumor cells were seeded in triplicate at a density of 1000 cells per well in 96-well plates. Cell proliferation assays were conducted via the Cell Counting Kit-8 (Dojindo, Japan, CK04). At 0, 12, 24, 48, and 72 hours postseeding, the growth kinetics of the tumor cells were monitored via a microplate reader system (Tanon, China).
10. Wound-healing Assay
Tumor cells were grown in 6-well plates and transfected or pretreated according to the protocol until they were fully confluent. Once they reached 90%-100% confluency, the cells were serum-starved for 24 hours. They were then manually scratched with a 10 µL pipette tip (time, 0 h), washed twice with 2 mL of PBS for 5 minutes each, and incubated in serum-free DMEM. At 24 hours postscratching, five nonoverlapping fields were randomly imaged. The migration boundaries of the tumor cells were automatically determined via ImageJ software (NIH, USA, v1.8.0.172), and the boundaries with the highest migratory activity were manually selected and highlighted with red lines.
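The text does not give a formula for quantifying migration from the ImageJ boundaries. A common choice, assumed here, is the percentage of wound closure relative to the 0 h wound area, averaged over the imaged fields. The Python sketch below uses hypothetical area values and function names.

```python
def wound_closure_percent(area_0h, area_24h):
    """Percentage of the initial wound area that has closed after 24 h."""
    if area_0h <= 0:
        raise ValueError("initial wound area must be positive")
    return (area_0h - area_24h) / area_0h * 100.0


def mean_closure(fields):
    """Average closure over (area_0h, area_24h) pairs, one pair per imaged field."""
    values = [wound_closure_percent(a0, a24) for a0, a24 in fields]
    return sum(values) / len(values)


# Hypothetical wound areas (arbitrary units) for five nonoverlapping fields.
fields = [(1000.0, 250.0), (1000.0, 500.0), (800.0, 400.0), (800.0, 200.0), (1000.0, 250.0)]
print(mean_closure(fields))  # 65.0
```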
11. Transwell Migration and Invasion Assay
For the migration assay, Transwell chambers with an 8.0 µm pore size (Corning, USA, 3422) were used, while invasion assays employed Transwell chambers coated with Matrigel on the upper surface (BD, USA, 356234). Migrated or invaded cells were fixed and stained with crystal violet, and images of five randomly selected areas were captured. Cell counts were conducted using Image-Pro Plus software (Media Cybernetics).
12. Statistical analysis
Statistical analysis of the GEO data was performed with R software (version 4.2.2), using a significance level of p < 0.05. GraphPad Prism 7.0 was employed for analyzing and interpreting experimental data. Student’s t-test was used to compare differences between two groups, with p < 0.05 indicating significant differences.
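For reference, the two-group comparison can be sketched in Python as the classic pooled-variance Student's t statistic. This is a minimal illustration with hypothetical measurements, not the GraphPad Prism procedure itself. It returns the t statistic and degrees of freedom; the p value would come from the t distribution, which the standard library does not provide.

```python
from math import sqrt
from statistics import mean, variance


def student_t(group_a, group_b):
    """Two-sample Student's t statistic assuming equal variances."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df


# Hypothetical triplicate measurements for two groups.
t_stat, dof = student_t([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
print(round(t_stat, 4), dof)  # -3.6742 4
```

With 4 degrees of freedom, the two-sided critical value at p = 0.05 is about 2.776, so this toy comparison would be called significant.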