Data resources and workflow
Data for constructing and validating prognostic models were obtained from the TCGA and GEO databases. The TCGA-UVM dataset was used to construct the prognostic models and contains bulk RNA-seq data and clinical information for 80 UM patients. Another cohort GSE22138 include 63 UM patients were chosen from the GEO database (https://www.ncbi.nlm.nih.gov/geo/). Baseline clinicopathological information, metastatic status, and final clinical outcome were recorded for each patient. The two single-cell transcriptome sequencing datasets used to probe mechanisms were obtained from the GEO database. The single-cell dataset GSE138665 contains tumor single-cell transcriptome data from 6 primary UM patients. The single-cell database GSE139829 contains tumor single-cell transcriptome data from 8 primary UM patients. The flowchart of this study is showed in the Fig. 1.
Metastasis related genes identification and functional enrichment analysis
After removing the 1 sample with missing transfer status, the remaining 79 samples from the TCGA-UVM dataset were divided into two subgroups according to transfer status, and subsequently the R packages "DESeq2", "edgeR" and "limma" to assess the differential mRNA expression between the metastatic and non-metastatic groups. The criteria for selecting metastasis related genes (MRGs) included |log2(fold change)| >Mean [log2(fold change)] + 2SD[log2(fold change)]and a false discovery rate (FDR)-adjusted p < 0.05.The MRGs that were co-upregulated or co-downregulated in the three R packages were considered to be metastasis-associated genes. The results were visualized using the R packages "ggplot2" and "MetaVolcanoR".
To explore potential biological processes related to the obtained MRGs, we performed gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG)enrichment analysis using the R package "clusterProfifiler". The GO enrichment analysis was conducted based on three aspects including biological process (BP), molecular functions (MF) and cellular components (CC). The results were visualized using R packages "ggplot2" and "BioEnricher".
Survival analysis and prognostic modeling
Survival was estimated by the Kaplan-Meier method using MGRs, and any differences in survival were evaluated with log-rank test and univariate Cox regression. Genes with p-values < 0.05 for both tests were considered to be associated with prognosis. Least absolute shrinkage and selection operator (LASSO) regression analysis was then performed using the R package "glmnet" to avoid overfitting. Subsequently, multivariate Cox regression analysis (significance level: p < 0.05) was performed to establish a prognostic model. According to the corresponding regression coefficient calculated by multivariate Cox regression analysis, the risk score can be calculated. The risk score was computed as follow:\(\:\text{R}\text{i}\text{s}\text{k}\text{s}\text{c}\text{o}\text{r}\text{e}=\sum\:_{i=0}^{n}(\:{Coef}_{i}\:\times\:{Exp}_{i})\)
Coefi is the corresponding regression coefficient calculated by multivariate Cox regression analysis and Expi is the expression value of the genes selected by multivariate Cox regression analysis.
Evaluation of prognostic model
The TCGA-UVM dataset was used as the internal validation dataset GSE22138 was used as the external validation dataset, and the samples were divided into high-risk and low- risk groups according to the survival model, respectively. Kaplan-Meier survival analyses were performed using the R packages "survival" and "survminer" to compare the OS and MFS between different groups. R package "timeROC" were used to establish time-dependent receiver operating characteristic curves (time-ROC) to evaluate the predictive power of the model. The larger the area under the curve (AUC), the stronger the predictive ability of the risk model.
Cell culture
The human UM cell line MuM-2B were purchased from iCell Bioscience Inc. (iCELL-h148; Shanghai, China). Cells were cultured in RPMI- 1640 medium (Gibco, USA) supplemented with 10% FBS (Gibco, USA) and 1% penicillin–streptomycin in an incubator containing 5% CO2 at 37°C. The cell experiments were performed in the logarithmic growth period.
Quantitative real-time PCR
Total RNA from MUM2B cells were extracted using the SteadyPure Quick RNA Extraction kit (AG21023; Accurate Biotechnology, Beijing, China). And the reverse-transcribed using EVo M-MLV RT Mix Kit with gDNA Clean for gPCR Ver.2( AG11728; Accurate Biotechnology, Beijing, China) for qPCR. The resulting cDNAs were used for PCR using the SYBR® Green Premix Pro Taq HS qPCR Kit ( Rox Plus )(AG11718-S, Accurate Biotechnology, Hunan). qPCR was performed using the StepOne Real-Time PCR system(4376 600; Applied Biosystems, Carlsbad, CA, USA). The PCR program was as follows: pre-denaturation at 95°C for 30 s, and 40 cycles of 5 s at 95°C and30 s at 60°C. The following primer sequences (all 5′ to 3′) were used for quantita-tive real-time PCR: EDNRD Forward: 5’- CCATTGGCCATCACTGCATTT − 3’, Reverse: 5’-:CCGTCTCTGCTTTAGGTGATCAT − 3’; LURAP Forward: 5’- CATGGAAACCTGGCTCATCA − 3’, Reverse: 5’-GCTCTCTATCTTGGGTAACAGCA − 3’༛SLC25A38 Forward: 5’- GGGTCTCCTTGGGTATGTTCTT − 3’, Reverse: 5’- CTCCTCTCCTCACATCCAGTCTT − 3’༛All data were normalized against endogenous glyceraldehyde 3-phosphate dehydrogenase (GAPDH) controls of each. sample Relative expressionwas calculated using the 2 − ΔΔCT method. Five biological replicates were included for each assay with three technical repli-cates for each biological replicate.
Transfection experiment
The siRNA mimics (si- NC, si-EDNRB, si-LURAP1, si-SLC25A38) and negative control (si-NC) were transfected into MUM-2B cells using a RiboFECT™ CP Transfection Kit (RiboFECT CP, Guangzhou, China), Transfection was carried out according to the manufacturer’s instructions.
Wound healing assay
MUM2B cells were seeded into 6-well plates with a density of 1.0 × 105 cells per well, and they adhered to the wall of the dish after incubating for 24 hours. As described above, add corresponding transfection kit when reagent the density of the cells grows to between 30%~50%. When cells were cultured until 100% confluence, a straight line was scratched with a sterile pipette tip and the cells originally in the area of the line were removed. The medium was then refreshed with the medium. Samples were taken at the time points of 0, 24 and 48 hours after scratching. Images were captured under a microscope at ×5 magnification.
Cell viability assay
Cell viability was evaluated by a CCK-8 kit (DOJINDO, Japen) according to the manufacturer’s protocol. The ratio of cell viability was calculated as Atreat/Acontrol× 100%.
Apoptosis assay
Cell apoptosis was determined with an AnnexinV-Alexa Fluor 647/Pi kit (FXP023, 4A Bio tech, Suzhou, China). Briefly, cells were seeded in 6-well plates (1×105/well). Transfection was carried out as described previously. After incubation with transfection kit for 48 ~ 72 h, cells were harvest and resuspended with 500 µl 1× binding buffer. 5 µL of Annexin V-FITC was added into the cells suspension, and 5 µL of PI was added to mix by gentle pipetting. The samples were analyzed with DxP Athena cell analyzer (Cytek Biosciences, USA). The data analysis was performed using Flow-Jo software.
Tumor microenvironment analysis
In order to quantify Immune cell infiltration, the R package "estimate" was first used to calculate the immunity score, stromal score and estimate score for each sample22. The scores of the various immune cells in each sample were then calculated using "TIMER"23,24. Finally, the R package "corrplot" was used to calculate the correlation between the various immunization scores and the risk scores.
Single-cell transcriptome data quality control and integration
For each single-cell sequencing sample first screen the cells according to the follow-ing criteria:1) unique molecular identifier (UMI) > 400; 2) 100 < genes detected per cell < 8000; 3) percentage of mitochondrial genes < 10%; and 4) complexity (logUMI (genes detected per cell)) > 0.8. Then the R package "DoubletFinder" was used to remove double cells. Finally, the R package "harmony" was used to remove the batch effect between samples to complete the data integration.
Dimensionality reduction and clustering
The integrated data were analyzed for dimensionality reduction and clustering analysis using the R package "Seurat". First, the Log-Normalize and ScaleData function are applied to the filtered and integrated data. The FindVariableFeature function was used to filter out the top 2000 highly variable genes for further analysis by principal component analysis (PCA); the percentage change between each principal component (PC) and subsequent components was calculated. If the percentage is less than 5%, the current PC number is selected for subsequent analysis. Then dimensionality reduction analysis is performed using the t-Distributed Stochastic Neighbor Embedding (tSNE) algorithms, and finally cluster analysis was analyzed using the FindClusters function. Cell types were annotated based on marker genes expression. The markers used were as follow-s: tumor cells (MLANA, MITF), immune cell (PTPRC), T cells (CD3D, CD3E), T cells subset: cytotoxic CD8 + T (CD8A, PRF1, GZMA, GZMK, NKG7), native T cells (IL7R), T regulatory cells (FOXP3, TNFRSF4, IKZF2, IL2RA), natural killer cells (GNLY, TRDC), myeloid cells(CSF1R), myeloid cells subset: macrophages(CD163),mitotic macrophages (MKI67), M2 macrophages (C1QA, C1QB, C1QC, IL10), M1 macrophages (C1QA, C1QB, C1QC,lackof M2 macrophage markers), monocytes (VCAN, FCN1, CD300E, S100A12, EREG, STXBP2, ASGR1) ,dendritic cells (CD1C), fibroblasts (MGP), microglial cells (FTL, APOC1), astrocyte (CCL2, GADD45B), Schwann cells (S100A1, CRYAB), neural progenitor cells (STMN1, UBE2C,TUBA1B), melanocytes (TYRP1,PMEL), Photoreceptor cells (RCVRN).
Afterwards, the risk score of each sample was calculated by the survival model described above, and the samples were categorized into high-risk and low- risk groups based on the risk score for downstream analyses.
Cellular communication analysis
In order to investigate the communication between various types of cells in the UM microenvironment, we used the R package "CellChat" for Cellular communication analysis. The Seurat objects were first divided into two groups, high risk and low risk, leaving the cell types that were present in both groups, then the Seurat objects were converted to CellChat objects, and finally cellular communication was analyzed for both groups using the default parameters. Use mergeCellChat function to integrate the data of the two groups.
Pseudotime analysis
We used a subset of cytotoxic CD8 + T cells from Seurat objects for the proposed time analysis. Cytotoxic CD8 + T cells were first analyzed by Dimensionality reduction and clustering according to the above methods. Use the FindAllmarker function to find the marker genes for each subset and annotate these subsets using these marker genes. Then the R package "monocle2" was used to perform the pseudotime analysis and calculate the pseudotime value for each cell. The starting point of differentiation is then inferred using the R package "CytoTRACE"25. Finally, the BEAM function was used to infer key regulatory genes during differentiation.