1. Data sources
The Surveillance, Epidemiology, and End Results (SEER) database collects information on cancer from registries sponsored by the US National Cancer Institute. Currently, the SEER database consists of population-based cancer registries that cover 34.6% of the US population. The database collects data (e.g., patient demographics, primary tumor location, tumor morphology, diagnosis, and first course of treatment) and tracks the vital status of patients.
Permission to access the database was obtained with reference number 14181-Nov2018. Our study was approved by the review board of the Jinhua Hospital of Zhejiang University School of Medicine. Patients with well and moderately differentiated A-NETs diagnosed from 1988 to 2015 were identified using the SEER*Stat software. Patients diagnosed before 1988 were excluded because some variables were not collected in the database until 1988; patients diagnosed after 2015 were excluded to ensure an adequate follow-up period.
2. Inclusion and exclusion criteria
Patients who met the following criteria were included: age of 18–80 years; curative-intent surgery (surgery of primary site codes 30–80); tumor site ICD-O-3 code C18.1 (appendix); histology ICD-O-3 codes 8240 (carcinoid tumor, malignant), 8241 (enterochromaffin cell carcinoid), 8242 (enterochromaffin-like cell tumor, malignant), 8244 (composite carcinoid), 8245 (adenocarcinoid tumor), 8246 (neuroendocrine carcinoma), or 8249 (atypical carcinoid tumor). Goblet cell carcinoids were not included because they are now recognized to have only a minor neuroendocrine component, and they were reclassified as goblet cell adenocarcinoma in the 2019 WHO classification of tumors of the digestive system[16].
Patients who met any of the following criteria were excluded: incomplete documentation, such as missing tumor size or LN status; multiple primary tumors, to eliminate the survival impact of other tumors; or survival time of less than 1 month, because these patients are at risk of death from perioperative complications.
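The inclusion and exclusion criteria above can be read as a single eligibility filter. The following is a minimal sketch in Python, not the authors' extraction procedure (the cohort was identified with SEER*Stat). The `Patient` record and all of its field names are hypothetical and do not mirror SEER column names; the differentiation criterion is omitted for brevity.

```python
from dataclasses import dataclass
from typing import Optional

# Histology codes listed in the inclusion criteria.
INCLUDED_HISTOLOGY = {8240, 8241, 8242, 8244, 8245, 8246, 8249}


@dataclass
class Patient:
    # Hypothetical fields for illustration only.
    age: int
    year_dx: int
    site: str                       # ICD-O-3 topography code, e.g. "C18.1"
    histology: int                  # ICD-O-3 morphology code
    surgery_code: int               # surgery of primary site code
    tumor_size_mm: Optional[float]  # None when not documented
    ln_status_known: bool
    n_primaries: int
    survival_months: int


def is_eligible(p: Patient) -> bool:
    """Return True when a patient meets every inclusion criterion
    and none of the exclusion criteria."""
    included = (
        18 <= p.age <= 80
        and 1988 <= p.year_dx <= 2015
        and p.site == "C18.1"
        and p.histology in INCLUDED_HISTOLOGY
        and 30 <= p.surgery_code <= 80
    )
    excluded = (
        p.tumor_size_mm is None      # incomplete documentation
        or not p.ln_status_known     # incomplete documentation
        or p.n_primaries > 1         # multiple primary tumors
        or p.survival_months < 1     # possible perioperative death
    )
    return included and not excluded
```

Expressing the inclusion and exclusion rules as two separate boolean expressions keeps the code aligned with the two paragraphs of criteria, so each clause can be checked against the text.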
3. Data collection
Age and year at diagnosis, gender, race, histological type, differentiation, tumor size, depth of invasion, LN status, distant metastasis status, surgical procedure, cause of death, and survival months were retrieved from the SEER database. Race was classified as white, black, or other. Histological type was classified as pure or mixed: pure included malignant carcinoid tumor, enterochromaffin cell carcinoid, and neuroendocrine carcinoma; mixed included composite carcinoid, adenocarcinoid tumor, and atypical carcinoid tumor. Differentiation was classified as well or moderate. As in previous studies, depth of invasion was obtained by combining the collaborative stage and extent of disease data, resulting in three categories: invasion of the lamina propria (LP), invasion into or through the muscularis propria (MP/TMP), and invasion through the serosa and into adjacent structures (TS)[17]. LN status was categorized as no lymph nodes examined (NLNE), lymph nodes examined with negative lymph nodes (LNN), or lymph nodes examined with positive lymph nodes (LNP). According to the “RX Summ - Surg Prim Site” values in the database, surgical procedure was divided into two categories: right hemicolectomy or more extended procedure (RHCM) and less extended than right hemicolectomy (LRHC).
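As an illustration of the recoding described above, the sketch below derives the histological type and LN status categories in Python. It is a sketch under stated assumptions rather than the authors' code: the function names are hypothetical, the node counts are assumed to be already cleaned of registry special codes, and code 8242 is returned as "unclassified" because the text does not assign it to either group.

```python
# Mapping taken from the description of pure and mixed histological types.
PURE_CODES = {8240, 8241, 8246}    # malignant carcinoid, EC cell carcinoid, NEC
MIXED_CODES = {8244, 8245, 8249}   # composite, adenocarcinoid, atypical carcinoid


def histological_type(code: int) -> str:
    """Classify an ICD-O-3 morphology code as 'pure' or 'mixed'."""
    if code in PURE_CODES:
        return "pure"
    if code in MIXED_CODES:
        return "mixed"
    # Assumption: codes not assigned in the text are left unclassified.
    return "unclassified"


def ln_status(nodes_examined: int, nodes_positive: int) -> str:
    """Return NLNE, LNN, or LNP from cleaned lymph node counts."""
    if nodes_positive > nodes_examined:
        raise ValueError("positive nodes cannot exceed examined nodes")
    if nodes_examined == 0:
        return "NLNE"   # no lymph nodes examined
    return "LNP" if nodes_positive > 0 else "LNN"
```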
4. Data analysis
Demographic and clinical characteristics of the cohort were reported as medians with interquartile ranges (IQR) for continuous variables and as frequencies for categorical variables. Continuous variables were compared using Student's t test or the Mann-Whitney U test. Categorical variables were compared using the chi-squared test or Fisher's exact test.
The prognosis of patients with well and moderately differentiated A-NETs is favorable, so non-cancer-specific death (non-CSD) cannot be ignored when conducting a survival analysis[18]. Traditional Cox proportional hazards models consider only two outcome states (e.g., alive and dead), whereas competing risk models account for the presence of competing events (e.g., non-CSD). Thus, a competing risk model, rather than a Cox proportional hazards model, was applied in this study.
The time to cancer-specific death (CSD) was calculated from the date of diagnosis to the date of death from cancer; the time to non-CSD was calculated from the date of diagnosis to the date of death from other causes. CSD was regarded as the outcome event, and non-CSD was regarded as the competing event. Univariate and multivariate analyses were conducted. In the univariate analyses, the cumulative incidences of CSD were calculated, and differences were tested using Gray's test. In the multivariate analyses, subdistribution hazard ratios (SHRs) were calculated to estimate the association of variables with CSD[19]. Patients with tumor invasion of the LP were excluded from both the univariate and multivariate analyses because none of them died of cancer.
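To make the cumulative incidence of CSD concrete, the sketch below computes the standard nonparametric cumulative incidence function in the presence of a competing event, using only the Python standard library. It is an illustrative sketch, not the authors' analysis (which was run in R), and it does not implement Gray's test or the subdistribution hazard model. The event coding (0 = censored, 1 = CSD, 2 = non-CSD) and the function name are assumptions made for this example.

```python
from collections import Counter


def cumulative_incidence(times, events, cause=1):
    """Nonparametric cumulative incidence of `cause` with competing risks.

    times  : follow-up times (e.g. survival months)
    events : 0 = censored, 1 = CSD, 2 = non-CSD (assumed coding)
    Returns a list of (time, cumulative incidence) at each event time.
    """
    by_time = {}
    for t, e in zip(times, events):
        by_time.setdefault(t, Counter())[e] += 1

    n_at_risk = len(times)
    surv = 1.0   # all-cause survival just before the current time
    cif = 0.0
    curve = []
    for t in sorted(by_time):
        counts = by_time[t]
        d_cause = counts[cause]
        d_any = sum(c for e, c in counts.items() if e != 0)
        if d_any:
            # Probability of surviving to t, then failing from `cause` at t.
            cif += surv * d_cause / n_at_risk
            surv *= 1 - d_any / n_at_risk
            curve.append((t, cif))
        n_at_risk -= sum(counts.values())
    return curve
```

Unlike one minus the Kaplan-Meier estimate with non-CSD treated as censoring, this estimator does not assume that patients who died of other causes could still die of cancer later, which is why the cumulative incidences of CSD and non-CSD plus overall survival sum to one at every time point.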
To reduce bias from confounders and achieve balance between the RHCM and LRHC groups, propensity score matching was performed. Based on demographic and clinical characteristics (i.e., age, gender, race, histological type, differentiation, tumor size, depth of invasion, LN status, and distant metastasis status), patients were matched at a 1:1 ratio using the nearest neighbor method (caliper set to 0.1)[20]. Absolute standardized differences were calculated to evaluate balance before and after matching, and a Love plot was drawn to present them[21]. Differences of less than 10% were taken to support intergroup balance, and a difference of 0% indicates no residual imbalance.
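The matching and balance-checking steps can be sketched as follows. This is a minimal standard-library illustration under stated assumptions, not the authors' implementation: propensity scores are assumed to have been estimated already, matching is greedy and without replacement, and the caliper is applied to the raw score difference (some software instead defines the caliper in standard deviations of the score; the text does not say which was used). The function names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def abs_std_diff(treated, control):
    """Absolute standardized difference for a continuous covariate:
    |mean difference| divided by the pooled standard deviation."""
    pooled_sd = sqrt((variance(treated) + variance(control)) / 2)
    if pooled_sd == 0:
        return 0.0
    return abs(mean(treated) - mean(control)) / pooled_sd


def match_nearest(ps_treated, ps_control, caliper=0.1):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    Returns (treated_index, control_index) pairs whose propensity
    scores differ by no more than `caliper`.
    """
    available = set(range(len(ps_control)))
    pairs = []
    for i, p in enumerate(ps_treated):
        if not available:
            break
        # Closest remaining control; ties broken by lower index.
        j = min(available, key=lambda k: (abs(ps_control[k] - p), k))
        if abs(ps_control[j] - p) <= caliper:
            pairs.append((i, j))
            available.remove(j)
    return pairs
```

A covariate with `abs_std_diff` below 0.1 in the matched sample would correspond to the 10% balance threshold described above. Treated patients with no control inside the caliper are left unmatched and drop out of the matched cohort.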
In the matched cohort, a univariate analysis comparing the RHCM and LRHC groups was conducted to determine whether RHCM conferred a survival benefit over LRHC. In addition, subgroup analyses were conducted in these patients to explore whether RHCM improved outcomes in particular subgroups, and a forest plot was drawn to present the results.
Differences were considered statistically significant when the two-sided P-value was less than 0.05. R software (version 3.6) was used for data analysis, with the R packages survival, survminer, forestplot, and cobalt.