2.1 Data source and patient selection
This retrospective cohort study used the SEER database (1975-2016), which covers 28% of the United States population and publishes data on cancer incidence, treatment, and survival from population‐based cancer registries [21]. Within SEER, we restricted the analysis to patients with PCNSL in the SEER‐18 registries. SEER‐18 comprises Atlanta, Detroit, Greater California, Greater Georgia, Hawaii, Iowa, Kentucky, Los Angeles, New Mexico, New Jersey, Rural Georgia, Connecticut, San Francisco‐Oakland, Seattle‐Puget Sound, San Jose‐Monterey, the Alaska Native Tumor Registry, Louisiana, and Utah.
The SEER database classifies cancer histology and topography using the third edition of the International Classification of Diseases for Oncology (ICD-O-3). PCNSL was defined as a lymphoma diagnosed in the brain, spinal cord, leptomeninges, or other parts of the CNS (ICD-O-3 site codes C70.0-C72.9). PCNSL patients with diffuse large B-cell histology were identified in SEER by filtering on the following histology codes: 9680, diffuse large B-cell lymphoma (DLBCL), NOS; 9684, malignant lymphoma, large B, diffuse, immunoblastic; and 9688, T-cell histiocyte-rich large B-cell lymphoma.
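As an illustration only, the site and histology criteria above can be written as a simple predicate. This is a minimal sketch, not the query actually run in SEER*Stat; the function names and the assumption that site codes arrive as strings such as "C71.9" are hypothetical.

```python
# ICD-O-3 histology codes listed in the text (9680, 9684, 9688).
DLBCL_HISTOLOGY = {9680, 9684, 9688}


def is_cns_site(site_code: str) -> bool:
    """Return True if an ICD-O-3 site code lies in the range C70.0-C72.9.

    Assumes codes look like 'C71.9' or 'C719' (hypothetical input format).
    """
    code = site_code.strip().upper().replace(".", "")
    if len(code) != 4 or code[0] != "C" or not code[1:].isdecimal():
        return False
    return 700 <= int(code[1:]) <= 729


def is_pcnsl_dlbcl(site_code: str, histology: int) -> bool:
    """Combine the anatomic-site and histology criteria."""
    return is_cns_site(site_code) and histology in DLBCL_HISTOLOGY


if __name__ == "__main__":
    print(is_pcnsl_dlbcl("C71.9", 9680))  # brain, DLBCL NOS
    print(is_pcnsl_dlbcl("C77.0", 9680))  # lymph node site, outside the CNS range
```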
For this study, we included patients aged ⩾18 years with primary diffuse large B-cell lymphoma of the central nervous system diagnosed between 1995 and 2016. A total of 5714 patients were extracted from the SEER database. As in previous studies, patients with “other infectious and parasitic diseases including HIV” recorded as cause of death and follow-up were excluded to define a non-HIV PCNSL patient population (n = 700) [22,23]. Patients with more than one primary cancer were also excluded (n = 1026), as were patients without a pathological diagnosis and patients diagnosed at autopsy (n = 228).
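The exclusion steps can be tallied with a few lines of arithmetic. The sketch below assumes that the three exclusions were applied sequentially and that each reported count refers to patients still in the cohort at that step; the text does not state this explicitly, so the remaining count is a derived figure under that assumption rather than a reported result.

```python
# Counts taken from the text; the sequential, non-overlapping ordering is an assumption.
INITIAL_COHORT = 5714
EXCLUSIONS = [
    ("HIV-related cause of death", 700),
    ("more than one primary cancer", 1026),
    ("no pathological diagnosis or diagnosed at autopsy", 228),
]


def apply_exclusions(initial, exclusions):
    """Return a list of (reason, excluded, remaining) tuples, one per step."""
    remaining = initial
    flow = []
    for reason, excluded in exclusions:
        remaining -= excluded
        flow.append((reason, excluded, remaining))
    return flow


if __name__ == "__main__":
    for reason, excluded, remaining in apply_exclusions(INITIAL_COHORT, EXCLUSIONS):
        print(f"excluded {excluded:>4} ({reason}); remaining {remaining}")
```

Under the stated assumption, 5714 - 700 - 1026 - 228 leaves 3760 patients for analysis.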
All methods were performed in accordance with the relevant guidelines and regulations. The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.
2.2 Study variables
Primary treatment was divided into five classes: no anti-tumor therapy, surgery alone, radiotherapy alone (radiotherapy ± surgery), chemotherapy alone (chemotherapy ± surgery), and combined chemoradiotherapy (chemoradiotherapy ± surgery). To reflect progress in treatment, the year of diagnosis was divided into three time periods. Time period-1 is from 1995 to 2002. Time period-2 is from 2003 to 2012, a period expected to reflect more intense chemotherapy regimens, the availability of rituximab, and the use of autologous stem-cell transplantation as a consolidation strategy. Time period-3 is from 2013 to 2016, a period expected to reflect the emergence of novel agents. Covariates including age at diagnosis, sex, race, marital status, and site within the CNS were introduced to adjust the hazard ratio (HR). Data on survival months, survival status, and cause of death were also collected.
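The two groupings described above can be expressed as small helper functions. This is a hedged sketch: the function names and the boolean inputs are hypothetical, and the mapping from the actual SEER treatment fields to these flags is not specified in the text.

```python
def classify_treatment(surgery: bool, radiotherapy: bool, chemotherapy: bool) -> str:
    """Map treatment flags to the five classes; surgery is optional in the last three."""
    if radiotherapy and chemotherapy:
        return "combined chemoradiotherapy"
    if chemotherapy:
        return "chemotherapy alone"
    if radiotherapy:
        return "radiotherapy alone"
    if surgery:
        return "surgery alone"
    return "no anti-tumor therapy"


def diagnosis_period(year: int) -> int:
    """Return 1, 2, or 3 for the diagnosis periods 1995-2002, 2003-2012, 2013-2016."""
    if 1995 <= year <= 2002:
        return 1
    if 2003 <= year <= 2012:
        return 2
    if 2013 <= year <= 2016:
        return 3
    raise ValueError(f"year {year} is outside the study window 1995-2016")


if __name__ == "__main__":
    print(classify_treatment(surgery=True, radiotherapy=False, chemotherapy=True))
    print(diagnosis_period(2010))
```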
2.3 Statistical analysis
Statistical analysis was performed using SEER*Stat 8.3.8 and SPSS v22.0 for Windows (SPSS Inc., Chicago, IL). Chi-square tests were used to compare categorical variables. Kaplan-Meier survival curves were plotted for cause-specific survival (CSS), which was defined with PCNSL as the cause of death and measured in months from the time of PCNSL diagnosis to the time of death. Survival differences were compared using the log-rank test. Multivariate analysis with a Cox regression model was performed to identify independent risk factors for long-term survival. Two-sided P values less than 0.05 were considered significant.
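For readers unfamiliar with the Kaplan-Meier estimator, the following is a minimal sketch of how a CSS curve is computed. It is not the SPSS procedure used in the study. The toy follow-up data are invented for illustration, and treating deaths from causes other than PCNSL as censored observations is the usual convention for CSS, assumed here rather than stated in the text.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct event time.

    times  : follow-up in months for each patient
    events : 1 = death from PCNSL, 0 = censored (alive or death from another cause)
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)  # all patients leave the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the conditional survival at time t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


if __name__ == "__main__":
    # Hypothetical toy data, not study data.
    toy_times = [2, 5, 5, 8, 12, 12]
    toy_events = [1, 1, 0, 1, 0, 0]
    for month, s in kaplan_meier(toy_times, toy_events):
        print(f"month {month:>2}: S(t) = {s:.3f}")
```

Censored patients contribute to the risk set up to their last follow-up but never trigger a drop in the curve, which is why the estimate changes only at months with a PCNSL death.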
2.4 Ethics statement
Ethical approval was waived by the Ethics Committee of Sun Yat-Sen University Cancer Center because preexisting data with no personal identifiers were used.