Research question
Does laparoscopic colectomy improve the 5-year overall survival rate compared to open colectomy in patients with stage I – III colon cancer?
Study setting, data source, and study population
A major step toward raising and improving the standard of care in oncological health services in Germany was the nationwide implementation of clinical cancer registries, enacted by the Cancer Early Detection and Registry Act (Krebsfrüherkennungs- und -registergesetz, KFRG) in April 2013 [29]. This emulated target trial study will use data from the Mecklenburg-Western Pomerania Cancer Registry, a registry providing ≥ 90 percent completeness of coverage [30]. Our study population consists of all adult patients with a confirmed diagnosis of primary stages I – III colon cancer between January 1, 2008 and December 31, 2018, who underwent major elective laparoscopic or open surgery (OPS codes 5-455, 5-456, 5-457, and 5-458) during this period [31]. Figure 1 below shows the present locations of the Registry's catchment areas, detailed by the postcode area.
Design of the target trial and the emulated trial
To answer our research question, this study will follow a two-step implementation process. First, we designed a hypothetical target trial protocol to compare the effect of the surgical treatment modalities on 5-year OS in patients with stages I – III colon cancer. Next, we will use real-world data from the Cancer Registry in Mecklenburg-Western Pomerania to explicitly and validly emulate the prespecified target trial. Table 1 summarizes the components of the specified hypothetical target trial protocol and the corresponding planned emulated trial using population-based cancer registry data. Hernán and Robins outlined seven key components that are essential for specifying target trial emulation studies [26]. It is, however, possible to add other components such as the research question and the aim of the study, as Martinuka and colleagues implemented in their COVID-19 study [33].
Table 1
Components of the hypothetical target trial and the emulated trial
Study component | Target trial | Emulated trial |
Research Question | Does laparoscopic colectomy improve the 5-year overall survival rate compared to open colectomy in patients with stage I – III colon cancer? | Same as target trial |
Aim | Compare the 5-year OS of patients with stage I – III colon cancer treated by laparoscopic colectomy or open colectomy. | Same as target trial |
Design | Phase III, multicenter, open-label, two-parallel-arms RCT | - |
Eligibility | Patients with solitary, primary stages I – III colon cancer aged ≥ 18 years at diagnosis, with a performance score (Eastern Cooperative Oncology Group [ECOG] ≤ 2 or Karnofsky score ≥ 50%) and a Charlson comorbidity index (CCI) ≤ 2. | Patients with solitary, primary stages I – III colon cancer aged ≥ 18 years at diagnosis, with a performance score of ECOG ≤ 2 or Karnofsky score ≥ 50%, and laparoscopic or open colectomy within 3 months (90 days) after diagnosis. The CCI is not documented by the Mecklenburg-Western Pomerania (MV) Cancer Registry. |
Exclusions | Body mass index (BMI) > 35 kg/m²; advanced local disease (T4); adenocarcinomas of the transverse colon and rectum, as well as those of the appendix, hepatic flexure, splenic flexure, overlapping sites, and unspecified sites of the colon; multiple primary colon tumors; and a history of cancer in the past 5 years. Patients with non-melanoma skin cancer (NMSC), in situ tumors (ICD-10ᵃ codes D00-D09), or benign tumors (ICD-10 codes D10-D36) will not be excluded on that basis. Also excluded are patients who underwent emergency colectomy, patients with gastrointestinal stromal tumor (GIST), neuroendocrine cancer, or sarcoma, and patients with previous abdominal surgery. Robotic colectomies (OPS = 5-987) are likewise excluded. | Advanced local disease (T4); colon cancer located in the appendix (C18.1), hepatic flexure (C18.3), transverse colon (C18.4), splenic flexure (C18.5), overlapping sites of the colon (C18.8), or unspecified sites (C18.9); and cancer of the rectosigmoid junction (C19) will be excluded. Patients with a history of cancer in the past 5 years (except NMSC, in situ tumors [ICD-10 codes D00-D09], and benign tumors [ICD-10 codes D10-D36]) will be excluded. In addition, patients with an additional GIST diagnosis, neuroendocrine cancer, or sarcoma, patients with previous abdominal surgery for malignant indications, and patients treated with emergency or robotic colectomy (OPS = 5-987) will be excluded. Because data on BMI, previous surgery for non-malignant indications, and CCI are not available in the MV Cancer Registry, these covariates will not be used as exclusion criteria. |
Treatment strategies | 1) Laparoscopic surgery within 3 months of diagnosis 2) Open surgery within 3 months of diagnosis | Same as target trial |
Grace-period for treatment implementation | - | The first three months after confirmed stages I – III colon cancer diagnosis. |
Treatment assignment | Randomization at baseline | Assumed conditionally random given the measured baseline confounders and/or via cloning of patients into both arms. |
Outcome | Death from all causes over a follow-up period of 60 months from baseline (start of follow-up) | Same as target trial (5-year OS) |
Type of outcome | Time-to-event data | Same as target trial |
Follow-up | Follow-up begins at diagnosis and random allocation of patients to either treatment arm, with scheduled follow-up visits according to the German S3 guideline for colorectal cancer [34]. | Patients will be followed from time-zero (date of diagnosis) until death from any cause, the last date of information, or the administrative censor date (31.12.2023), whichever comes first. |
Censoring | Loss to follow-up, administrative censor date (31.12.2023). | Same as target trial |
Causal contrast | Intention-to-treat effect, per-protocol | Observational analogue of the per-protocol effect |
Effect measure | Hazard ratio (HR), absolute mean survival time difference | Same as target trial |
Analysis plan | Multivariable Cox regression and restricted mean survival time (RMST) | Inverse probability weighted parametric survival (Royston-Parmar) model with variance obtained via M-estimation. In addition to the HR, the absolute RMST difference will be calculated. |
Covariate adjustments | • Socio-economic characteristics: age at colon cancer diagnosis (in years) and sex. • Clinical characteristics: UICCᵇ stage (I, II, or III), treating institution, tumor laterality (left or right), treatment period (2008–2009, 2010–2014, 2015–2018), performance score (ECOG = 0, 1, or 2), hospital classification (certified colon center or not), local residual tumor margin status (yes, no, or not assessable), grade (low, intermediate, or high), and number of harvested lymph nodes (< 21 or ≥ 21). | • Socio-economic characteristics: same as target trial. • Clinical characteristics: same as target trial. |
Sensitivity analysis | Excluding cases with missing values (complete case analysis, CCA) | Sensitivity analyses: - 1) repeat the same analysis on patients with complete case data - 2) including all advanced local disease (stage T4N0-2M0) cases - 3) quantitative bias analysis using the E-value method to evaluate the effect of potential residual confounders (BMI, comorbidity, previous abdominal surgery for non-malignant indications) |
Comparator analysis | - | Comparator analyses: - Comparator 1) DAG-guided naïve analysis using a fully adjusted Cox proportional hazards model - Comparator 2) An inverse-probability-weighted regression-adjustment (doubly robust) analysis |
ᵃ ICD-10 = International Classification of Diseases, 10th Revision; ᵇ UICC = Union for International Cancer Control |
The target trial design would include all adult patients (18 years or older) with solitary stages I – III colon cancer who are electively treated by laparoscopic or open surgery. Inclusion would further require a good performance status (Eastern Cooperative Oncology Group [ECOG] ≤ 2 or Karnofsky performance status score ≥ 50%) and a Charlson comorbidity index (CCI) score ≤ 2. The target trial would exclude patients with any of the following: age less than 18 years; a body mass index (BMI) greater than 35 kg/m²; prior abdominal surgery for non-malignant indications; treatment more than 3 months after diagnosis; cancer of the transverse colon or rectum (within 15 cm of the anal verge on rigid sigmoidoscopy); cancer of the hepatic flexure, splenic flexure, overlapping sites of the colon, or appendix, or unspecified adenocarcinomas of the colon, as identified by ICD-10 topographical subcodes; advanced local disease (T4) or adjacent organ invasion; multiple primary colon tumors; or a prior history of cancer other than non-melanoma skin cancer and benign or in situ tumors. Patients with missing ICD-10 topographical subcodes, patients with an additional diagnosis of gastrointestinal stromal tumor (GIST), neuroendocrine cancer, or sarcoma, and patients who underwent emergency or robotic colectomy (OPS = 5-987) will also be excluded. Overall, our inclusion and exclusion criteria are largely consistent with those of previous RCTs [2, 6, 10]. The emulated trial will use almost the same inclusion and exclusion criteria as the target trial, with the exception of BMI, previous surgery for non-malignant indications, and CCI, which are not recorded in the cancer registry (see Table 1).
The target trial would use either open or laparoscopic surgery as its treatment strategies. The treatments we aspire to evaluate are similar to those of the target trial, but only open and laparoscopic surgeries performed within three months after first colon cancer diagnosis are defined as our treatment strategies. From the perspective of clinical practice, patients are unlikely to undergo surgery on the same day that their colon cancer diagnosis is confirmed. In this context, it is important to determine the time period in which treatment strategies can be implemented to reflect local clinical practice and minimize patient heterogeneity [35]. Since the initiation of surgery after diagnosis requires time, mainly related to diagnostic workup, a time frame (grace-period) must be taken into account in which most colon cancer patients are likely to undergo surgery.
In the target trial, eligible participants would be randomly assigned to one of the two treatment modalities, stratified by tumor stage and hospital, and would not be blinded to the treatment they receive. In an open-label, multi-institutional, randomized, two-arm phase III trial, Kitano and colleagues randomly assigned patients with stage II – III colon cancer to either laparoscopic or open colectomy one day before surgery [9]. Similar to this RCT, our target trial would randomize patients one day before surgery, but with a three-month grace period to allow for successful implementation of the surgical treatment modalities. Incorporating a grace period into a pragmatic trial design prevents ill-defined interventions and improves the use of observational data by expanding the pool of patients who can be included in emulated studies while helping to capture real-world clinical practice [25, 26]. Accordingly, in our emulated study, patients will be classified into either the laparoscopic colectomy group or the open colectomy group according to the surgical procedure performed within the specified grace period, as documented in their cancer registry records. In contrast to the randomized allocation in the hypothetical RCT described above, patients in the emulated trial will not be randomly assigned to one of the treatment modalities. Nevertheless, under the assumption of conditional independence (no unmeasured confounding), treatment assignment will be treated as good as random after adjustment for baseline confounding variables within the defined grace period.
In the target trial, patients would be followed from the date of assignment to either treatment modality until death from any cause (the outcome of interest), the last date of follow-up, or the administrative censor date (31.12.2023), whichever occurs first. This administrative censor date was chosen to ensure that every patient can be observed for at least five years. Follow-up would be scheduled in accordance with the recommendations of the German S3 guideline for colorectal cancer [34]. In the emulated study, we will start following patients at time-zero, defined as the time point at which the eligibility criteria are met. Treatment must be initiated within the first three months after the diagnosis of colon cancer (the defined grace period). Follow-up will end on the date of death, the last date of follow-up, 60 months after time-zero, or the administrative censor date (31.12.2023), whichever occurs first. The main outcome will be 5-year OS, defined as the time from time-zero until death from any cause. The estimated hazard ratio will be complemented by the difference in restricted mean survival time (at 1, 3, and 5 years) between patients treated with laparoscopic and open surgery. Since our study covers a long period (2008 to 2018), treatment period-specific analyses will be performed to mitigate potential bias induced by changes in clinical knowledge and practice over time [36].
Statistical analysis
Patient characteristics will be summarized using frequencies and percentages for categorical variables, whereas means and standard deviations or medians and interquartile ranges will be used for continuous variables, depending on their distributions.
We will use the clone-censor-weight method to emulate the hypothetical target trial [26]. Since each patient is eligible for either of the two surgical treatment modalities during the three-month grace period, we will first clone (copy) each patient record and assign one clone to each treatment arm for the duration of the grace period, regardless of the surgical modality the patient actually receives later. This addresses confounding at baseline, because both treatment arms are identical in terms of baseline patient characteristics, conditional on proper statistical adjustment. Cloning also distributes the survival time accrued before surgery (including the entire grace period and any early deaths that occur before treatment) evenly between the two surgical treatment modalities, which allows us to circumvent immortal time bias [26]. In the next step, we will censor clones at the time point at which they deviate from their assigned treatment. That is, we will censor clones assigned to laparoscopic colectomy (including converted colectomy) whose records indicate that they underwent open colectomy within the grace period, and vice versa. The censoring times will correspond to the time of each censoring event and will be incorporated into the weighting model described below. Figure 2 shows all possible treatment paths and the respective censoring mechanisms of the trial emulation for both the treatment and outcome models.
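The cloning and artificial censoring steps can be illustrated with a minimal Python sketch. It is not the planned Stata implementation: the record layout (`Patient`, `Clone`), the arm labels, the 90-day reading of the grace period, and the toy patients are all hypothetical and chosen only to make the logic concrete. Weighting is deliberately left out.

```python
from dataclasses import dataclass
from typing import List, Optional

GRACE_DAYS = 90  # assumption: the three-month grace period is taken as 90 days
ARMS = ("laparoscopic", "open")  # hypothetical arm labels


@dataclass
class Patient:
    pid: int
    surgery_day: Optional[int]  # days from time-zero to surgery; None if none recorded
    modality: Optional[str]     # "laparoscopic" or "open"; None if none recorded
    end_day: int                # days from time-zero to death or end of follow-up
    died: bool                  # True if follow-up ended with death


@dataclass
class Clone:
    pid: int
    arm: str
    time: int                    # follow-up time contributed to this arm (days)
    event: bool                  # death counted in this arm
    artificially_censored: bool  # censored for deviating from the arm's strategy


def make_clone(p: Patient, arm: str) -> Clone:
    """Build one patient's clone for one arm and apply artificial censoring."""
    operated_in_grace = p.surgery_day is not None and p.surgery_day <= GRACE_DAYS
    if operated_in_grace:
        if p.modality == arm:
            # Adherent clone: keeps the full observed follow-up and outcome.
            return Clone(p.pid, arm, p.end_day, p.died, False)
        # Deviating clone: censored on the day the other modality was performed.
        return Clone(p.pid, arm, p.surgery_day, False, True)
    if p.end_day <= GRACE_DAYS:
        # Death or loss before any surgery is compatible with both strategies.
        return Clone(p.pid, arm, p.end_day, p.died, False)
    # No surgery by the end of the grace period: deviates from both strategies.
    return Clone(p.pid, arm, GRACE_DAYS, False, True)


def clone_and_censor(patients: List[Patient]) -> List[Clone]:
    """Create one clone per patient and arm (two clones per patient)."""
    return [make_clone(p, arm) for p in patients for arm in ARMS]


# Toy data, invented for illustration only.
patients = [
    Patient(1, 20, "laparoscopic", 1500, False),  # laparoscopic surgery on day 20
    Patient(2, 35, "open", 400, True),            # open surgery on day 35, died day 400
    Patient(3, None, None, 10, True),             # died on day 10 before any surgery
]
clones = clone_and_censor(patients)
```

In this toy example, patient 3 contributes a death to both arms, which is how early deaths before treatment are shared between the strategies. Patients 1 and 2 each contribute one adherent clone and one clone censored at the day of surgery.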
The informative censoring (and the associated selection bias) induced by artificial censoring will be adjusted for via inverse probability of censoring weighting, based on propensity scores estimated from the time-to-event data. For each clone, weights corresponding to the reciprocal of the probability of receiving the actual treatment will be calculated. We will use a logit model within an inverse probability weighted parametric (Royston-Parmar) survival model (IPWPSM), with a closed-form variance estimator obtained by M-estimation, to derive treatment weights conditional on all baseline covariates [37]. This approach is intended to ensure comparability between the two treatment groups before the treatment effect is estimated. M-estimation will be used to obtain an unbiased variance estimate for the outcome model [38]. This method correctly accounts for the estimation of the weights and is a recommended alternative to bootstrapping for reasons of reproducibility and computational cost [38]. The weights assign more influence to patients who were less likely, given their individual constellation of confounding factors, to receive the treatment they actually received. Under the assumption of no unmeasured confounding, this will allow us to estimate the causal effect of laparoscopic versus open surgical treatment without bias. We will then use absolute standardized differences ≤ 10% as an indicator of adequate balance of baseline covariates (confounders) between the two treatment arms, in both the original (unweighted) and weighted data, at the end of the three-month grace period. If there is evidence of imbalance, the weights will be truncated at the 1st and 99th percentiles to reduce the effect of extreme weights.
This will be achieved by replacing all weight values that are below the 1st percentile with the 1st percentile value and by replacing all weight values above the 99th percentile with the 99th percentile value.
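The percentile truncation and the balance check can be sketched as follows. This is a hedged illustration in Python, not the Stata code of the planned analysis. The function names, the linear-interpolation percentile rule, and the toy weights are assumptions, and statistical packages may define percentiles slightly differently.

```python
import math
from typing import List, Sequence


def percentile(values: Sequence[float], q: float) -> float:
    """Percentile with linear interpolation between order statistics (q in [0, 100])."""
    xs = sorted(values)
    if not xs:
        raise ValueError("values must not be empty")
    pos = (len(xs) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def truncate_weights(weights: Sequence[float],
                     lower_q: float = 1.0, upper_q: float = 99.0) -> List[float]:
    """Replace weights below/above the chosen percentiles by the percentile values."""
    lo = percentile(weights, lower_q)
    hi = percentile(weights, upper_q)
    return [min(max(w, lo), hi) for w in weights]


def weighted_mean(x: Sequence[float], w: Sequence[float]) -> float:
    return sum(wi * xi for xi, wi in zip(x, w)) / sum(w)


def weighted_var(x: Sequence[float], w: Sequence[float]) -> float:
    m = weighted_mean(x, w)
    return sum(wi * (xi - m) ** 2 for xi, wi in zip(x, w)) / sum(w)


def abs_std_diff(x1: Sequence[float], w1: Sequence[float],
                 x0: Sequence[float], w0: Sequence[float]) -> float:
    """Absolute standardized difference of one covariate between two (weighted) arms."""
    m1, m0 = weighted_mean(x1, w1), weighted_mean(x0, w0)
    pooled_sd = math.sqrt((weighted_var(x1, w1) + weighted_var(x0, w0)) / 2.0)
    if pooled_sd == 0.0:
        return 0.0 if m1 == m0 else math.inf
    return abs(m1 - m0) / pooled_sd


# Toy weights, invented for illustration: 98 ordinary weights and two extremes.
raw_weights = [1.0] * 98 + [0.01, 50.0]
truncated = truncate_weights(raw_weights)

# Toy binary covariate (e.g., right-sided tumor yes/no) in two unweighted arms.
smd = abs_std_diff([1, 1, 0, 0], [1.0] * 4, [1, 0, 0, 0], [1.0] * 4)
balanced = smd <= 0.10  # the 10% threshold named in the text
```

Passing unit weights gives the unweighted standardized difference, so the same function covers both the original and the weighted data.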
Intention-to-treat and per-protocol effects are the causal contrasts of the target trial design. The intention-to-treat effect refers to the comparative effect of being assigned to laparoscopic versus open colectomy, irrespective of non-compliance after time-zero, such as conversion from laparoscopic to open colectomy. The per-protocol effect, on the other hand, refers to the comparative effect of exclusive receipt of laparoscopic versus open colectomy. The observational analogue of the intention-to-treat effect will not be estimable in the emulated study, because our emulation implements a grace period with the clone-censor-weight method. This means that every eligible patient is assigned to both treatment arms during this period, and each clone is censored once the treatment received deviates from the modality of its arm. Therefore, our causal contrast of interest in the emulated study will only be the observational analogue of the per-protocol effect of laparoscopic versus open colectomy performed in the first three months after colon cancer diagnosis.
We will fit the IPWPSM with surgical treatment modality as the only predictor variable to model the outcome (all-cause mortality) and estimate the effect of the surgical treatment modality as a hazard ratio. When the event of interest (death from any cause) occurs, it is counted as an event only in the treatment arm that is consistent with the treatment the patient actually received, that is, in the clone that is still uncensored at the time of the event. The corresponding clone in the other arm will be censored at the time the event occurs, which is implemented via censoring indicators. Using post-IPWPSM estimation tools, OS will be estimated at 1, 3, and 5 years from baseline using the restricted mean survival time (RMST), and the effect of laparoscopic versus open colectomy will be calculated as the absolute difference. Point estimates will be accompanied by 95% confidence intervals. A weighted Kaplan-Meier method will be used to compare time-to-event curves over a 60-month period according to surgical treatment modality. We will compare our results with estimates from the literature. We will use the CERBOT (Comparative Effectiveness Research Based on Observational Data to Emulate a Target Trial) tool (see Additional file 1) in our emulation process [39].
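To make the RMST contrast concrete, the following Python sketch computes a (weighted) Kaplan-Meier curve and the area under it up to a horizon tau. It is a nonparametric illustration only: the planned analysis derives the RMST from the Royston-Parmar model in Stata, and the function names and toy follow-up data here are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def kaplan_meier(times: Sequence[float], events: Sequence[int],
                 weights: Optional[Sequence[float]] = None) -> List[Tuple[float, float]]:
    """Return (time, survival) pairs at each distinct event time; weights are optional."""
    n = len(times)
    if weights is None:
        weights = [1.0] * n
    data = sorted(zip(times, events, weights))
    at_risk = float(sum(weights))
    surv = 1.0
    curve = []
    i = 0
    while i < n:
        t = data[i][0]
        deaths = 0.0
        leaving = 0.0
        # Collect everyone whose follow-up ends at time t (events and censorings).
        while i < n and data[i][0] == t:
            _, event, w = data[i]
            leaving += w
            if event:
                deaths += w
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= leaving
    return curve


def rmst(curve: Sequence[Tuple[float, float]], tau: float) -> float:
    """Area under the survival step function from 0 to tau."""
    area = 0.0
    prev_t, prev_s = 0.0, 1.0
    for t, s in curve:
        if t >= tau:
            break
        area += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    return area + prev_s * (tau - prev_t)


# Toy follow-up times in months (1 = death, 0 = censored), invented for illustration.
lap_curve = kaplan_meier([12, 24, 36, 60, 60], [1, 0, 1, 0, 0])
open_curve = kaplan_meier([6, 12, 30, 60], [1, 1, 1, 0])

TAU = 60  # 5-year horizon in months
rmst_difference = rmst(lap_curve, TAU) - rmst(open_curve, TAU)
```

Calling `rmst` with tau set to 12 or 36 gives the 1-year and 3-year contrasts in the same way. Confidence intervals are not covered by this sketch.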
It is crucial to handle missing data properly to obtain more powerful models resulting in better estimates [40]. We will use multiple imputation by chained equations to impute missing covariate values, assuming that the data are missing at random [41]. In the literature, this method is also referred to as “multivariate imputation with sequential regression” and “imputation using fully conditional specifications” [42, 43].
In the web-based DAGitty environment (https://www.dagitty.net/), we will use directed acyclic graphs (DAGs) to determine the minimal sufficient adjustment sets of confounders to control for in the comparative multivariable model. This approach mitigates the risk of bias that may arise from conditioning on colliders or mediators (intermediate variables) [44, 45]. The causal diagram for the comparative multivariable Cox model is informed by a literature review and discussion to identify confounders, colliders, mediators, and prognostic factors [46, 47]. By adjusting for a minimal sufficient adjustment set in our comparative model, we aim to block all backdoor paths (remove relevant confounding) when estimating the total causal effect of surgical treatment modality on all-cause mortality.
We will consider the following baseline characteristics (measured before or at time-zero) as potential confounders: Age at colon cancer diagnosis, sex, year of surgical treatment (classified according to the implementation and update series of the German S3 guideline for colorectal cancer [34] as 2008–2009 [pre-implementation], 2010–2014 [first update], or 2015–2018 [second update]), tumor laterality (right-sided [caecum and ascending colon] tumor or left-sided [descending and sigmoid colon] tumor), hospital type (classified as either a registered colorectal cancer center or other group), performance status (ECOG ≤ 2 or Karnofsky performance status score ≥ 50%), histologic grade (low, intermediate or high), UICC stage (I, II or III), minimum number of harvested lymph nodes (classified as < 21 or ≥ 21) [48], and local residual tumor status within 3 months of diagnosis (no (R0), yes (R1 + R2), not assessable or missing). Additionally, we will extract the following dates accurate to the month (date of birth, diagnosis, surgery, death, neo/adjuvant therapy initiation, and conversion to open surgery) and other covariates (project specific pseudonym, tumor IDs, cause of death).
Sensitivity and comparator analyses
The cancer registry does not collect data on BMI, CCI, or history of previous surgery for non-malignant indications. Because we cannot account for these factors in the emulation, the resulting residual confounding is a major limitation of this study. Thus, we will perform sensitivity analyses
1) of residual confounding using the E-value method [49] to provide a quantitative bias analysis for potential residual confounders (BMI, comorbidity, and previous abdominal surgery for non-malignant indications), and
2) after excluding patients with missing data (complete-case analysis), to evaluate the potential selection bias that would have arisen had such cases been excluded from the main analysis. A further sensitivity analysis will include all patients with advanced local disease (stage T4N0-2M0).
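For orientation, the E-value calculation named in item 1) can be sketched in Python. The E-value formula for a risk ratio is standard. The conversion of a hazard ratio to an approximate risk ratio is the approximation commonly used when the outcome is not rare, and whether the rare-outcome or the common-outcome variant applies to this cohort is an assumption to be checked. The function names and the example hazard ratio are hypothetical.

```python
import math


def hr_to_rr(hr: float, rare_outcome: bool = False) -> float:
    """Approximate a risk ratio from a hazard ratio.

    With a rare outcome the HR is used as the RR directly; otherwise the
    square-root approximation for common outcomes is applied (an assumption).
    """
    if hr <= 0:
        raise ValueError("hazard ratio must be positive")
    if rare_outcome:
        return hr
    return (1 - 0.5 ** math.sqrt(hr)) / (1 - 0.5 ** math.sqrt(1 / hr))


def e_value(rr: float) -> float:
    """E-value for a risk ratio: RR + sqrt(RR * (RR - 1)), after inverting RR < 1."""
    if rr <= 0:
        raise ValueError("risk ratio must be positive")
    if rr < 1:
        rr = 1 / rr
    return rr + math.sqrt(rr * (rr - 1))


# Hypothetical hazard ratio, not a study result.
example_hr = 0.8
example_e_value = e_value(hr_to_rr(example_hr))
```

The same function can be applied to the confidence limit closest to the null to obtain the E-value for the interval rather than the point estimate.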
In addition, a comparative analysis of the same but unemulated study population, with treatment starting at any point in time, will be performed via classical Cox regression to assess the influence of our emulation process. According to the DAGs, adjustment for age, grade, performance status, and tumor laterality is sufficient for the treatment effect in this comparative model to be unbiased, provided the assumed causal structure is correct. In this way, we will examine whether the conclusion of the comparative analysis actually differs from that of the emulated study (see Additional file 2). Another comparative analysis of the same emulated population will be performed via the inverse probability weighted regression adjustment (doubly robust) method. This approach combines the outcome modeling strategy of regression adjustment with the treatment modeling strategy of inverse probability weighting [50]. It is referred to as “doubly robust” because, although it requires fitting two models, only one of them (the treatment model or the outcome model) needs to be correctly specified to obtain an unbiased estimate of the treatment effect [50].
All analyses will be performed in Stata (version 18.0; StataCorp LLC, College Station, TX), and associations will be considered statistically significant at a p-value of less than 0.05.