Model-based simulations
We compared the performance of different analyses in estimating the ratio of discontinuation hazards between treatments with varying dosing intervals - as observed for biologic and targeted-synthetic DMARDs (b/tsDMARDs) used in the treatment of axSpA and PsA - by means of model-based simulations. Hereby the analyses differed with respect to the event considered for discontinuation, “last dose received”, “decision to discontinue”, or “first dose missed”. Several properties of the simulation were informed by the real-world data (introduced in a later section).
Simulation set-up
We assumed either an exponential or a Weibull distribution with decreasing hazard for the time to the date it was decided to discontinue treatment. For the reference drug A, the values of the distributional parameters were guided by a probability of 80% to stay on treatment beyond twelve months and a median retention of eight years (for the Weibull distribution only) as roughly observed in the data used for the application example. For details on the derivations see Additional file 1. The distribution for the test drug B was assumed to be either the same as for the reference drug A or different to such an extent that it led to a proportional hazard ratio (HR) of B versus A equal to two. For each randomly drawn time to the discontinuation decision, the time to the last dose received and the time to the first dose missed were calculated based on the dosing interval chosen for the drug, as shown in the schematic in Fig. 1. For detailed calculations see Additional file 1. Data generation aligned with the causal structure shown in Figure S1, Additional file 1.
The follow-up period per patient from start of the treatment was set to three years leading to right-censoring of times as explained in the following section. The choice was guided by the average follow-up time of patients by their registries of 2.8 years as observed in the data used for the application.
For each of the 104 scenarios defined by the combination of the two underlying distributions (exponential and Weibull), sample sizes per drug (500 and 1000), true HRs (one and two), and the dosing intervals in days for each drug, as described in detail in Table 1, we ran 2000 simulation repetitions.
Table 1
List of simulation scenarios.
Distribution | Exponential or Weibull |
Sample size per drug | 500 or 1000 |
True HR B vs A | 1 | 2 |
Drug A interval (days) | 1 | 7 | 14 | 56 | 1 | 7 | 14 | 56 |
Drug B interval (days) | 1 | 1, 7 | 1, 7, 14 | 1, 7, 14, 56 | 1, 7, 14, 56 | 1, 7, 14, 56 | 1, 7, 14, 56 | 1, 7, 14, 56 |
All 104 vertical combinations of distribution, sample size per drug, true HR, drug A dosing interval, and drug B dosing interval were assessed. |
Estimand and analysis methods
The estimand of interest was the (natural log-transformed) ratio of the hazards for deciding to discontinue between the test drug B and the reference drug A.
For each simulation repetition, we performed the following analyses (denominated simulation methods, SM for short) and estimated the treatment HR from a Cox proportional hazards regression:
-
SM1: Time to discontinuation decision as an exactly observed event subject to right-censoring at the end of follow-up in case the decision to discontinue occurred thereafter.
-
SM2: Time to discontinuation decision as an interval-censored event, with the interval bounded by the time to the last dose received on the left and the time to the minimum of the date of the first dose missed and the end of follow-up on the right, subject to right-censoring at the end of follow-up in case the decision to discontinue occurred thereafter.
-
SM3: Time to last dose received as an exactly observed event subject to right-censoring at the end of follow-up in case the decision to discontinue occurred thereafter.
-
SM4: Time to first dose missed as an exactly observed event subject to right-censoring at the end of follow-up in case the decision to discontinue occurred after the last dose received prior to end of follow-up.
-
SM5: A mixture of time to discontinuation decision, time to last dose received, and time to first dose missed, all considered as exactly observed, as collected by ten notional registries. For this purpose, 50% of the observations per drug were attributed randomly in equal proportions to five notional registries collecting the date of the last dose received as the treatment stop, 30% to three notional registries collecting the date of the decision to discontinue, and 20% to two registries collecting the date of the first dose missed. This partitioning was guided by the application example.
All analyses included drug as covariate. In addition, the last analysis, SM5, was stratified by registry. The common reference was the first analysis, SM1, as it provides a consistent estimator of the estimand of interest in this case (20).
Performance assessment
For each analysis listed above we assessed its performance in estimating the true HR by means of bias, variance (empirical and model-based), coverage, and type I error / power of their treatment HR estimate as well as some therefrom derived measures such as the percent relative error of the model-based versus the empirical variance and, for comparisons of analyses, the percent precision gain of analyses compared to the reference analysis. For an overview and further explanation of the performance measures used please refer to the Additional file 1 as well as to reference (21) for further details. The initial choice of 2000 simulation repetitions seemed appropriate as assessed during the simulation work based on the Monte Carlo standard errors for the estimates of the performance measures (21). To accommodate the estimation uncertainty, we report the performance measures as estimates and two-sided 95% Wald-type confidence intervals (CIs).
Application: Retention of TNFi in axSpA and PsA
We used real-world data from patients diagnosed with either axSpA or PsA to assess the differences arising from analyses of tumour necrosis factor inhibitor (TNFi, a class of bDMARDs) retention using “last dose received”, “decision to discontinue”, or “first dose missed” to mark treatment discontinuation.
Study design and data provenance
A retrospective study based on data from European patient registries consolidated by the EuroSpA Research Collaboration Network (RCN), a registry-based initiative to collaboratively investigate observational data from axSpA and PsA patients throughout Europe (22). For the present study, we used data collected by nine registries (BIOBADASAR (Spain, ES), Biorx.si (Slovenia, SI), BSRBR-AS (United Kingdom, UK), ERS Biologic Therapy Register (Estonia, EE), ICEBIO (Iceland, IS), NOR-DMARD (Norway, NO), Reuma.pt (Portugal, PT), Romanian Registry of Rheumatic Diseases (Romania, RO), and SCQM (Switzerland, CH)) who confirmed to collect the date of the first dose as the treatment start date and either the date of the last dose received, the date it was decided to discontinue the treatment, or the date of the first dose missed as the treatment stop date. Of interest were b/tsDMARD-naive axSpA and PsA patients diagnosed at the age of 18 years or older who initiated a TNFi therapy between January 1st 2015 and June 30th 2021. Patients were followed up until the earlier of discontinuation of the TNFi treatment or last clinical visit recorded prior to the time of data extraction from the registry databases. The last registry’s data extraction occurred on April 1st 2022.
Outcomes and exposure of interest
The outcomes of interest were the time from treatment start to the date of the last dose received, the date of the first dose missed, or the date of the discontinuation decision.
For registries collecting the date of the last dose received or first dose missed the times to all three events were derived with the help of the dosing interval. The date of the decision to discontinue was interval-censored between the date of the last dose received and the minimum of the date of the first dose missed or the end of the patient’s follow-up. Any event, had it not or was it not known to have occurred until the end of the patient’s follow-up, was right-censored at that time.
For registries collecting the date of the discontinuation decision the time to that event was the only one available. It was observed exactly and right-censored at the end of the patient’s follow-up had it not occurred before.
The different TNFi treatments were the exposure of interest. The standard dosing intervals (after possible initial uptitration phases) were once every two weeks for adalimumab and certolizumab pegol, once every week for etanercept, once every four weeks for golimumab, and once every eight weeks for infliximab.
Exclusion from analysis
Patients who discontinued or possibly discontinued their TNFi treatment but had incomplete, missing, or conflicting information on the treatment stop date were excluded from the analyses.
For patients missing information on the dosing interval, the dosing interval was imputed by the most frequent interval among the enrolled patients from the same country, diagnosis, type of TNFi, and dose. Patients from registries collecting the date of the last dose received or the first dose missed for whom the dosing interval could not be imputed by this procedure were excluded from the analyses to keep the sample of patients the same for all compared analyses. Lastly, patients missing information on at least one of the patient characteristics assessed at start of treatment used as covariates in the analyses were also excluded.
We assumed conditional independence between the time to treatment discontinuation and missing information on dosing interval or covariates given the covariates in the model. Conditional independence ensures that an analysis restricted to the patients not excluded for missing information is unbiased. However, since patients who discontinued or possibly discontinued their treatment but had incomplete, missing, or conflicting information on the treatment stop date were excluded as well, the estimated HRs are likely biased and should not be interpreted apart from the intended comparison between analyses.
Statistical methods
Apart from adequate descriptions of the data, summaries by diagnosis and type of TNFi of patient characteristics at the start of treatment of all patients enrolled were derived.
Analogously to the simulation, we applied Cox proportional hazards regression models to estimate the ratios of discontinuation hazards for different treatments. The ratios for infliximab versus etanercept (largest difference in standard dosing intervals, 56 versus 7 days, respectively) and certolizumab pegol versus adalimumab (same standard dosing interval of 14 days) were of main interest. In addition, we compared the hazards of males and females for reference. In accordance with the simulation, we performed the following analyses (denominated application methods, AM for short, with numerals corresponding to the respective simulation methods) and compared their results with respect to the hazard ratios of interest.
Based on the registries collecting either the date of the last dose received or the date of the first dose missed as treatment stop date we analysed:
-
AM2: Time to discontinuation decision as an interval-censored event subject to right-censoring at the end of patients’ follow-up.
-
AM3: Time to last dose received as an exactly observed event subject to right-censoring at the end of patients’ follow-up.
-
AM4: Time to first dose missed as an exactly observed event subject to right-censoring at the end of patients’ follow-up.
-
AM5: A mixture of last dose received and first dose missed, all exactly observed. Registries provided information on either the time to last dose received or the time to first dose missed depending on what they collect as treatment stop date.
Based on all registries we analysed:
-
AM’2: Time to discontinuation decision as an interval-censored or exactly observed event (the latter applying to registries collecting the date of the discontinuation decision as treatment stop date) subject to right-censoring at the end of patients’ follow-up.
-
AM’5: A mixture of events, all considered as exactly observed. Registries provided information on either the time to the discontinuation decision, the time to last dose received, or the time to first dose missed depending on what they collect as treatment stop date.
No analyses were performed based on data from only the registries collecting the discontinuation decision. This explains the absence of an AM1 analysis.
The first analysis listed (AM2) was the reference analysis for the second (AM3), third (AM4), and fourth (AM5). The fifth analysis (AM’2) was the reference analysis for the last one (AM’5).
We performed the analyses separately for the two diagnoses axSpA and PsA and adjusted them for some variables considered to possibly affect treatment choice and treatment retention, such as country (by stratification), calendar year, concomitant csDMARD therapy, sex, and age. However, we assume that the intended comparison of analyses is not affected by the conditioning of the association of treatment and treatment retention.
Linear transformation models and software
For a short description of linear transformation models please refer to Additional file 1. For an introduction to transformation models please refer to references (23) and (24).
All analyses were performed in R (25), version 4.2.2, and the manuscript was written with Rmarkdown (26), version 2.20, within the RStudio IDE (27), server version 2022.02.2 + 485. Times to exactly observed events were analysed with Cox proportional hazards models using the function coxph from the R package survival (28), version 3.5.0, with the Efron approximation for tied event times. Times to interval-censored or mixed censored events were analysed with linear transformation models using the function Coxph from the R package tram (29), version 0.8.3, with the log_first argument set to true. While times to exactly observed events could be analysed with tram’s Coxph function, survival’s coxph was considerably faster and therefore chosen. Figures were mainly produced using the ggplot2 package (30), version 3.4.0.