In a first step, innovations reimbursed in German inpatient care between 2005 and 2017 were identified using a database published annually by the German Institute of the Hospital Remuneration System (InEK), and those with a relevant number of cases were pre-selected for consideration. Data on utilization between 2005 and 2017 were drawn from the German DRG statistics [21] and used to plot diffusion curves and to determine the number of cases treated with appropriate alternatives. The diffusion curves were grouped into seven progression types, and a minimum number of technologies per type was selected for investigation, also taking the available evidence into account. The maximum sample for investigation was set at 30 technologies to balance representativeness with feasibility. The diffusion curves were subsequently juxtaposed with systematically identified randomized controlled trials (RCTs) over the observation period. Figure 1 visualizes the methodological approach in a flowchart. Each step is described in detail below. Since steps 1 to 5 serve as preparation for the analysis in step 6, their outcomes are described here and not separately in the results section.
(1) Determining the initial technology pool
To select new technologies for the study, the following steps were taken:
i) The lists of new medical diagnostic and treatment methods published annually by the InEK between 2005 and 2012 were scrutinized. These lists include all new medical diagnostic and treatment methods, which are significantly based on the application of a medical device for which hospitals have requested permission to negotiate innovation payments with health insurers (as they are not adequately reimbursed by current DRGs). The time window was chosen to ensure that at least five years of data would be available before the start of this research. Consequently, the observation period for this study spans the years 2005 to 2017.
ii) The “DRG statistics” dataset, which includes hospital claims data reported annually to the German Federal Statistical Office, was used to determine the number of hospitals using the technology and the total number of cases. DRG statistics capture anonymized information for all inpatient treatments.
iii) The following criteria were applied for selection based on the information from i) and ii):
- Permission to negotiate an innovation payment for the technology was requested by more than ten hospitals and granted for at least one year between 2005 and 2012,
- Case numbers and numbers of hospitals using the technology were available for at least four years,
- 100 cases or more were billed for at least one year.
Overall, 59 technologies were included following these selection criteria, and are listed in Table 1 (Table 1 also shows the final selection of 27 technologies based on the process described in step 4, below). They can be classified into ten groups according to the anatomical region of application. More than two-thirds of the technologies (41 of 59) concern the cardiovascular system.
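The three criteria in iii) amount to a simple filter over per-technology records. The following Python sketch illustrates one possible reading; the `TechnologyRecord` layout and all field names are hypothetical and not taken from the InEK lists or the DRG statistics. It also assumes that the request count and the granted permission are evaluated within the same year, which the text does not state explicitly.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class TechnologyRecord:
    """Hypothetical per-technology record; field names are illustrative only."""
    name: str
    # year -> number of hospitals requesting permission to negotiate
    requests: Dict[int, int] = field(default_factory=dict)
    # year -> whether permission to negotiate was granted
    granted: Dict[int, bool] = field(default_factory=dict)
    # year -> billed cases (None if not available)
    cases: Dict[int, Optional[int]] = field(default_factory=dict)
    # year -> hospitals using the technology (None if not available)
    hospitals: Dict[int, Optional[int]] = field(default_factory=dict)


def meets_selection_criteria(t: TechnologyRecord) -> bool:
    """Apply the three selection criteria to one technology."""
    window = range(2005, 2013)  # 2005 to 2012 inclusive
    # Criterion 1 (assumption: same-year reading of "requested" and "granted")
    permission_ok = any(
        t.requests.get(y, 0) > 10 and t.granted.get(y, False) for y in window
    )
    # Criterion 2: cases and hospital counts both available for >= 4 years
    years_with_data = [
        y for y in t.cases
        if t.cases.get(y) is not None and t.hospitals.get(y) is not None
    ]
    data_ok = len(years_with_data) >= 4
    # Criterion 3: at least 100 billed cases in at least one year
    volume_ok = any((n or 0) >= 100 for n in t.cases.values())
    return permission_ok and data_ok and volume_ok
```

Applying `meets_selection_criteria` to every candidate record and keeping those that return `True` would reproduce the pre-selection logic under these assumptions.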
(2) Grouping technologies empirically based on adoption curve progression
Based on the DRG case data described above, adoption curves were plotted for all 59 technologies (visualized in Additional file 1). We subsequently empirically developed seven curve progression types in order to select a varied sample for further analysis. The types evolved by grouping curve progressions that were as homogeneous as possible; this was achieved in two steps.
First, a qualitative clustering was performed aiming at an unambiguous assignment of the curve progressions. The gradients of the curves over time, changes in the gradient, and the approach of the curve to a saturation point were considered. The operationalization of these criteria and the resulting groups can be traced in Table 2 (types I-V). According to this approach, 28 of the 59 technologies could not be assigned to any of the five types. These were initially grouped together in a further group (VI) under the keyword "complex".
To obtain further differentiation of this residual group of curve progressions, a quantitative cluster analysis [22] was applied in a second step using the statistical software RStudio (version 1.3.1093). Details are presented in Additional file 2. Based on the resulting target number of two clusters, the 28 technologies in curve progression type VI were distributed into two groups of 23 (type VI.a) and 5 technologies (type VI.b), respectively. Table 2 summarizes the types of progression curves, the operationalization of the criteria and the distribution of technologies to the types.
(3) Identifying baseline information on available scientific evidence
The aim of this step was to gather baseline information on the state of scientific knowledge for each technology to ensure that the selected sample covers different types of adoption curves, but also different results derived from the available evidence (i.e., evidence supporting utilization with or without restrictions, evidence not supporting utilization). For this purpose, we screened reports of the Medical Review Board of the Federal Association of Sickness Funds (Medizinischer Dienst des Spitzenverbandes Bund der Krankenkassen – MDS) for the 59 technologies described above. These reports are prepared upon request to evaluate technologies for which the negotiation of innovation payments has been permitted by the InEK.
Since the reports are confidential, detailed results cannot be reported. A total of 45 reports were identified for 56 of the 59 included technologies. We developed a structured template to extract information on methodology (e.g., year, PICOS criteria), included studies (e.g., level of evidence), reported outcomes (mortality, morbidity, quality of life), results of included studies, and the conclusions of the MDS appraisal. Based on the latter, we classified each technology into one of the evidence groups "has potential without limitations" (1), "has potential for certain patients" (2), and "no potential" (3). It is important to note that these initial assessments were not necessarily identical to the results of the systematic identification of evidence described under step 5, below.
(4) Selecting a balanced sample for further analysis
Of the 59 technologies pre-selected in step 1, we decided to further select a maximum of 30 technologies of all curve progression types and evidence classifications. Within these groups we chose those technologies that were most relevant to care based on high case numbers and high rates of increase in case numbers. The conjunction of these two parameters was translated into a multiplicative criterion for each technology i from all available data years j:
Within each type of curve progression, all technologies were sorted in descending order of \(\mathrm{Crit.}_{i}\). To achieve a sample of 28 (as a multiple of the seven groups), the four top technologies were selected from each group (i.e., I-V, VI.a and VI.b). However, since curve types II, III, and V each contain fewer than four technologies, the resulting sample included only 22. To further diversify the sample and balance the selection of evidence classifications, the remaining technologies were filtered according to the following scheme.
First, technologies that are used in the context of an indication already strongly represented in the sample of 22 were excluded. Then, three and two additional technologies were selected from the evidence groups "has potential without limitations" and "no potential", respectively, as these were the least represented. This resulted in an almost even distribution across the groups: 55 % (6 of 11) selected from group 1, 60 % (3 of 5) selected from group 2, and 57 % (4 of 7) selected from group 3. This sample also achieved a balance between groups with clear and unclear evidence (13 vs. 14 of 27). Again, selection was based on curve progressions: candidates were compared qualitatively and those that appeared to be particularly complementary to the technologies already selected were included. The final sample of 27 technologies is shown in the last column of Table 1.
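The first stage of this selection, ranking within each curve progression type and keeping up to four technologies, can be sketched in Python as follows. The criterion value is treated as a precomputed input and its formula is not reproduced here. The function name and the tuple layout are hypothetical.

```python
from collections import defaultdict


def top_per_type(technologies, n=4):
    """Select up to n technologies per curve progression type.

    technologies: iterable of (name, curve_type, crit) tuples, where crit is
    the precomputed selection criterion (assumed input, formula not shown).
    Returns a dict mapping curve_type -> list of selected names, ranked by
    crit in descending order.
    """
    by_type = defaultdict(list)
    for name, curve_type, crit in technologies:
        by_type[curve_type].append((crit, name))

    selected = {}
    for curve_type, items in by_type.items():
        # Highest criterion value first
        items.sort(key=lambda pair: pair[0], reverse=True)
        # Types with fewer than n technologies contribute all they have
        selected[curve_type] = [name for _, name in items[:n]]
    return selected
```

Because types with fewer than four members contribute all of their technologies, the total can fall short of the nominal 28, which is the effect described above for types II, III, and V.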
(5) Systematically identifying published evidence for included technologies
Subsequently, published evidence on the selected technologies was systematically identified, selected, and evaluated. For this purpose, bibliographic biomedical electronic databases (Medline and Embase via OVID, PubMed, the Cochrane Library), clinical trial registries (ClinicalTrials.gov, WHO International Clinical Trials Registry Platform (ICTRP)) and selected Health Technology Assessment (HTA) databases and agencies (LBI-HTA, IQWiG/G-BA, CRD HTA/INAHTA Database, DIMDI-DAHTA, EUnetHTA) were searched between May and September 2019. Search strategies with high sensitivity were used and no restriction on study designs was applied. However, only RCTs are considered for the purpose of this article; an overview of all evidence types is presented elsewhere [23].
The results of the searches were imported into the literature management program EndNote (version X9, Clarivate). Explicit inclusion and exclusion criteria were formulated for each technology; inter alia, studies were included if they were published between two years before the first documented hospital case and the end of the observation period (2017). Given the number of included technologies and the high number of hits resulting from the sensitive searches, a rapid review approach was adopted for the selection of relevant citations [24]. After duplicate removal, a random sample of 10 percent of all hits (at least 100) was drawn for each technology, and title-abstract screening was carried out by two researchers independently. In case of discrepancies, the inclusion and exclusion criteria were discussed and adjusted, involving a third researcher if necessary. Subsequently, the remaining hits per technology were screened by one person (title/abstract screening followed by full-text screening as per standard systematic review methodology). Data from included publications were extracted using a standardized extraction sheet.
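The sampling rule for the dual screening can be illustrated with a short Python sketch. It assumes that "at least 100" denotes a minimum sample size, that all hits are screened in duplicate when fewer than 100 exist, and that the 10 percent share is rounded up. The function name, parameters, and fixed seed are hypothetical.

```python
import math
import random


def dual_screening_sample(hit_ids, percent=10, minimum=100, seed=0):
    """Split deduplicated hits into a dual-screened sample and the remainder.

    Assumptions: the sample is percent of all hits (rounded up), but never
    fewer than `minimum` hits and never more than the hits available.
    hit_ids must be unique and hashable.
    """
    hits = list(hit_ids)
    target = max(minimum, math.ceil(len(hits) * percent / 100))
    size = min(len(hits), target)
    rng = random.Random(seed)  # fixed seed for a reproducible draw
    sample = rng.sample(hits, size)
    sampled = set(sample)
    # Hits outside the sample go to single-reviewer screening
    remaining = [h for h in hits if h not in sampled]
    return sample, remaining
```

For a technology with 2,500 hits this yields 250 citations for independent dual screening and 2,250 for single-reviewer screening; with 400 hits the minimum of 100 applies.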
Each publication was labelled according to the key message of its authors' conclusions:
- Positive: The authors' conclusions are consistently positive regarding efficacy and safety, and across patient groups. When a neutral (e.g., "equally safe") and a positive (e.g., "effective") statement were combined, the publication was considered positive.
- Negative: The authors' conclusions are consistently negative regarding efficacy and safety, and across patient groups. When a neutral (e.g., "safe") and a negative (e.g., "less efficacious") statement were combined, this publication was classified as negative.
- Neutral: The authors conclude no difference between intervention and comparative intervention.
- Inconclusive: The authors conclude that no definite statement can be made.
(6) Grouping and analyzing adoption processes according to diffusion rate and evidence
In a final step, the diffusion of innovations into the German health care system was examined against the background of available scientific evidence in order to identify successful and failed adoption statuses and possible changes therein. For this purpose, we adapted the grid design for classification of innovations from Denis et al. [12]. The expressions of available scientific evidence and utilization are combined in a 6-field table (see Table 3). In the case of positive scientific support, widespread utilization is described as "Success", whereas cautious utilization implies "Underadoption". We extended the initial matrix by Denis et al. [12] to include the case of negative scientific support, where widespread utilization results in "Hazard" and restrained application represents "Vigilance". Limited or lacking evidence may indicate "Overadoption" or, on the contrary, "Prudence".
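The six fields of this grid can be written down as a lookup table. The Python sketch below encodes the combinations exactly as named in the text; the identifiers (`Evidence`, `Utilization`, `ADOPTION_GRID`, `adoption_status`) are hypothetical and serve only as an illustration.

```python
from enum import Enum


class Evidence(Enum):
    POSITIVE = "positive scientific support"
    NEGATIVE = "negative scientific support"
    LIMITED = "limited or lacking evidence"


class Utilization(Enum):
    CAUTIOUS = "cautious"
    WIDESPREAD = "widespread"


# One entry per field of the 6-field grid
ADOPTION_GRID = {
    (Evidence.POSITIVE, Utilization.WIDESPREAD): "Success",
    (Evidence.POSITIVE, Utilization.CAUTIOUS): "Underadoption",
    (Evidence.NEGATIVE, Utilization.WIDESPREAD): "Hazard",
    (Evidence.NEGATIVE, Utilization.CAUTIOUS): "Vigilance",
    (Evidence.LIMITED, Utilization.WIDESPREAD): "Overadoption",
    (Evidence.LIMITED, Utilization.CAUTIOUS): "Prudence",
}


def adoption_status(evidence: Evidence, utilization: Utilization) -> str:
    """Return the adoption status for one technology in one year."""
    return ADOPTION_GRID[(evidence, utilization)]
```

Evaluating `adoption_status` once per technology and year gives a sequence of statuses in which changes over time become visible.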
The state of evidence is divided into (1) strong direct or moderate evidence with positive conclusions, (2) strong direct or moderate evidence with negative conclusions and (3) limited or no evidence. We categorized the evidence for each year and each technology using a modified version of the World Health Organization/Health Evidence Network criteria [25, 26] based on identified RCTs (see Table 4). Included RCTs were assessed regarding risk of bias according to the procedural rules of the G-BA [27]. A high potential for bias led to a downgrade within the grading scheme from strong to moderate evidence. The color coding for the different grades of evidence shown in Table 3 is further explained in Table 4.
Utilization, on the other axis of the classification grid in Table 3, is divided according to the diffusion rate in percent based on Rogers' Diffusion of Innovations model [14]. Accordingly, we set a threshold at 16 percent of the target population for each technology. Below this threshold, adoption of the technology was considered "cautious", or commensurate with the risk of a novel technology due to a lack of scientific support. Above the threshold, utilization of the technology was classified as "widespread". It is important to note that Rogers' model uses health care providers as the unit of analysis, with the first 16 percent corresponding to the group of innovators and early adopters, and beyond that to the (early) majority. We considered this threshold to be transferable to the categorization of adoption based on case numbers, which offer an overarching view of technology diffusion; we note the lack of other models for such an exercise in the literature.
To determine the target population for each technology, we identified the predominant (gold) standard intervention for each indication in the literature. We subsequently accessed the “DRG statistics” dataset remotely via the Research Data Center of the German Federal Statistical Office (see step 1 for information on the dataset). For six out of 27 technologies (PECLA/iLA, MVAC, BVS, DCB-AV, IABC and FDT) no definitive comparator or clear coding was available, so these were excluded from further analysis. For the other technologies, we cumulated the case numbers to transform the adoption curves (see Additional file 1) into diffusion curves (see Fig. 2). The cut-off value was calculated for each year as 16 percent of the sum of the case numbers of the comparator and the technology itself.
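The cut-off calculation can be made concrete with a small Python sketch. It rests on two assumptions not stated in the text: that the case numbers of both the technology and the comparator are cumulated before the 16 percent cut-off is applied, and that a value exactly at the cut-off counts as cautious. The function name and the example figures are purely illustrative.

```python
from itertools import accumulate


def utilization_by_year(tech_cases, comparator_cases, threshold=0.16):
    """Classify utilization per year as 'cautious' or 'widespread'.

    tech_cases, comparator_cases: annual case numbers in the same year order.
    Assumption: both series are cumulated over time before comparison.
    """
    cum_tech = list(accumulate(tech_cases))
    cum_comp = list(accumulate(comparator_cases))
    statuses = []
    for tech, comp in zip(cum_tech, cum_comp):
        # Cut-off: 16 percent of technology plus comparator cases
        cutoff = threshold * (tech + comp)
        # Assumption: equality with the cut-off is treated as cautious
        statuses.append("widespread" if tech > cutoff else "cautious")
    return statuses


# Illustrative figures only, not taken from the DRG statistics
print(utilization_by_year([10, 50, 140], [200, 200, 200]))
# ['cautious', 'cautious', 'widespread']
```

In this example the cumulated technology cases (10, 60, 200) are compared with cut-offs of 33.6, 73.6, and 128, so the technology crosses into widespread utilization in the third year.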
For each of the remaining 21 technologies, adoption statuses were evaluated per year based on the grid in Table 3. In the subsequent analysis, we view the adoption of innovations as dynamic, allowing and accounting for change in status.