This Delphi study achieved consensus during Round 2 that the domains measured in the 5-domain NPCCSS scale provided an accurate clinical understanding of NPC severity. If only one international scale were to be recommended for use in routine clinical practice, the respondents would recommend the 5-domain NPCCSS scale. Although this statement achieved consensus in Round 2 amongst a panel of 16 NPC specialists who completed the first two rounds, it did not quite reach consensus in Round 3 from a panel of 19 experts.
In Round 1, respondents highlighted that the 5-domain NPCCSS scale is simple, accurate and quick to administer and complete in a routine clinical examination, and that its simplicity is valuable for multi-centre trials to support reproducibility and reliability across sites. Further, it was noted that the domains measured in the 5-domain scale are present in nearly all cases of NPC as the disease develops, unlike hearing loss and seizures, which are typically present in only a small percentage of patients. Respondents also noted that the domains measured in the 17-domain scale posed several challenges. For example, memory is difficult to separate from the cognition domain, and measuring changes in the eye movement domain can be problematic.
However, the 5-domain scale was seen as insufficient for evaluation of specific subsets of patients, such as those with mainly psychiatric involvement or experiencing seizures. Moreover, answers in Round 1 stressed the importance of the granularity of scores and the comprehensiveness provided by the 17-domain NPCCSS scale in capturing the progression of late-onset patients with slowly progressing disease, as well as for measuring change and baseline assessment in clinical trials. This likely led to the 74% consensus in Question 2 of Round 3 that the 17-domain NPCCSS should be the first-choice severity scale in clinical trial settings.
Given these insights, the Core Working Group recommends that the 17-domain NPCCSS is used as the preferred scale to assess NPC severity across clinical trial enrolment and trial outcome measures. However, the domains listed in the 5-domain scale (ambulation, cognition, fine motor, speech and swallowing) should take precedence as primary endpoints as they are the most relevant to describe neurological disease progression and quality of life [16]. As supported by the experts in Round 1, use of the 5-domain NPCCSS is recommended in multi-centre trials to support reproducibility and reliability of results across multiple trial sites. Lastly, the Core Working Group recommends that the 5-domain NPCCSS scale is used within routine clinical practice to assess the clinical severity of NPC patients. These recommendations provide greater global consistency and optimisation of both the 17- and 5-domain NPCCSS scales, whilst not becoming too reductive, which was noted as important by respondents in Round 1.
The Core Working Group also recommends that resources or training on the NPCCSS scales (17- and 5-domains) should be developed and provided to clinicians working with NPC patients to optimise the standardisation of their application. Further, it is advised that this consensus paper should be reviewed every five years to ensure that the recommendations remain accurate.
This Delphi study gathered consensus on the use of six existing NPC clinical severity scales, the findings of which have enabled the research team to derive several significant recommendations and identify areas for further development. The study drew on an international panel of NPC clinicians who treat both paediatric and adult NPC patients, gathering views from a select yet representative panel of experienced experts in the field. However, the rarity of NPC disease means that there is a limited global community of NPC specialists. As a result, the size and composition of the expert panel may reduce the generalisability of the results. Nonetheless, the final sample size (16 participants in Rounds 1 and 2 and 19 participants in Round 3) was greater than the lower limit threshold of 12 [17]. Given the global scale upon which this field operates, the Delphi consensus method, which can be conducted quickly and online, was an appropriate tool for collecting responses. In addition to identifying the areas of consensus, the study highlighted areas where there is less certainty in the field, such as balancing the need for greater consistency of a single, global multi-domain scale with the concern of becoming too reductive.
While a strength of the study was its ability to access an international network of specialists in the field of NPC research and treatment, some of the participants were those who developed the clinical severity scales under evaluation. The strong opinions of these participants may therefore have introduced some response bias. Further, it is acknowledged that the concept of ‘consensus’ is fairly fluid. While we have consensus, there are still experts among the group who strongly disagree with the recommendations and hold these views firmly. Given the small size of the expert community, research is unlikely ever to reach consensus across all statements. However, the fact that 19 out of 20 invited participants took part in the Delphi study highlights both the perceived importance of this piece of work to the NPC community and the influential role that patient groups can have in bringing together stakeholders for such projects. According to guidance from the National Institute for Health Research (NIHR) Health Technology, the Delphi technique typically results in a 20% dropout rate over the three rounds of consensus development. In this study, there were no dropouts in any of the three rounds, which supports the validity of our recommendations.
A key limitation of this study is that it does not offer definitive guidance, as the consensus reached in Round 2 on the 5-domain NPCCSS as the preferred scale for routine clinical practice was not maintained in Round 3. This may be a result of nuances in question phrasing, but the insights obtained were adequate to make several reliable recommendations. As a result, this consensus might provide a platform for standardising data capture and reaching agreement on use for outcome measures.
We believe this study can help to inform and position future discussion around the use of the existing NPC clinical severity scales in clinical practice and trials. As more data, including genomic data, for NPC become available, the findings will become even more important and there may be a need to reconsider which parameters are most important and whether the preferred scales should be amended accordingly. Similarly, outcomes of ongoing trials of disease-modifying therapies for NPC will drive the need to identify the most appropriate clinical severity scale for determining drug efficacy.