Mendelian is a UK-based health data analytics company focused on shortening the diagnostic odyssey of rare and hard-to-diagnose diseases. Mendelian has developed a digital case-finding tool, “MendelScan”, that can analyse structured clinical vocabulary, such as SNOMED CT codes [17], from primary care electronic health records (EHR) and highlight patterns of data that correspond to an increased likelihood of the patient being affected by certain rare diseases (RD). This enables the identification of those at risk and assists their clinician in accessing the correct diagnostic pathway. The MendelScan system is summarised in Figure 1.
The pilot study took place between January 2019 and October 2020. The primary objective was to assess the feasibility of applying MendelScan at a small scale in a primary care environment in the Lower Lea Valley (LLV) primary care GP Federation.
The process for delivering MendelScan into the selected primary care federation involved establishing agreements, deploying the algorithms on a pseudonymised data set, manually reviewing the EHR identified by the algorithms, delivering the reports to GPs and collecting their feedback. Figure 2 summarises the implementation process.
2.1 Primary care EHR access
2.1.1 Ethics and information governance
To enable data access and confidence in this study, an independent ethical analysis of this approach was commissioned [18]. Building on the outcome and recommendations of this report, and in compliance with information governance legislation, a data sharing agreement (DSA) was agreed between stakeholders. The DSA is a contract that stipulates the rules regarding usage and handling of data. Finally, a Data Protection Impact Assessment (DPIA) was drafted, identifying and minimising the data protection risks of the project [19].
2.1.2 Data Transfer
Data transfer involved creating a data set of patients’ EHR, removing personal identifiers (name and address) from these records and assigning each record a pseudonym (a unique numeric identifier). Individuals who had opted out of sharing such data through the national data opt-out were removed. The pseudonymised records of all other patients were extracted and sent to Mendelian for analysis.
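The data transfer steps can be illustrated with a minimal Python sketch. It is not the extraction software used in the study: the record layout and the field names (`local_id`, `name`, `address`, `opted_out`, `codes`) are hypothetical, and the use of a random numeric pseudonym with a link table retained at the practice is an assumption made for illustration.

```python
import secrets

# Hypothetical field names; the real EHR export format is not described here.
IDENTIFIER_FIELDS = {"name", "address"}
INTERNAL_FIELDS = {"local_id", "opted_out"}


def pseudonymise(records):
    """Drop opted-out patients, strip identifiers and assign numeric pseudonyms.

    Returns (extract, link_table). The extract is what would be shared;
    the link_table (pseudonym -> local_id) is assumed to stay at the practice.
    """
    extract, link_table = [], {}
    for record in records:
        if record.get("opted_out", False):
            continue  # national data opt-out: record is never extracted
        pseudonym = secrets.randbelow(10**9)
        while pseudonym in link_table:  # guarantee uniqueness
            pseudonym = secrets.randbelow(10**9)
        link_table[pseudonym] = record["local_id"]
        clean = {
            key: value
            for key, value in record.items()
            if key not in IDENTIFIER_FIELDS and key not in INTERNAL_FIELDS
        }
        clean["pseudonym"] = pseudonym
        extract.append(clean)
    return extract, link_table


# Illustrative records with placeholder codes (not real SNOMED CT codes).
records = [
    {"local_id": "P1", "name": "A", "address": "X", "opted_out": False, "codes": ["CODE_A1"]},
    {"local_id": "P2", "name": "B", "address": "Y", "opted_out": True, "codes": ["CODE_B1"]},
    {"local_id": "P3", "name": "C", "address": "Z", "opted_out": False, "codes": ["CODE_C1"]},
]
extract, link_table = pseudonymise(records)
```

In this sketch only `extract` would leave the practice, while `link_table` allows a returned pseudonym to be matched back to the full record locally.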
2.2 Algorithm deployment
Mendelian developed an approach to build algorithms for each RD in the following steps:
- Analysing the suitability of a RD using a scoring system based on features of the disease, the benefit of early diagnosis and the likelihood that relevant clinical characteristics would be captured in the primary care EHR.
- Performing a systematic literature review, searching for peer-reviewed screening or diagnostic criteria for the selected RD.
- Digitising the selected criteria into a numeric algorithm using structured data SNOMED CT codes, based on a combined scoring across several individual data points (Table 1). We did not interrogate data held in unstructured formats such as letters or free text in clinic notes.
The algorithms were deployed on the pseudonymised EHR data extracts. The MendelScan case-finding tool ran the algorithms against the data extract and highlighted all records that scored above the suspicion threshold. MendelScan included seventy-six RD (see Appendix 1).
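The idea of a combined score compared against a suspicion threshold can be sketched as follows. This is an illustrative simplification, not the MendelScan implementation: the `Criterion` structure, the weights, the threshold and the placeholder codes are all hypothetical, and the rule that each criterion contributes its weight once if any of its codes is present is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    """One digitised diagnostic criterion (hypothetical structure)."""
    label: str
    codes: frozenset  # placeholder strings standing in for SNOMED CT codes
    weight: int


def score_record(patient_codes, criteria):
    """Sum the weights of all criteria matched by at least one patient code."""
    return sum(c.weight for c in criteria if c.codes & patient_codes)


def flag_records(extract, criteria, threshold):
    """Return pseudonyms whose combined score is above the threshold."""
    return [
        pseudonym
        for pseudonym, codes in extract.items()
        if score_record(codes, criteria) > threshold
    ]


# Hypothetical algorithm for a single rare disease.
criteria = [
    Criterion("feature A", frozenset({"CODE_A1", "CODE_A2"}), 2),
    Criterion("feature B", frozenset({"CODE_B1"}), 1),
    Criterion("feature C", frozenset({"CODE_C1"}), 3),
]

# Pseudonymised extract: pseudonym -> set of structured codes in the record.
code_extract = {
    1001: {"CODE_A1", "CODE_B1"},             # score 3
    1002: {"CODE_B1"},                        # score 1
    1003: {"CODE_A1", "CODE_A2", "CODE_C1"},  # score 5 (feature A counted once)
}

flagged = flag_records(code_extract, criteria, threshold=2)
```

With these illustrative values, records 1001 and 1003 exceed the threshold and would proceed to manual review, while record 1002 would not.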
The extracted EHR dataset consisted of structured data (SNOMED CT codes). Each EHR was represented by the types of coded information listed in Table 1.
2.3 Review of identified EHR
The review process is summarised in Figure 2.
We performed an anonymous, two-round manual review process for each EHR identified by any of the seventy-six algorithms deployed. In round one, a medical doctor reviewed each EHR and assigned each case one of three outcomes: (1) rule-in, (2) rule-out or (3) already diagnosed. In round two, rule-in cases were further reviewed by a GP, geneticist or an expert in a particular rare disease and assigned a final rule-in or rule-out outcome. For each case still ruled in after round two, a patient report was generated and sent to the patient’s GP practice.
2.4 Send reports to GP
The reports of each of the rule-in patients were returned to their GP by email. Each report included the unique patient identifier that enabled matching to the patient’s full EHR at the practice.
Each report included an explanation of the condition identified, the reasons for highlighting this disease for this patient, and the next steps for further evaluation after GP review.
2.5 GP feedback on reports
Feedback from GPs was requested at two different stages. The first, patient report feedback, was requested as soon as the GP completed evaluating each patient’s report and full EHR. It consisted of an online questionnaire, accessed through a link on each patient report, with three multiple-choice questions and one open-ended question (see Appendix 2).
The second, patient outcome feedback, was requested 3 months after the initial feedback and focused on all cases where the GP indicated that the report suggested a reasonable possible diagnosis and that were advanced for further investigation (Figure 5).