The growing availability of Electronic Health Records (EHRs) presents an opportunity to enhance patient care by uncovering hidden health risks and supporting better-informed clinical decisions through advanced deep learning methods. However, modeling sequential EHR data, referred to as patient trajectories, is complex due to the evolving relationships between diagnoses and treatments, where medical conditions and interventions alter the likelihood of future health outcomes over time. While BERT-inspired models have shown promise in modeling EHR sequences by pretraining with the masked language modeling (MLM) objective, they struggle to fully capture the intricate temporal dynamics of disease progression and medical interventions.
In this study, we introduce TOO-BERT, a novel adaptation that enhances MLM-pretrained transformers by explicitly incorporating temporal information from patient trajectories. TOO-BERT encourages the model to learn complex causal relationships between diagnoses and treatments through a new self-supervised learning task, the Temporal Order Objective (TOO), realized by two proposed methods: Conditional Code Swapping (CCS) and Conditional Visit Swapping (CVS).
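To make the idea concrete, the following Python sketch illustrates one plausible reading of CCS and CVS as order-perturbation tasks with binary labels that a transformer could be trained to predict alongside MLM. The function names, swap probability, labeling scheme, and example codes are our illustrative assumptions, not the paper's implementation.

```python
import random

def conditional_code_swap(trajectory, swap_prob=0.5, rng=random):
    """Conditional Code Swapping (CCS), sketched as swapping two medical codes
    drawn from different visits, conditioned on a coin flip.

    trajectory: list of visits, each visit a list of medical code strings.
    Returns (perturbed_trajectory, label) with label = 1 if a swap occurred.
    """
    if len(trajectory) < 2 or rng.random() > swap_prob:
        return trajectory, 0  # keep the original temporal order
    perturbed = [list(visit) for visit in trajectory]
    # Pick two distinct visits and one code position in each, then swap.
    i, j = rng.sample(range(len(perturbed)), 2)
    ci = rng.randrange(len(perturbed[i]))
    cj = rng.randrange(len(perturbed[j]))
    perturbed[i][ci], perturbed[j][cj] = perturbed[j][cj], perturbed[i][ci]
    return perturbed, 1  # temporal order was perturbed

def conditional_visit_swap(trajectory, swap_prob=0.5, rng=random):
    """Conditional Visit Swapping (CVS), sketched as exchanging two whole
    visits so the model must recognise the disrupted visit order."""
    if len(trajectory) < 2 or rng.random() > swap_prob:
        return trajectory, 0
    perturbed = [list(visit) for visit in trajectory]
    i, j = rng.sample(range(len(perturbed)), 2)
    perturbed[i], perturbed[j] = perturbed[j], perturbed[i]
    return perturbed, 1

# Example: a toy trajectory of three visits with diagnosis/treatment codes.
patient = [["I10", "metoprolol"], ["E11", "metformin"], ["I50", "furosemide"]]
swapped, label = conditional_visit_swap(patient, swap_prob=1.0)
print(swapped, label)  # visits exchanged, label 1 -> order perturbed
```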
We evaluate TOO-BERT on two datasets, the MIMIC-IV hospitalization records and the Malmö Diet cohort, comprising approximately 10 and 8 million medical codes, respectively. TOO-BERT outperforms standard MLM-pretrained transformers in predicting Heart Failure (HF), Alzheimer's Disease (AD), and Prolonged Length of Stay (PLS), and is particularly strong on HF prediction even when fine-tuning data is limited.
Our results underscore the effectiveness of integrating temporal ordering objectives into MLM-pretrained models, enabling deeper insights into the complex relationships in EHR data. Attention analysis further reveals TOO-BERT’s ability to capture and represent sophisticated structural patterns within patient trajectories.