Background: Deep Learning (DL) models are able to produce accurate results in various areas. However, the medical field is especially sensitive, because every decision should be reliable and explainable to the stakeholders. Thus, the high accuracy of DL models poses a great advantage, but the fact that they function as black boxes hinders their application to sensitive fields, given that they are not explainable per se. Hence, the application of explainability methods has become important to provide explanations of the decisions of DL models across various problems. In this work, we trained different classifiers and generated explanations of their classification of electrocardiograms (ECG) by applying well-known methods. Finally, we extracted quantifiable information to evaluate the explanations produced for our classifiers.
Methods: In this study, two datasets were built consisting of image representations of ECG signals, each labelled according to one specific heartbeat: 1. labelled according to the last heartbeat and 2. labelled according to the first heartbeat. DL models were trained on each dataset. Three different explainability methods were applied to the DL models to explain their classifications. These methods produce attribution maps in which the intensity of each pixel is proportional to its importance for the classification task. We then developed a metric to quantify the focus of the models on the region of interest (ROI) of the ECG representation.
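The focus metric described above could be sketched as the fraction of total attribution mass that falls inside the ROI, compared against the ROI's share of the image area (the expected value under a uniformly "random" focus). This is a minimal illustrative sketch, not the authors' implementation; the function names and the use of absolute attribution values are assumptions.

```python
import numpy as np

def roi_focus(attribution_map: np.ndarray, roi_mask: np.ndarray) -> float:
    """Fraction of total attribution mass falling inside the ROI.

    attribution_map: 2-D array of per-pixel attribution scores
                     (output of an explainability method).
    roi_mask:        boolean array of the same shape, True inside the ROI.
    """
    attr = np.abs(attribution_map)  # assumption: magnitude = importance
    total = attr.sum()
    if total == 0:
        return 0.0
    return float(attr[roi_mask].sum() / total)

def random_baseline(roi_mask: np.ndarray) -> float:
    """Expected focus if attribution were spread uniformly:
    simply the ROI's share of the image area."""
    return float(roi_mask.mean())
```

Under this formulation, a model whose `roi_focus` exceeds `random_baseline` attends to the ROI more than chance would predict.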
Results: The developed classification models achieved accuracy scores of 93.66% and 91.72% on the testing set. The explainability methods were successfully applied to these models. The quantification metric developed in this work demonstrated that, in most cases, the models did focus on the region around the heartbeat of interest. The results ranged from 8.8% in the worst case to 32.4% in the best, whereas a random focus would correspond to a value of approximately 10%.
Conclusions: The classification models performed accurately on the two datasets. However, even though their focus on the ROI of the figures is higher than in the random case, the results suggest that other regions of the figures might also be important for classification. In the future, the importance of regions outside the ROI should be investigated, as well as whether specific waves of the ECG signal contribute to the classification.