Interpretable Hybrid Multichannel Deep Learning for 12-Lead ECG-based Heart Disease Classification
dc.contributor.advisor | Schwenker, Friedhelm (Prof.) | |
dc.contributor.advisor | Bisrat Derebssa (PhD) | |
dc.contributor.advisor | Taye Girma (PhD) | |
dc.contributor.author | Yehualashet Megersa | |
dc.date.accessioned | 2025-06-19T14:39:56Z | |
dc.date.available | 2025-06-19T14:39:56Z | |
dc.date.issued | 2025-02 | |
dc.description.abstract | The electrocardiogram (ECG) is a noninvasive and affordable tool that offers valuable insights into heart activity from multiple perspectives. However, medical practitioners often face difficulties in diagnosing underlying heart conditions from ECG signals. To address these challenges and improve diagnostic accuracy, researchers have investigated the potential of deep learning (DL) techniques. Nevertheless, developing a robust and interpretable deep learning model that performs well across diverse ECG datasets remains a key research focus. In this PhD research, an interpretable deep learning system is therefore designed that incorporates ECG signal preprocessing and post-hoc interpretability. The designed model is a multichannel hybrid deep learning architecture consisting of 12 blocks, each combining a one-dimensional (1D) convolutional neural network (CNN) with bidirectional long short-term memory (BiLSTM) networks. The feature maps from the 12 blocks are concatenated and further processed by an attention mechanism and a two-dimensional (2D) CNN. All of these components (the 1D CNN, BiLSTM layers, attention mechanism, and 2D CNN) serve as feature extraction backbones, and fully connected (FC) layers then perform the classification. The model was independently trained and tested on three distinct 12-lead ECG datasets: (1) the PTB-XL dataset, using five super-diagnostic classes; (2) the CODE-15% dataset, encompassing six heart disease classes; and (3) the Chapman Arrhythmia dataset, which was analyzed in two configurations: seven reduced classes (Chapman-Reduced) and four merged classes (Chapman-Merged). The model achieved average test accuracies of 89.84%, 97.82%, 98.55%, and 98.80% on PTB-XL, CODE-15%, Chapman-Reduced, and Chapman-Merged, respectively. These results indicate the model’s effectiveness across different ECG datasets.
To understand how the model reached its classification results, we applied two post-hoc interpretability techniques: Gradient-weighted Class Activation Mapping++ (Grad-CAM++) and SHapley Additive exPlanations (SHAP). These techniques were used to visualize influential segments of the ECG signal, both at the instance level for specific samples and at the test-set level for assessing the overall contributions of individual ECG leads. SHAP, with its theoretical grounding, ensures consistent feature attribution by capturing causal relationships within the ECG data. Meanwhile, Grad-CAM++, through causal localization, identifies regions of the ECG signals that influenced the model’s decisions. The interpretations produced by both techniques were cross-checked against heart disease manifestations in ECG signals described in established cardiology literature, ensuring alignment with clinical patterns. The model’s performance and the output interpretation techniques demonstrate that the proposed approach is a practical tool for ECG-based heart disease diagnosis. | |
dc.identifier.uri | https://etd.aau.edu.et/handle/123456789/5619 | |
dc.language.iso | en_US | |
dc.publisher | Addis Ababa University | |
dc.subject | Heart disease | |
dc.subject | 12-lead ECG | |
dc.subject | CNN-BiLSTM | |
dc.subject | deep learning | |
dc.subject | interpretability | |
dc.subject | Grad-CAM++ | |
dc.subject | SHAP | |
dc.title | Interpretable Hybrid Multichannel Deep Learning for 12-Lead ECG-based Heart Disease Classification | |
dc.type | Dissertation |
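
The pipeline described in the abstract (12 per-lead 1D CNN + BiLSTM blocks, concatenation, attention, a 2D CNN, then FC layers) can be followed as a sequence of tensor shapes. The sketch below is only a bookkeeping illustration, not the dissertation's implementation: every hyperparameter value, the choice to concatenate along the feature axis, the unpadded convolutions, and the global pooling step are assumptions introduced here, and the names `Hyper`, `conv_out`, and `trace_shapes` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hyper:
    """Hypothetical hyperparameters; only n_leads = 12 comes from the abstract."""
    n_leads: int = 12         # one CNN-BiLSTM block per ECG lead
    n_samples: int = 1000     # assumed samples per lead
    conv1d_kernel: int = 7    # assumed 1D conv kernel (stride 1, no padding)
    pool: int = 2             # assumed max-pool factor after the 1D conv
    lstm_hidden: int = 64     # assumed hidden units per LSTM direction
    conv2d_kernel: int = 3    # assumed square 2D conv kernel (no padding)
    conv2d_filters: int = 16  # assumed number of 2D conv filters
    n_classes: int = 5        # e.g. five classes, as in the PTB-XL setup


def conv_out(length: int, kernel: int) -> int:
    """Output length of an unpadded, stride-1 convolution."""
    return length - kernel + 1


def trace_shapes(h: Hyper) -> dict:
    """Return the assumed tensor shape after each stage (batch axis omitted)."""
    # 1D CNN + pooling shortens the time axis of each lead.
    steps = conv_out(h.n_samples, h.conv1d_kernel) // h.pool
    # A BiLSTM emits forward and backward states at every time step.
    per_block = (steps, 2 * h.lstm_hidden)
    # Assumption: the 12 block outputs are joined along the feature axis.
    concatenated = (steps, h.n_leads * per_block[1])
    # Attention reweights entries, so the shape is unchanged.
    attended = concatenated
    # Assumption: the attended map is treated as a one-channel 2D image.
    conv2d = (
        h.conv2d_filters,
        conv_out(attended[0], h.conv2d_kernel),
        conv_out(attended[1], h.conv2d_kernel),
    )
    # Assumption: global pooling reduces each filter map to one value.
    pooled = (h.conv2d_filters,)
    # FC layers map the pooled features to one score per class.
    logits = (h.n_classes,)
    return {
        "per_block": per_block,
        "concatenated": concatenated,
        "attended": attended,
        "conv2d": conv2d,
        "pooled": pooled,
        "logits": logits,
    }


if __name__ == "__main__":
    for stage, shape in trace_shapes(Hyper()).items():
        print(f"{stage:>12}: {shape}")
```

Changing `n_classes` to 6, 7, or 4 mirrors the CODE-15%, Chapman-Reduced, and Chapman-Merged setups named in the abstract; only the final shape changes, because the feature extraction stages are shared.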