Patients and data collection
This was a retrospective observational study using routine clinical data. Adult patients (≥ 18 years) were enrolled at two tertiary referral centers for epilepsy, Seoul National University Hospital and Kangbuk Samsung Hospital, between 2014 and 2021. Inclusion criteria were as follows: (1) a diagnosis of temporal lobe epilepsy (TLE) based on seizure semiology, EEG, and magnetic resonance imaging; (2) monotherapy (one ASM) at the time of the first EEG recording; and (3) a unilateral epileptic focus. Demographic and clinical characteristics, including baseline and final seizure frequencies, were obtained through a retrospective review of medical records. A total of 48 patients with TLE were selected and divided into two groups according to the final outcome, regardless of the final ASM regimen: the responsive group (no seizures during the last year of follow-up) and the refractory group (one or more seizures during the last year of follow-up) (Fig. 1).
Statistical analysis
Descriptive statistics are reported as mean (standard deviation) or frequency (proportion). Normality was assessed using the Shapiro–Wilk test. The chi-square test was used to compare the distributions of sex, seizure type, epileptic focus, and interictal epileptic discharge on the first EEG between the groups. Fisher’s exact test was used to compare the distributions of the etiology of epilepsy, history of febrile convulsion, history of central nervous system infection, and ASM at the first EEG between the groups. The Mann–Whitney U test was used to analyze the differences in age at epilepsy onset, age at EEG study, follow-up duration, and seizure frequency at the time of EEG study between the groups.
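The tests named above can be sketched with SciPy as follows. This is a minimal illustration on synthetic numbers, not the study's analysis code: the group sizes, the simulated variable, and the 2x2 table are hypothetical placeholders, and the sketch uses a 2x2 table for both categorical tests.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical continuous variable (e.g. age at onset) for the two groups.
responsive = rng.normal(30, 10, size=25)
refractory = rng.normal(35, 12, size=23)

# Shapiro-Wilk normality test, run separately for each group.
_, p_norm_resp = stats.shapiro(responsive)
_, p_norm_refr = stats.shapiro(refractory)

# Mann-Whitney U test (two-sided) for a continuous variable.
u_stat, p_mwu = stats.mannwhitneyu(responsive, refractory,
                                   alternative="two-sided")

# Hypothetical 2x2 contingency table: rows = group, columns = category.
table = np.array([[12, 13],
                  [10, 13]])

# Chi-square test of independence (SciPy applies Yates' continuity
# correction by default for 2x2 tables).
chi2, p_chi2, dof, expected = stats.chi2_contingency(table)

# Fisher's exact test on the same table, usually preferred when
# expected cell counts are small.
odds_ratio, p_fisher = stats.fisher_exact(table)

print(f"Shapiro p = {p_norm_resp:.3f}, {p_norm_refr:.3f}")
print(f"Mann-Whitney p = {p_mwu:.3f}")
print(f"Chi-square p = {p_chi2:.3f}, Fisher p = {p_fisher:.3f}")
```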
EEG recording
Resting-state EEG data were recorded using the NicoletOne® EEG system (Natus, San Carlos, CA, USA) in accordance with the international 10–20 electrode placement system, with a sampling frequency of 250 Hz, a hardware high-pass filter of 0.1 Hz, and a hardware low-pass filter of 500 Hz. Electrode impedances were kept below 10 kΩ. Because the data came from two institutions, only the 19 channels available at both sites (Fp1, F7, T7, P7, F3, C3, P3, O1, Fp2, F8, T8, P8, F4, C4, P4, O2, Fz, Cz, and Pz) were included in the analysis to ensure uniformity across datasets. Only eyes-closed EEG recorded without any stimulus was used in this study.
Preprocessing
Based on the results of a previous study, a minimum data length of 2 min was deemed necessary to analyze epileptic seizure signals effectively. 20 Accordingly, each patient's recording was divided into windows of 120 s with a 50% overlap. The resulting increase in the number of samples helps mitigate overfitting, because the model is less likely to learn the idiosyncrasies of a small dataset and more likely to learn generalizable patterns. Subsequently, the data were re-referenced to the average of the following EEG channels: F3, Fz, F4, C3, Cz, C4, P3, Pz, P4, O1, and O2.
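The windowing and re-referencing steps described above could be implemented roughly as follows. This is a sketch under stated assumptions: the function names `sliding_windows` and `rereference` are hypothetical, the recording is synthetic noise, and the array layout (channels x samples) is an illustrative choice.

```python
import numpy as np

def sliding_windows(eeg, fs=250, win_sec=120, overlap=0.5):
    """Split a (channels, samples) array into overlapping windows.

    Returns an array of shape (n_windows, channels, window_samples).
    """
    win = int(win_sec * fs)
    step = int(win * (1 - overlap))
    n_samples = eeg.shape[1]
    if n_samples < win:
        raise ValueError("recording is shorter than one window")
    starts = range(0, n_samples - win + 1, step)
    return np.stack([eeg[:, s:s + win] for s in starts])

def rereference(eeg, channel_names, ref_names):
    """Subtract the mean of the reference channels from every channel.

    Works on (channels, samples) or (n_windows, channels, samples) arrays.
    """
    idx = [channel_names.index(name) for name in ref_names]
    ref = eeg[..., idx, :].mean(axis=-2, keepdims=True)
    return eeg - ref

fs = 250
channel_names = ["Fp1", "F7", "T7", "P7", "F3", "C3", "P3", "O1",
                 "Fp2", "F8", "T8", "P8", "F4", "C4", "P4", "O2",
                 "Fz", "Cz", "Pz"]
ref_names = ["F3", "Fz", "F4", "C3", "Cz", "C4", "P3", "Pz", "P4",
             "O1", "O2"]

# Synthetic 5-min recording standing in for real EEG data.
rng = np.random.default_rng(0)
recording = rng.standard_normal((len(channel_names), 5 * 60 * fs))

windows = sliding_windows(recording, fs=fs)
referenced = rereference(windows, channel_names, ref_names)
print(windows.shape)  # (4, 19, 30000): four 120-s windows from 5 min
```

With a 50% overlap, a recording of T seconds yields floor((T - 120) / 60) + 1 windows, so the 5-min example produces four.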
To account for the side of each patient's epileptic focus, the electrode positions were standardized across patients. For patients with an epileptic focus on the left side, the existing EEG electrode placements were retained. For those with an epileptic focus on the right side, the electrode positions were mirrored to the opposite hemisphere. With this adjustment, the epileptic focus was positioned in the left hemisphere for every patient.
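One way to realize this adjustment is to swap each left-hemisphere channel with its right-hemisphere homolog and leave midline channels in place. The sketch below assumes that interpretation; the pairing table `PAIRS` and the function name `mirror_to_left` are hypothetical and not taken from the study.

```python
import numpy as np

CHANNELS = ["Fp1", "F7", "T7", "P7", "F3", "C3", "P3", "O1",
            "Fp2", "F8", "T8", "P8", "F4", "C4", "P4", "O2",
            "Fz", "Cz", "Pz"]

# Assumed homologous left/right pairs in the 10-20 layout.
PAIRS = [("Fp1", "Fp2"), ("F7", "F8"), ("T7", "T8"), ("P7", "P8"),
         ("F3", "F4"), ("C3", "C4"), ("P3", "P4"), ("O1", "O2")]
MIRROR = {}
for left, right in PAIRS:
    MIRROR[left] = right
    MIRROR[right] = left

def mirror_to_left(eeg, focus_side):
    """Return a (channels, samples) array with the focus on the left.

    Left-focus recordings are returned unchanged (as a copy); right-focus
    recordings have each channel replaced by its contralateral homolog.
    """
    if focus_side == "left":
        return eeg.copy()
    if focus_side != "right":
        raise ValueError("focus_side must be 'left' or 'right'")
    # Row i of the output takes the data of the mirror of channel i.
    order = [CHANNELS.index(MIRROR.get(ch, ch)) for ch in CHANNELS]
    return eeg[order]

# Toy data: every sample in row i equals i, so swaps are easy to check.
toy = np.arange(len(CHANNELS), dtype=float)[:, None] * np.ones((1, 4))
mirrored = mirror_to_left(toy, "right")
```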
Prior to analysis, the signals underwent bandpass filtering across various frequency bands: delta (0.5−3 Hz), theta (3−8 Hz), alpha (8−12 Hz), low-beta (12−20 Hz), high-beta (20−30 Hz), and gamma (30−50 Hz), to segregate and highlight the relevant signal components for a more robust analysis.
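A band-splitting step of this kind might look like the sketch below. The band edges are those listed above; the filter design (a fourth-order zero-phase Butterworth filter) and the function names are assumptions, since the filter type is not specified in the text.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

# Band edges in Hz, as listed in the text.
BANDS = {
    "delta": (0.5, 3),
    "theta": (3, 8),
    "alpha": (8, 12),
    "low_beta": (12, 20),
    "high_beta": (20, 30),
    "gamma": (30, 50),
}

def bandpass(eeg, low, high, fs=250, order=4):
    """Zero-phase Butterworth bandpass along the last (time) axis."""
    sos = butter(order, [low, high], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, eeg, axis=-1)

def split_bands(eeg, fs=250):
    """Return a dict mapping band name to the band-limited signal."""
    return {name: bandpass(eeg, lo, hi, fs) for name, (lo, hi) in BANDS.items()}

# Toy check: a 10 Hz sinusoid should survive mainly in the alpha band.
fs = 250
t = np.arange(10 * fs) / fs
x = np.sin(2 * np.pi * 10 * t)[None, :]  # shape (1 channel, samples)
bands = split_bands(x, fs=fs)
for name, sig in bands.items():
    print(f"{name:>9}: variance = {sig.var():.4f}")
```

Second-order sections (`output="sos"`) are used because they tend to be numerically more stable than transfer-function coefficients for narrow low-frequency bands such as delta.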
Feature extraction
In the feature extraction phase, four types of time-domain features (Hjorth parameters, statistical measures, energy metrics, and zero-crossing rate) and two connectivity-based features (Pearson’s correlation and coherence) were used for the analysis, owing to their proven significance in EEG analyses. To avoid duplication, connectivity values that were duplicated or redundant because of symmetry were omitted from the dataset.
- Hjorth parameters: Hjorth parameters have been used to detect and diagnose seizures, as well as predict seizure recurrence after ASM withdrawal. This set encompasses three components: activity, which indicates the signal power; mobility, representing the mean frequency; and complexity, reflecting changes in frequency. 21,22
- Statistical measures: Statistical parameters have been employed as features to differentiate patients with epilepsy from healthy controls and predict the response to levetiracetam. 17,23 Six prevalent statistical indicators were used as features: skewness, kurtosis, mean, median, minimum, and maximum values.
- Energy metrics: Energy metrics serve as markers for assessing brain activity. 24 Therefore, the linear and nonlinear energies of the EEG signals were included to offer insights into the energy patterns present within the signal. 25
- Zero-crossing rate: This parameter indicates the rate at which a signal transitions from positive to zero to negative or vice versa. It has been a prominent tool in numerous studies for distinguishing seizures from normal EEG signals. For this study, both the zero-crossing rate and its first derivative were incorporated into the analysis. 26,27
- Interchannel Pearson’s correlation coefficient: Pearson’s correlation quantifies the linear relationship between two EEG channels, measuring both the strength and direction of their association. This facilitates the identification of patterns and potential anomalies within EEG signals. 28,29
- Interchannel coherence: Coherence is a frequency-domain measure that offers insights into the synchrony between EEG channels in specific frequency bands. By evaluating the cross-spectral and auto-spectral densities, coherence facilitates the understanding of connectivity patterns and potential neural network alliances within EEG data. 30,31
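Several of the features listed above can be sketched in a few lines of NumPy and SciPy. The definitions below are common textbook forms and are assumptions rather than the study's exact formulas: linear energy is taken as the mean squared amplitude, nonlinear energy as the Teager-Kaiser operator, and coherence as magnitude-squared coherence averaged over a frequency band. All function names are hypothetical.

```python
import numpy as np
from scipy.signal import coherence

def hjorth(x):
    """Hjorth activity, mobility, and complexity of a 1-D signal."""
    dx = np.diff(x)
    ddx = np.diff(dx)
    activity = np.var(x)
    mobility = np.sqrt(np.var(dx) / activity)
    complexity = np.sqrt(np.var(ddx) / np.var(dx)) / mobility
    return activity, mobility, complexity

def energies(x):
    """Mean linear energy and mean Teager-Kaiser nonlinear energy."""
    linear = np.mean(x ** 2)
    nonlinear = np.mean(x[1:-1] ** 2 - x[:-2] * x[2:])
    return linear, nonlinear

def zero_crossing_rate(x):
    """Fraction of consecutive sample pairs whose signs differ."""
    signs = np.signbit(x)
    return np.mean(signs[1:] != signs[:-1])

def pearson_edges(eeg):
    """Pearson r for each unique channel pair (upper triangle only)."""
    r = np.corrcoef(eeg)
    return r[np.triu_indices(eeg.shape[0], k=1)]

def coherence_edges(eeg, fs=250, band=(8, 12), nperseg=500):
    """Band-averaged coherence for each unique channel pair."""
    n = eeg.shape[0]
    values = []
    for i in range(n):
        for j in range(i + 1, n):  # skip self-pairs and mirrored pairs
            f, cxy = coherence(eeg[i], eeg[j], fs=fs, nperseg=nperseg)
            mask = (f >= band[0]) & (f <= band[1])
            values.append(cxy[mask].mean())
    return np.array(values)

fs = 250
t = np.arange(10 * fs) / fs
sine = np.sin(2 * np.pi * 10 * t)  # 10 Hz test tone
activity, mobility, complexity = hjorth(sine)
linear_energy, nonlinear_energy = energies(sine)
zcr = zero_crossing_rate(sine)

rng = np.random.default_rng(0)
noise = rng.standard_normal((4, 10 * fs))  # 4 synthetic channels
r_edges = pearson_edges(noise)
coh_edges = coherence_edges(noise, fs=fs)
```

Keeping only the upper triangle of each connectivity matrix drops the diagonal and the mirrored entries; for 19 channels this leaves 171 unique pairs per connectivity measure.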
Feature selection
Feature selection was used to improve the performance of the ML model and reduce the risk of overfitting. Two approaches were employed: filter-based and wrapper-based feature selection. Importantly, feature selection was confined to the training set. Within each fold of the 5-fold cross-validation, the training and validation data were kept strictly separate: features were selected using the training data only, and the performance metrics were evaluated on the validation data only.
- Filter-based feature selection is a technique that selects relevant features based on statistical properties. Three commonly used filter-based strategies (chi-square, ANOVA F-value, and mutual information) were employed. 32 Each of these methods was applied to assess the significance and contribution of individual features within our dataset.
- Wrapper-based feature selection uses a search algorithm to evaluate different subsets of features and selects the subset that achieves the best performance for a given ML model. Recursive feature elimination (RFE) was used as the wrapper method, iteratively reducing the feature set to identify the most predictive features. 33
Evaluation
Three classifiers were employed in this study: random forest (RF) 34, extreme gradient boosting (XGB) 35, and light gradient boosting (LGB) 36. The optimal feature selection method was determined based on the average area under the receiver operating characteristic curve (AUROC) obtained during five-fold cross-validation. Model performance was assessed using the AUROC, accuracy, F1 score, sensitivity, and specificity. Positive and negative predictive values were also examined to gauge how accurately the model identified each class.
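The threshold-based metrics named above all follow from the four cells of a binary confusion matrix, as the sketch below shows. The helper `binary_metrics` is hypothetical, the labels are invented for illustration, and which group is coded as the positive class (1) is an assumption.

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Confusion-matrix metrics for binary labels (1 = positive class).

    Assumes every denominator is non-zero, i.e. both classes occur in
    y_true and both are predicted at least once.
    """
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_pred).astype(bool)
    tp = np.sum(y_true & y_pred)
    tn = np.sum(~y_true & ~y_pred)
    fp = np.sum(~y_true & y_pred)
    fn = np.sum(y_true & ~y_pred)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
        "f1": 2 * tp / (2 * tp + fp + fn),
    }

# Invented example: 4 positives and 6 negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this example TP = 3, FN = 1, FP = 1, and TN = 5, giving an accuracy of 0.8, a sensitivity and PPV of 0.75, and a specificity and NPV of 5/6.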
Five-fold cross-validation was implemented at the patient level rather than at the individual window level, so that all windows from a single patient were assigned to the same fold. This ensures that the model is tested on completely unseen patients, providing a more reliable assessment of its ability to generalize. After identifying the best model and feature selection method at the window level, an evaluation at the patient level was conducted using soft voting (Supplementary Fig. S1), which aggregates the probabilistic predictions across each patient's windows.
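Patient-level splitting and soft voting can be sketched with scikit-learn's `GroupKFold`, using the patient identifier as the group. The data below are synthetic. Averaging the window probabilities and thresholding the average at 0.5 is one common form of soft voting and is assumed here, since the exact aggregation rule is given only in the supplementary figure.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GroupKFold

# Synthetic data: 20 patients, 6 windows each, 8 features per window.
rng = np.random.default_rng(0)
n_patients, windows_per_patient, n_features = 20, 6, 8
patient_labels = np.array([0, 1] * (n_patients // 2))
groups = np.repeat(np.arange(n_patients), windows_per_patient)
y = patient_labels[groups]                      # window-level labels
X = rng.standard_normal((len(y), n_features)) + 0.8 * y[:, None]

# Out-of-fold probability of class 1 for every window.
window_proba = np.zeros(len(y))
for train_idx, test_idx in GroupKFold(n_splits=5).split(X, y, groups):
    # No patient contributes windows to both sides of the split.
    assert not set(groups[train_idx]) & set(groups[test_idx])
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(X[train_idx], y[train_idx])
    window_proba[test_idx] = clf.predict_proba(X[test_idx])[:, 1]

# Soft voting: average the window probabilities of each patient,
# then threshold the average to obtain one prediction per patient.
patient_proba = np.array(
    [window_proba[groups == p].mean() for p in range(n_patients)]
)
patient_pred = (patient_proba >= 0.5).astype(int)
patient_accuracy = (patient_pred == patient_labels).mean()
print(f"patient-level accuracy: {patient_accuracy:.2f}")
```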
Feature interpretation and graph measurement
Because the selected channel pairs may vary across the five folds of cross-validation, the analysis focused on channel pairs that were chosen in at least three of the five folds. The average feature importance and Shapley additive explanation (SHAP) values 37 for the chosen edges were analyzed to understand their contributions to model predictions. Furthermore, the responsive and refractory groups were compared statistically. A two-tailed paired t-test was applied to each feature, both at the individual window level and at the patient level (average across windows), with a significance threshold of 0.05.
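The "at least three of five folds" rule amounts to counting how often each channel pair is selected. A minimal sketch follows; the helper name `stable_features` and the channel-pair lists are invented for illustration.

```python
from collections import Counter

def stable_features(selected_per_fold, min_folds=3):
    """Return the features selected in at least `min_folds` folds."""
    # set(fold) guards against a feature being listed twice in one fold.
    counts = Counter(f for fold in selected_per_fold for f in set(fold))
    return sorted(f for f, c in counts.items() if c >= min_folds)

# Hypothetical channel pairs selected in each of five folds.
folds = [
    ["F7-T7", "T7-P7", "Fp1-F3"],
    ["F7-T7", "T7-P7"],
    ["F7-T7", "C3-P3"],
    ["T7-P7", "C3-P3"],
    ["F7-T7", "Fp1-F3"],
]
stable = stable_features(folds)
print(stable)  # ['F7-T7', 'T7-P7']
```

Here F7-T7 appears in four folds and T7-P7 in three, so both are retained, whereas Fp1-F3 and C3-P3 appear in only two folds each and are dropped.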
Because coherence was a principal feature, the results were visualized as graphs. Each channel is represented as a node, and the coherence value is represented as an edge between the nodes. Graph visualization and analysis were performed using the NetworkX 38 and nilearn 39 Python libraries. To compare graph measurements, edges were connected in each window only if the coherence value was higher than 0.5. At the patient level, a single graph per patient was generated by averaging the coherence values across all windows and then connecting or disconnecting the edges based on the same threshold of 0.5. Because averaging is sensitive to outliers, especially with a limited number of windows, these analyses were restricted to patients with resting-state recordings longer than 10 min, ensuring a minimum of 10 windows.
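The thresholded graph construction described above can be sketched with NetworkX as follows. The function names, the three-channel toy matrices, and the choice of density and average clustering as example graph measures are assumptions for illustration only; the specific measures computed in the study are not stated in this section.

```python
import numpy as np
import networkx as nx

def coherence_graph(coh_matrix, channel_names, threshold=0.5):
    """Build an undirected graph with an edge where coherence > threshold."""
    g = nx.Graph()
    g.add_nodes_from(channel_names)
    n = len(channel_names)
    for i in range(n):
        for j in range(i + 1, n):
            if coh_matrix[i, j] > threshold:
                g.add_edge(channel_names[i], channel_names[j],
                           weight=float(coh_matrix[i, j]))
    return g

def patient_graph(window_matrices, channel_names, threshold=0.5):
    """Average coherence over windows, then threshold once per patient."""
    mean_matrix = np.mean(window_matrices, axis=0)
    return coherence_graph(mean_matrix, channel_names, threshold)

# Toy example: 3 channels (A, B, C) and 2 windows.
names = ["A", "B", "C"]
windows = np.array([
    [[1.0, 0.9, 0.2],
     [0.9, 1.0, 0.4],
     [0.2, 0.4, 1.0]],
    [[1.0, 0.7, 0.3],
     [0.7, 1.0, 0.7],
     [0.3, 0.7, 1.0]],
])

g = patient_graph(windows, names)
density = nx.density(g)
clustering = nx.average_clustering(g)
print(sorted(g.edges()), density, clustering)
```

The averaged coherences are 0.8 (A-B), 0.25 (A-C), and 0.55 (B-C), so only A-B and B-C become edges. Note that B-C is absent in the first window (0.4) but present after averaging, which illustrates how window-level and patient-level graphs can differ.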