This study was carried out in the Sleep Disorders Center, Electrophysiology Laboratory of Erzurum Regional Training and Research Hospital. The local ethics committee of the Atatürk University Medical School approved the study (approval number 06–29/28.05.2020). National Utility Model and Patent Project Number: 2023/12365. Date: 23.03.2023. The laboratory rooms were designed in accordance with the American Academy of Sleep Medicine (AASM) guidelines.
Study participants
The study was planned in accordance with the Declaration of Helsinki. EEG signals of 72 volunteers (51 males and 21 females; age 51.7 ± 3.42 years; body mass index 37.6 ± 4.21) diagnosed with sleep-disordered breathing by PSG were analyzed. The mean total sleep time of the volunteers during one night's sleep was 6 ± 2.37 hours.
Experimental recordings
All patients underwent full-night laboratory PSG using the Grass Technologies PSG system (TWin 4.5.3, USA). Polysomnography consists of recording different physiological and pathophysiological parameters during sleep for a period of 6 hours or longer throughout the night; the recordings are evaluated and reported by a medical doctor. The electrophysiological signals recorded during sleep and wakefulness throughout the night are the electroencephalogram (EEG), electromyogram (EMG; jaw, arm and leg), electrooculogram (EOG), electrocardiogram (ECG), snoring, oro-nasal airflow (L/s), chest and abdominal movements (respiratory effort recordings), oxygen saturation, body position, and real-time video recordings. With standard PSG, the sleep EEG channels F4-M1, C4-M1, O2-M1, F3-M2, C3-M2, and O1-M2 allow recordings from 6 different locations on the head [1, 3, 9, 33].
Experimental protocol
Sleep-disordered breathing was diagnosed according to the AASM criteria together with the physiology specialist (M.D.) responsible for the laboratory. At the same time, the electrophysiological properties, number, and density of sleep spindles in the EEG were examined and counted with the naked eye at least 3 times for each of the 6 channels (the classical method, 'EEG in PSG'). EEG waves with fusiform morphology (sleep spindles) at 11–16 Hz lasting approximately 0.5–3 s in each of the six EEG channels (F4-M1, C4-M1, O2-M1, F3-M2, C3-M2, and O1-M2) were analyzed by machine learning methods (SPINDILOMETER). The number and density of sleep spindles were then compared between the classical method ('EEG in PSG') and the novel model (SPINDILOMETER).
Experimental setting
The idea of using machine learning methods to analyze sleep spindles and reveal clues about basic and clinical events related to thalamocortical activity in the brain is the main inspiration for the development of SPINDILOMETER. SPINDILOMETER contains units that analyze the frequency and amplitude values of EEG signals in PSG. It stores these values and uses them to decide whether a sleep spindle is present (sigma frequency band, ∼11–16 Hz). Care was taken to use up-to-date machine learning methods in developing this model. Table 1 describes the algorithm for the novel model. First, missing data were replaced with the mean value and the data were normalized. Next, four different feature extraction algorithms were applied to these data: Power Spectral Density (Xa), Continuous Wavelet Transform (Xb), Non-gaussianity Score (Xc), and Bispectrum Score (Xd) features. Then, appropriate features were determined by the feature selection process. In the final stage, the number and characteristics of sleep spindles were determined using classification algorithms.
Table 1
Algorithm for SPINDILOMETER.
Algorithm SPINDILOMETER
1: EEG wave signals for every 10 seconds time slot in PSG
2: Start the collection of recordings from 6 EEG channels (Xn)
3: Missing data removal (\(\bar{X}\) ← Xn)
4: Normalize Xn
5: for all n
6:   Calculated Power Spectral Density Features (Xa)
7:   Calculated Continuous Wavelet Transform Features (Xb)
8:   Calculated Non-gaussianity Score Features (Xc)
9:   Calculated Bispectrum Score Features (Xd)
10: end for
11: Perform feature selection from the obtained features (XN ← Xa + Xb + Xc + Xd)
12: for all N
13:   Classify spindle from data
14:   If a spindle is found, increase the spindle number
15:   end if
16: end for
17: Stop the collection of recordings from 6 EEG channels
Data editing and analysis
A general technical approach to sleep spindle extraction methods in EEG signals:
Each channel signal was analyzed using computer-based electrophysiological signal analysis methods. The process consisted of pre-processing, feature extraction, feature selection, and classification. We applied signal analysis methods from computer science that are regarded as highly reliable to sleep medicine. For this reason, we used a wide spectrum of signal analysis methods (nine specific methods) for the analysis of EEG signals in PSG. These methods are explained below, together with a sample PSG recording from the study (Fig. 1).
Pre-processing of EEG signals for SPINDILOMETER
Six-channel EEG signals were taken from the electrophysiological recordings obtained from PSG and used as the data set. EEG signals were analyzed for each PSG epoch (30-s periods) during at least 6 hours of sleep for each volunteer. To examine the EEG waves in more detail, each epoch was divided into 10-second segments. The data set was split into two parts: 70% for training and 30% for testing. In the testing phase, the model estimated the sleep spindles that had been detected by the researcher (M.D.) who reported the PSG. The technical procedure in the data preprocessing phase was as follows: (1) Missing data were identified and filled with mean values. (2) Outlier data were identified and subjected to normalization. Min-max normalization, one of the most common methods for reducing differences in scale between data, was used: (a) the minimum of the amplitude and frequency values of the EEG waves was set to 0 and the maximum to 1, and (b) all values between the minimum and maximum were converted to decimal numbers between 0 and 1.
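As an illustration only, and not the authors' implementation, the preprocessing steps described above (mean imputation, min-max normalization, and a 70/30 split) could be sketched in Python as follows. The function names, the toy amplitude values, and the random seed are assumptions introduced for this example.

```python
import numpy as np

def impute_mean(x):
    """Replace missing samples (NaN) with the mean of the observed samples."""
    x = np.asarray(x, dtype=float)
    filled = x.copy()
    filled[np.isnan(x)] = np.nanmean(x)
    return filled

def min_max_normalize(x):
    """Rescale linearly so that the minimum maps to 0 and the maximum to 1."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    if span == 0:
        return np.zeros_like(x)  # constant segment: nothing to rescale
    return (x - x.min()) / span

def split_70_30(n_segments, seed=0):
    """Shuffle segment indices and return (70% training, 30% test)."""
    rng = np.random.default_rng(seed)  # seed is a hypothetical choice
    order = rng.permutation(n_segments)
    n_train = int(round(0.7 * n_segments))
    return order[:n_train], order[n_train:]

# Toy segment with one missing amplitude value (hypothetical numbers).
segment = np.array([12.0, np.nan, 20.0, 16.0])
clean = min_max_normalize(impute_mean(segment))
train_idx, test_idx = split_70_30(10)
```

In this sketch the split is applied to segment indices, so the same partition can be reused for every feature computed later.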
Feature methods used for SPINDILOMETER and their correlation with EEG wave signals
Power Spectral Density (PSD)
Frequency is a characteristic of EEG signals. PSD measures the power content of a signal as a function of frequency, that is, the energy density of the signal at different frequencies. In the time domain, it is difficult to find distinctive features of EEG signals, but in the frequency domain the PSD reveals similarities and differences because the peak values are known [39]. Since PSD is the energy of the signal per unit frequency, it is defined as the Fourier transform of the autocorrelation function A(ξ) of the EEG signal. The formula is as follows (ξ is the spatial shift, Ω is the wavenumber) [40]:
$$\Phi\left(\Omega\right)=\frac{1}{2\pi}\int_{-\infty}^{+\infty}A\left(\xi\right)e^{-i\Omega\xi}\,d\xi$$
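A minimal sketch of how a PSD-based feature might be computed for one 10-second segment is given below. It uses the periodogram (squared magnitude of the discrete Fourier transform), which by the Wiener-Khinchin theorem estimates the same quantity as the Fourier transform of the autocorrelation function. The sampling rate of 200 Hz, the function names, and the choice of "fraction of power in the 11–16 Hz band" as the feature are assumptions for illustration; the source does not state which PSD features were extracted.

```python
import numpy as np

def periodogram(x, fs):
    """One-sided periodogram estimate of the PSD (power per Hz)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                      # remove the DC offset
    n = x.size
    psd = np.abs(np.fft.rfft(x)) ** 2 / (fs * n)
    # Fold negative frequencies into the one-sided spectrum.
    if n % 2 == 0:
        psd[1:-1] *= 2.0                  # keep DC and Nyquist bins unscaled
    else:
        psd[1:] *= 2.0
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return freqs, psd

def band_power_fraction(x, fs, low=11.0, high=16.0):
    """Share of total power that falls inside [low, high] Hz."""
    freqs, psd = periodogram(x, fs)
    in_band = (freqs >= low) & (freqs <= high)
    return psd[in_band].sum() / psd.sum()

fs = 200.0                                # assumed sampling rate (Hz)
t = np.arange(2000) / fs                  # one 10-second segment
spindle_like = np.sin(2 * np.pi * 13.0 * t)   # synthetic 13 Hz oscillation
slow_wave = np.sin(2 * np.pi * 2.0 * t)       # synthetic 2 Hz oscillation
sigma_fraction_spindle = band_power_fraction(spindle_like, fs)
sigma_fraction_slow = band_power_fraction(slow_wave, fs)
```

For the synthetic 13 Hz signal nearly all power lies in the sigma band, whereas for the 2 Hz signal almost none does.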
Continuous Wavelet Transform (CWT)
EEG waves are chaotic bio-signals. CWT provides an over-complete representation of a signal by letting the translation and scale parameters of the wavelet vary continuously. It therefore generates a large number of wavelet coefficients (via convolution), which can be used as features. Scaling a wavelet means compressing or stretching it: high scales characterize low-frequency behavior, and low scales characterize high-frequency behavior. Shifting a wavelet means advancing or delaying it in time. Unlike the Fourier transform, the time-frequency window of the CWT is adjustable. The formula is as follows (b is the shift factor, α is the scale factor, t is the time, f(t) is the signal of interest, and φ is the wavelet) [41]:
$$CWT=\frac{1}{\sqrt{\alpha }}{\int }_{-\infty }^{+\infty }f\left(t\right)\phi \left(\frac{t-b}{\alpha }\right)dt$$
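The integral above can be approximated numerically by a Riemann sum over the sampled signal. The sketch below does this with a complex Morlet wavelet and returns the magnitude of the coefficients; the wavelet choice, the parameter w0 = 6, the sampling rate, and the synthetic burst are assumptions, since the source does not name the mother wavelet. For a real-valued signal the magnitude is the same whether or not the wavelet is complex-conjugated.

```python
import numpy as np

def morlet(u, w0=6.0):
    """Complex Morlet mother wavelet; w0 is its centre angular frequency."""
    return np.pi ** -0.25 * np.exp(1j * w0 * u) * np.exp(-0.5 * u ** 2)

def cwt_magnitude(signal, fs, freqs, w0=6.0):
    """|CWT(alpha, b)| with one shift b per sample and one scale per frequency."""
    signal = np.asarray(signal, dtype=float)
    t = np.arange(signal.size) / fs
    out = np.empty((len(freqs), signal.size))
    for i, f in enumerate(freqs):
        alpha = w0 / (2.0 * np.pi * f)    # scale whose centre frequency is f
        # Row k holds the wavelet shifted to b = t[k] and sampled at all t.
        kernel = morlet((t[None, :] - t[:, None]) / alpha, w0)
        # Riemann sum of the integral (dt = 1 / fs), scaled by 1 / sqrt(alpha).
        out[i] = np.abs(kernel @ signal) / fs / np.sqrt(alpha)
    return out

fs = 100.0                                # assumed sampling rate (Hz)
t = np.arange(200) / fs                   # 2-second synthetic segment
burst = np.exp(-0.5 * ((t - 1.0) / 0.15) ** 2) * np.sin(2 * np.pi * 13.0 * t)
freqs = [4.0, 13.0, 30.0]
coeffs = cwt_magnitude(burst, fs, freqs)
strongest_row = int(np.argmax(coeffs.max(axis=1)))
peak_time = t[int(np.argmax(coeffs[1]))]
```

The 13 Hz row carries the largest coefficients, and its peak falls near the centre of the burst at 1 s, which illustrates the joint time-frequency localization described in the text.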
Non-gaussianity Score (NGS)
This score was used to understand the distribution of EEG signal characteristics (amplitude, frequency). NGS indicates the non-Gaussianity of a given data segment. This method made it easy to measure the deviation of the EEG signals in each epoch of the PSG from the Gaussian model. The formula is as follows (p and q are the normal probability plots of the reference and analyzed data, respectively) [42]:
$$NGS=1-\left(\frac{\sum_{j=1}^{N}{\left(q_{j}-p\right)}^{2}}{\sum_{j=1}^{N}{\left(q_{j}-\bar{q}\right)}^{2}}\right)$$
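One possible reading of this formula is sketched below, under stated assumptions: q is taken as the sorted sample values of a segment, and p as the matching quantiles of a reference Gaussian with the same mean and standard deviation (one reference value per point). The plotting positions and the function name are also assumptions. Under this reading the expression has the form of a coefficient of determination for the normal probability plot, so values close to 1 indicate nearly Gaussian data; the source does not state how the score was oriented or rescaled.

```python
import numpy as np
from statistics import NormalDist

def ngs(segment):
    """Evaluate 1 - sum((q - p)^2) / sum((q - mean(q))^2) for one segment."""
    q = np.sort(np.asarray(segment, dtype=float))   # analyzed data quantiles
    n = q.size
    probs = (np.arange(1, n + 1) - 0.5) / n         # assumed plotting positions
    z = np.array([NormalDist().inv_cdf(pr) for pr in probs])
    p = q.mean() + q.std() * z                      # reference Gaussian quantiles
    return 1.0 - np.sum((q - p) ** 2) / np.sum((q - q.mean()) ** 2)

rng = np.random.default_rng(0)                      # synthetic data only
gaussian_segment = rng.normal(size=2000)
skewed_segment = rng.exponential(size=2000)
score_gaussian = ngs(gaussian_segment)
score_skewed = ngs(skewed_segment)
```

With these synthetic inputs the Gaussian segment scores close to 1 and the skewed segment scores lower.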
Bispectrum Score (BGS)
The bispectrum is the 3rd-order spectrum of the signal. Unlike the autocorrelation-based power spectrum (2nd-order statistics), the bispectrum preserves Fourier phase information. It can be estimated by estimating the 3rd-order cumulant and then taking a 2D Fourier transform. It allows in-depth analysis of the EEG signals in each epoch of the PSG. The formula for the Bispectrum Score (BGS) is as follows (P(ω) is the diagonal slice of the bispectrum, defined by ω1 = ω2 = ω, which captures the information available in the bispectrum) [43]:
$$BGS=\frac{\int_{\omega_{1}}^{\omega_{2}}P\left(\omega\right)\,d\omega}{\int_{\omega_{3}}^{\omega_{4}}P\left(\omega\right)\,d\omega}$$
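The following sketch shows one way such a band ratio could be computed. It swaps in the direct FFT-based estimator of the diagonal slice, P(ω) ≈ mean over segments of X(ω)·X(ω)·X*(2ω), instead of the cumulant route described in the text. The band limits, segment length, window, sampling rate, and test signal are all assumptions, because the source does not give the integration limits.

```python
import numpy as np

def diagonal_bispectrum(x, fs, seg_len):
    """Direct estimate of |B(w, w)| averaged over non-overlapping segments."""
    x = np.asarray(x, dtype=float)
    n_seg = x.size // seg_len
    window = np.hanning(seg_len)
    k = np.arange(seg_len // 4)           # keeps index 2*k inside the rfft output
    acc = np.zeros(k.size, dtype=complex)
    for s in range(n_seg):
        seg = x[s * seg_len:(s + 1) * seg_len]
        X = np.fft.rfft((seg - seg.mean()) * window)
        acc += X[k] * X[k] * np.conj(X[2 * k])
    return k * fs / seg_len, np.abs(acc / n_seg)

def bispectrum_score(x, fs, band_num, band_den, seg_len=256):
    """Ratio of the integrals of P over two frequency bands (in Hz)."""
    freqs, p = diagonal_bispectrum(x, fs, seg_len)

    def integrate(band):
        low, high = band
        mask = (freqs >= low) & (freqs <= high)
        return p[mask].sum() * (fs / seg_len)   # rectangle rule, bin width in Hz

    return integrate(band_num) / integrate(band_den)

fs = 128.0                                # assumed sampling rate (Hz)
t = np.arange(1280) / fs                  # 10-second synthetic segment
rng = np.random.default_rng(0)
# A 6 Hz wave with a phase-locked 12 Hz harmonic plus weak noise.
x = (np.sin(2 * np.pi * 6 * t) + 0.5 * np.sin(2 * np.pi * 12 * t)
     + 0.1 * rng.normal(size=t.size))
score = bispectrum_score(x, fs, band_num=(4.0, 8.0), band_den=(20.0, 24.0))
```

Because the 6 Hz component and its harmonic are phase-locked, the diagonal slice is large near 6 Hz and the ratio is well above 1 for this synthetic signal.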
Feature selection for SPINDILOMETER
This process aims to reduce the number of attributes by selecting, from all attributes in the dataset, the subset that gives the most efficient model. Attribute selection focuses on finding an optimal subset of attributes (determining which attributes are more important). In our study, filtering methods were used in the attribute selection phase. A filtering method scores the relevance of each attribute in the dataset to each class [44–46].
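A filter method can be sketched as a ranking of features by a score computed independently of any classifier. The example below uses the absolute Pearson correlation between each feature and the binary spindle label; this particular score, the function name, and the synthetic data are assumptions, since the source does not specify which filter was applied.

```python
import numpy as np

def filter_select(features, labels, n_keep):
    """Rank features by |correlation with the label| and keep the top n_keep."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels, dtype=float)
    fc = features - features.mean(axis=0)     # centre each feature column
    lc = labels - labels.mean()               # centre the labels
    denom = np.sqrt((fc ** 2).sum(axis=0) * (lc ** 2).sum())
    # Constant columns get an infinite denominator, i.e. a score of 0.
    scores = np.abs(fc.T @ lc) / np.where(denom == 0, np.inf, denom)
    ranked = np.argsort(scores)[::-1]         # best feature first
    return ranked[:n_keep], scores

rng = np.random.default_rng(0)                # synthetic data only
labels = rng.integers(0, 2, size=200)         # 1 = spindle, 0 = no spindle
informative_a = labels + rng.normal(0, 0.3, size=200)
pure_noise = rng.normal(0, 1.0, size=200)
informative_b = -2 * labels + rng.normal(0, 0.3, size=200)
features = np.column_stack([informative_a, pure_noise, informative_b])
kept, scores = filter_select(features, labels, n_keep=2)
```

Here the two informative columns (indices 0 and 2) are retained and the noise column is discarded.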
Classifiers used for SPINDILOMETER and their relation with EEG wave signals
Classification
Classification is performed after the feature detection phase to recognize an unknown pattern. Classification is the process of determining to which class an unknown pattern belongs with the help of a classifier that uses the features of that pattern as input. This study classified EEG signals using the most appropriate machine learning methods to identify sleep spindles among the EEG signals during PSG [47, 48].
K–Nearest Neighborhood (KNN)
KNN is one of the most commonly preferred machine learning and classification methods across different applications, and we chose it for its simplicity. KNN is suitable for both binary and multi-class classification. It is an instance-based learning algorithm: when a new sample is received, it is classified using the existing training data [49]. In instance-based learning algorithms, the learning process is based on the data kept in the training set. The most important advantage of this method is the successful classification of data points (EEG wave signals) belonging to multiple categories. Our machine learning approach was as follows. We first defined k. We then calculated the distance from each new sample to all training samples, selected the k samples with the smallest distances, and assigned the new sample to the class most common among these k neighbors. Using cross-validation, we divided the data set into two groups: (1) training data, used by the classifier to identify model parameters, and (2) test data, used to measure the performance of the trained classifiers. The data used for training and testing were rotated by the k-fold cross-validation method; 5 different correct classification ratios were obtained and their mean was calculated.
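The KNN procedure and the five-fold cross-validation described above can be sketched as follows. This is a minimal illustration on synthetic two-dimensional data; k = 3, the Euclidean distance, the random seeds, and the function names are assumptions rather than details reported in the study.

```python
import numpy as np

def knn_predict(train_x, train_y, query_x, k=3):
    """Assign each query point the majority class of its k nearest neighbors."""
    predictions = []
    for q in query_x:
        dist = np.linalg.norm(train_x - q, axis=1)   # Euclidean distances
        nearest = np.argsort(dist)[:k]               # indices of k closest samples
        votes = np.bincount(train_y[nearest])        # count labels among neighbors
        predictions.append(int(np.argmax(votes)))
    return np.array(predictions)

def k_fold_accuracy(x, y, n_folds=5, k=3, seed=0):
    """Return the mean accuracy and the per-fold accuracies."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(x)), n_folds)
    fold_scores = []
    for i in range(n_folds):
        test = folds[i]                              # held-out fold
        train = np.concatenate([folds[j] for j in range(n_folds) if j != i])
        pred = knn_predict(x[train], y[train], x[test], k)
        fold_scores.append(float(np.mean(pred == y[test])))
    return float(np.mean(fold_scores)), fold_scores

rng = np.random.default_rng(1)                       # synthetic data only
no_spindle = rng.normal(0.0, 1.0, size=(50, 2))
spindle = rng.normal(5.0, 1.0, size=(50, 2))
x = np.vstack([no_spindle, spindle])
y = np.array([0] * 50 + [1] * 50)
mean_acc, fold_scores = k_fold_accuracy(x, y)
```

Each sample is used for testing exactly once, and the five fold accuracies are averaged, which mirrors the "5 different correct classification ratios" mentioned in the text.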
Support Vector Machine (SVM)
The support vector machine is a highly effective, simple machine-learning method for classification problems in data sets where the patterns between variables are unknown. It minimizes the classification error by selecting the separating line with the largest margin (necessary for discriminating sleep spindle wave signals). In other words, the method tries to find a plane or hyperplane that optimally separates the classes. It is a binary classifier with high generalization ability. SVM therefore facilitated the rapid and reliable extraction of the characteristic wave signals of the sleep spindle for the SPINDILOMETER model [50, 51].
Decision Trees (DT)
The decision tree is a supervised machine learning algorithm. It is a simple classification method that separates data according to their characteristics, and it is called a tree because of its branching structure. DT classification uses three terms: nodes, branches, and leaf nodes. (1) Nodes represent a test on a feature used for the problem, (2) branches represent the outcomes of that test, and (3) leaf nodes represent the class labels [52]. The purpose of using the decision tree algorithm in this study is to learn decision rules extracted from the features of EEG wave signals and then develop a model that can predict the value of the target variable (sleep spindles).
Naive Bayes
The Naive Bayes algorithm helped us classify the features of EEG wave signals, as it is easy to apply and understand in large data sets. Conditional probability is the basic mechanism of this method. It assumes that the presence of an attribute in a given class is unrelated to the presence of any other attribute. The algorithm, which is based on Bayes' theorem, uses the available, already classified data [53] and thus allows us to calculate the probability that a new data point belongs to each existing class. This method provided important information on the probability of detecting the specific sleep spindle wave signals that we sought among the EEG wave signals from 6 channels, which form the dataset of our study.
Extra Tree Classifier (ETC)
This method, introduced by Geurts et al. in 2006, is an ensemble machine-learning algorithm that combines predictions from many decision trees. The ETC ensemble consists of unpruned decision or regression trees built according to the classical top-down procedure. It differs from other tree-based ensemble methods in two main ways: (1) it splits nodes by choosing cut-points completely at random, and (2) it uses all learning samples to grow the trees [54]. With the ETC ensemble implementation, the entire sample of EEG wave signals was used and classified.
Confusion matrix tables comparing sleep spindles numbers calculated by both methods
To verify that our algorithm correctly identifies the electrophysiological wave signal of each sleep spindle, we re-evaluated the EEG waves of all volunteers' PSG recordings by dividing them into 10-second time slots and creating confusion matrices for all patients. In machine learning and statistical classification problems, the confusion matrix is used to visualize the performance of an algorithm in tabular form. In this study, two classes were created: the first class represents the 10-second time slots in which sleep spindles are identified, and the second class represents the 10-second time slots in which sleep spindles are not identified. Table 2 shows the typical tabular layout of the matrix. Table 3 gives the definition of each numerical value in each box of the confusion matrix comparing the number of sleep spindles calculated by both methods ('EEG in PSG' and SPINDILOMETER).
Table 2
Layout of the confusion matrix: actual values determined by ‘EEG in PSG’ versus predicted values estimated by SPINDILOMETER.
Actual (‘EEG in PSG’) | Predicted positive (SPINDILOMETER) | Predicted negative (SPINDILOMETER) |
Positive | True Positive (TP) | False Negative (FN) |
Negative | False Positive (FP) | True Negative (TN) |
Table 3
Definitions for numerical values in each box of the confusion matrix table comparing sleep spindles calculated by the two methods (‘EEG in PSG’ (actual) and SPINDILOMETER (predicted)) in time frames of 10 seconds.
TP | Number of 10-second time slots in which a sleep spindle was recorded by ‘EEG in PSG’ and also identified by SPINDILOMETER |
FP | Number of 10-second time slots in which a sleep spindle was identified by SPINDILOMETER but not recorded by ‘EEG in PSG’ |
TN | Number of 10-second time slots in which a sleep spindle was neither recorded by ‘EEG in PSG’ nor identified by SPINDILOMETER |
FN | Number of 10-second time slots in which a sleep spindle was recorded by ‘EEG in PSG’ but not identified by SPINDILOMETER |
TP: True Positive FP: False Positive TN: True Negative FN: False Negative
We assessed the success of the SPINDILOMETER algorithm in identifying sleep spindles using performance measures calculated from the results for each volunteer. The performance measures applied to SPINDILOMETER are the following; their definitions and formulas are listed in Table 4. (1) Accuracy is the most intuitive performance measure: the ratio of correctly predicted observations to the total observations. (2) Precision is the ratio of correctly predicted positive observations to the total predicted positive observations. (3) Sensitivity (recall) is the ratio of correctly predicted positive observations to all observations in the actual positive class. (4) The F1 score is the harmonic mean of precision and sensitivity; it therefore takes both false positives and false negatives into account. It is less intuitive than accuracy, but it is usually more informative, especially when the class distribution is uneven.
Table 4
Summarized formulas and definitions for performance indicators.
Formula | Definition |
\(\text{Accuracy}=\frac{\text{TP}+\text{TN}}{\text{TP}+\text{TN}+\text{FP}+\text{FN}}\) | Accuracy: the success of the newly developed model in correctly predicting all the classes defined by the classic method (%). |
\(\text{Sensitivity (Recall)}=\frac{\text{TP}}{\text{TP}+\text{FN}}\) | Sensitivity (Recall): the proportion of the positive classes defined by the classic method that are correctly predicted by the newly developed model (%). |
\(\text{Precision}=\frac{\text{TP}}{\text{TP}+\text{FP}}\) | Precision: the proportion of the positive predictions of the newly developed model that are also positive according to the classic method (%). |
\(\text{Specificity}=\frac{\text{TN}}{\text{TN}+\text{FP}}\) | Specificity: the proportion of the negative classes defined by the classic method that are correctly predicted by the newly developed model (%). |
\(\text{F1 Score}=2\times\frac{\text{Precision}\times\text{Recall}}{\text{Precision}+\text{Recall}}\) | F1 Score: the harmonic mean of precision and sensitivity; it therefore takes into account both the false positives and the false negatives. |
TP: True Positive, FP: False Positive, TN: True Negative, FN: False Negative.
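The formulas in Table 4 translate directly into code. The short sketch below evaluates them for one confusion matrix; the function name and the four counts are hypothetical values chosen for illustration, not results from this study.

```python
def performance_measures(tp, fp, tn, fn):
    """Compute the Table 4 indicators from the four confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)               # sensitivity
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": recall,
        "precision": precision,
        "specificity": tn / (tn + fp),
        "f1": 2 * precision * recall / (precision + recall),
    }

# Hypothetical counts of 10-second time slots for one recording.
measures = performance_measures(tp=80, fp=10, tn=200, fn=20)
```

With these counts the accuracy is 280/310, the sensitivity is 0.8, the precision is 8/9, the specificity is 200/210, and the F1 score is 16/19.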
Comparison of ‘EEG in PSG’ and SPINDILOMETER methods with ROC analysis
ROC analysis is a statistical method based on ROC curves that (1) evaluates the proficiency of diagnostic tests and prediction models, (2) identifies the discriminative power of tests, (3) compares the diagnostic performance of two or more diagnostic or laboratory tests, (4) allows the quality of laboratory tests to be monitored, and (5) monitors the development of test performers and compares the efficiency of different performers. The graph of the sensitivity of a diagnostic test (true positive rate, TPR) against its false positive rate (FPR, 1-specificity) is called a ROC curve. Each point on the curve consists of a sensitivity (TPR) and a 1-specificity (FPR) value calculated at a different cut-off point. These points are then joined to form the curve. By examining the slope of the curve, sensitivity and specificity values at different cut-off levels are compared, so that correct and incorrect decisions are calculated for each level and the cut-off point with the best balance of sensitivity and specificity is identified. ROC analysis is especially helpful in demonstrating the reliability and performance of diagnostic tests used in medicine; it is essentially a probability curve. The area under the curve (AUC) represents the degree of separation between the classes: the larger the area under the curve, the better the classes are discriminated from each other.
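The construction of a ROC curve from cut-off points, and the trapezoidal computation of the AUC, can be illustrated with the minimal sketch below. The function names, labels, and classifier scores are hypothetical and serve only as a worked example.

```python
import numpy as np

def roc_curve_points(labels, scores):
    """Return (FPR, TPR) pairs, one per cut-off, starting from (0, 0)."""
    labels = np.asarray(labels, dtype=int)
    scores = np.asarray(scores, dtype=float)
    n_pos = labels.sum()
    n_neg = labels.size - n_pos
    fpr, tpr = [0.0], [0.0]
    # Sweep the cut-off from the highest score to the lowest.
    for cutoff in np.sort(np.unique(scores))[::-1]:
        predicted_pos = scores >= cutoff
        tpr.append(np.sum(predicted_pos & (labels == 1)) / n_pos)
        fpr.append(np.sum(predicted_pos & (labels == 0)) / n_neg)
    return np.array(fpr), np.array(tpr)

def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

# Hypothetical labels (1 = spindle present) and classifier scores.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
fpr, tpr = roc_curve_points(labels, scores)
area = auc(fpr, tpr)
```

For this toy input the curve passes through (0, 0), (0, 0.5), (0.5, 0.5), (0.5, 1), and (1, 1), which gives an AUC of 0.75.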
Statistical analysis
The number and density of sleep spindles determined by the physician who examined the real-time PSG EEG recordings with the naked eye (classical method, 'EEG in PSG') and by SPINDILOMETER were compared using SPSS 22 for Windows. The intraclass correlation coefficient test was used to analyze the results. In this comparison, p < 0.001 was considered statistically significant. The accuracy of our algorithm was analyzed with confusion matrix tables. Finally, the two methods ('EEG in PSG' and SPINDILOMETER) were compared by ROC analysis to verify that our algorithm can successfully identify the sleep spindle count in the shortest time interval (10 s).