7.wavelet Transform and Deep Learning-Based Obstruct
Research Article
Keywords: Sleep apnea, ECG, Wavelet transform, Data balancing, Deep Learning
DOI: https://doi.org/10.21203/rs.3.rs-3028322/v1
License: This work is licensed under a Creative Commons Attribution 4.0 International License.
Wavelet transform and deep learning-based obstructive sleep apnea detection from single-lead ECG signals
Yuxing Lin1, Hongyi Zhang1*, Wanqing Wu2, Xingen Gao1*, Fei Chao2, Juqiang Lin1

1* School of Opto-Electronic and Communication Engineering, Xiamen University of Technology, Xiamen, 361024, China.
2 School of Biomedical Engineering, Sun Yat-Sen University, Guangzhou, 510006, China.
3 School of Informatics, Xiamen University, Xiamen, 36100, China.
Abstract
Sleep apnea is a common sleep disorder. To address the characteristics of ECG signals, we introduce a coordinate attention mechanism and propose an automatic sleep apnea classification model (CA-EfficientNet) based on the wavelet transform and a lightweight neural network. One-dimensional signals are converted into two-dimensional images by the wavelet transform and input into the proposed model for classification. The effects of the input time window, the wavelet transform type, and data balancing on classification performance are examined. The PhysioNet Apnea-ECG database was used for training and evaluation. With a Frequency B-Spline wavelet transform of 3-minute ECG segments and a model trained with Dice loss, the classification accuracy was 93.44%, precision 93.2%, sensitivity 88.9%, specificity 96.2%, and F1 score 91%; most of these metrics are better than those reported in related work. The wavelet transform combined with the CA-EfficientNet model provides a feasible method for the automatic classification of sleep apnea.
1 Introduction
Sleep apnea is a common sleep breathing disorder characterized by repeated pauses and restarts in breathing during sleep. It includes three types: Obstructive Sleep Apnea (OSA), Central Sleep Apnea (CSA), and combined sleep apnea [1, 2]. Sleep apnea can negatively impact a person’s health and cause a variety of symptoms, such as excessive daytime sleepiness, snoring, and concentration difficulties. It has also been associated with an increased risk of certain health conditions, including high blood pressure, cardiac disease, and stroke. Thus, accurate sleep apnea detection is crucial to diagnosis and therapy.
Polysomnography (PSG) is the “gold standard” for diagnosing sleep disorders such as sleep apnea, narcolepsy, and restless leg syndrome, and is applicable to individuals of all ages. PSG involves attaching multiple sensors to a patient’s body during sleep and recording the electroencephalogram (EEG) signals, the blood oxygen level, and the electrocardiogram (ECG) signals. It is a relatively costly and time-consuming procedure that requires multiple specialized pieces of equipment and trained personnel to set up and monitor the patient overnight. In addition, patients may find it difficult or uncomfortable to sleep with sensors attached to their bodies, which may impact the reliability of the results. Previous studies have shown that ECG signals directly reflect the respiratory and circulatory systems of humans and have a strong correlation with sleep apnea. Moridani et al. [3] showed that sleep apnea can lead to irregular cardiac activity. Guilleminault et al. [4] demonstrated that sleep apnea syndrome is associated with alterations in the R-R interval of the ECG signal and hypoxia. ECG is a non-invasive, fast, and painless test that takes only a few minutes to perform. Moreover, ECG is relatively inexpensive compared to other diagnostic tests. ECG signals have been used in numerous works to classify sleep apnea, and these works are primarily divided into traditional machine learning methods and deep learning methods.
Some works on automatic sleep apnea detection relied primarily on traditional machine learning, which employed a predefined set of features to train different classifiers. Bsoul et al. [5] developed a real-time method for detecting apnea and hypoventilation episodes using 18 features extracted from R-R intervals in the time and frequency domains. Hassan et al. [6] used statistical analysis for selecting features and bootstrap aggregating for classification. Song et al. [7] extracted time and frequency features from ECG and respiratory signals and designed a classifier that combines a support vector machine (SVM) with a hidden Markov model that takes into account the time dependence of apnea. Babaeizadeh et al. [8] performed heart rate variability (HRV) analysis to detect heart rate change features and incorporated them into classification algorithms for apnea. Rizal et al. [9] also extracted HRV features from ECG signals and used an SVM for OSA classification. Zarei and Asl [10] proposed a framework for the automatic detection of OSA in which alphabet entropy, fuzzy entropy, approximate entropy, and sample entropy are calculated from ECG-derived respiration (EDR) and HRV signals. Tripathy [11] extracted features from the intrinsic band functions of EDR and HRV signals and used a kernel extreme learning machine for sleep apnea classification. Fatimah et al. [12] used Fourier decomposition to transform ECG signals into frequency bands and calculated their features, including mean absolute deviation and entropy, to classify ECG segments using an SVM classifier with a Gaussian kernel. Faal et al. [13] extracted ARIMA-EGARCH coefficients and used them as a feature vector to classify apneic and normal ECG segments. Using only eight features, the ARIMA-EGARCH parameter-based method attained performance comparable to other methods. The aforementioned works rely on feature engineering, which requires a significant amount of trial and error to find the set of features that improves the performance of a model. In addition, if the features are not carefully selected, the model can overfit, performing well on the training data but poorly on the test data.
In recent years, deep learning methods have found wide use in sleep apnea detection. Tyagi et al. [14] proposed cascading two different types of Restricted Boltzmann Machine (RBM) in a Deep Belief Network (DBN) for sleep apnea classification. Note that this method needs to extract HRV and EDR signals from a 1-minute segmented ECG signal. Zarei et al. [15] developed an automatic feature extraction method by combining a Convolutional Neural Network (CNN) with Long Short-Term Memory (LSTM) recurrent networks; fully connected layers are then used to distinguish apnea events from normal segments. Almutairi et al. [16] also developed a hybrid model composed of CNN and LSTM networks, and used the PhysioNet Apnea-ECG database for training and evaluation. Jiang et al. [17] used Mel-spectrograms with a CNN-LSTM-DNN architecture to extract deep spectral features for snoring detection, thus reflecting the severity of obstructive sleep apnea hypopnea syndrome in patients. Singh et al. [18] generated two-dimensional images from single-lead ECG signals using the continuous wavelet transform (CWT) and classified them with a CNN, achieving good accuracy and sensitivity in per-minute segment OSA classification and demonstrating that extracting time-frequency features can improve sleep apnea detection performance. Mashrur et al. [19] proposed a scalogram-based convolutional neural network for apnea detection, converting single-lead ECG signals to hybrid scalograms through CWT and empirical mode decomposition. Niroshana et al. [20] utilized CWT and the short-time Fourier transform (STFT) to fuse images and detect OSA. The fused images provided more information on apnea events when fed to a deep CNN model. Ahmad Ayatollahi et al. [21] filtered 2-second ECG segments and utilized the recurrence plot (RP) algorithm to transform them into 2D images with high precision. Liu et al. [22] combined the current and adjacent 1-min ECG segments to form a 3-min contextual input segment and incorporated Transformer structures into a CNN to improve the model’s performance.
In addition, the number of instances in each category (apnea or normal) is not equal in sleep apnea datasets. Imbalanced data may degrade performance: a classifier trained on an imbalanced dataset may predict that every instance belongs to the majority class, leading to poor performance on the minority class. To date, however, data imbalance in sleep apnea detection has been considered by just a few researchers. Shen et al. [23] combined a weighted cross-entropy loss function with a hidden Markov model to alleviate the problem of data imbalance and improved the accuracy of the classifier. Sharan et al. [24] used a single-lead ECG signal and a one-dimensional residual neural network combined with a weighted cross-entropy loss for end-to-end sleep apnea detection.
(a) System Block Diagram
In conclusion, despite the advances in sleep apnea detection, several issues remain unresolved:
1. Previous research has demonstrated that CWT is useful for classifying sleep apnea; however, the optimal mother wavelet for this task has not yet been determined.
2. Previous work has trained models with data of varying time durations; however, the relationship between data length and classification accuracy has not been studied.
3. The majority of sleep apnea studies ignore data imbalance and employ deep learning models directly; however, the optimal data balancing approach for this task has not been determined.
To resolve the aforementioned problems, we take into account the ECG data length, wavelet transform type, data balance, and classification model. The main contributions of this work are summarized as follows:
1. Different mother wavelets were used to generate time-frequency maps of different lengths, from which the most suitable mother wavelet and data length for the sleep apnea classification task were identified.
2. A coordinate attention (CA) mechanism is introduced to capture temporal information and improve the accuracy of EfficientNet.
3. A cost-sensitive algorithm is introduced to address the data imbalance problem, and the effects of different loss functions on the generalization performance of the model are compared.
2 Methodology
2.1 System overview
In this work, we developed an automatic sleep apnea classification model based on ECG signals, using the wavelet transform and the CA-EfficientNet model. The schematic of the procedure used to detect sleep apnea is shown in Figure 1. In the preprocessing stage, the raw ECG signal is segmented and denoised, and then transformed into 1-minute, 2-minute, and 3-minute time-frequency images by wavelet transform. CA-EfficientNet is trained to extract features automatically from the time-frequency maps and uses a coordinate attention mechanism to capture the temporal correlation of the sleep data. In addition, a cost-sensitive algorithm (a data balancing loss function) is introduced, which assigns weights based on the cost associated with each error and is used to calculate the overall loss of the model. The goal of the cost-sensitive loss function is to optimize the performance of the model by minimizing the total cost of misclassification, rather than simply minimizing the number of errors. These components are explored further in the following sections.
property of multi-resolution analysis, which decomposes the signal into sub-signals of different frequencies to make the local characteristics of the signal more obvious, and it can be adapted to different types of signals by selecting different wavelet basis functions (mother wavelets), which is particularly useful for extracting characteristic information from non-stationary ECG signals. This paper therefore compares the effects of various wavelet bases on the time-frequency analysis of ECG signals. This work utilizes the following wavelet basis functions.
(1) Morlet wavelet (Morl) is a single-frequency sinusoidal function under a Gaussian envelope, which can effectively avoid phase distortion in image processing [25]. The mathematical expression is as follows:
\[ \Psi(t) = \pi^{-1/4}\left(e^{i\omega_0 t} - e^{-\omega_0^2/2}\right)e^{-t^2/2} \tag{1} \]
where $\omega_0$ is a positive real number that controls the center frequency of the wavelet.
(2) Complex Morlet wavelet (Cmor) is a complex continuous wavelet, which extends the Morlet wavelet and has good resolution in both the time and frequency domains [26]. Its mathematical form is:
\[ \Psi(t) = \pi^{-1/4}\left(e^{i\omega_0 t} - e^{-\omega_0^2/2}\right)e^{-t^2/2} \tag{2} \]
\[ \psi(t) = \frac{2}{\sqrt{3a}\,\pi^{1/4}}\left(1 - \frac{t^2}{a^2}\right)e^{-t^2/(2a^2)} \tag{3} \]
where $a$ is a positive real number that controls the width of the wavelet.
(4) Shannon wavelet (Shan) is a complete orthogonal wavelet [28]. Since the Shannon wavelet is well-localized in both the time and frequency domains, it is ideally suited for analysing signals with both temporal and spectral characteristics. Its mathematical form is:
\[ \psi[n] = \begin{cases} \sqrt{2}, & n = 0 \\ \sqrt{2}\,\sin\dfrac{\pi n}{2}, & n \neq 0 \end{cases} \tag{4} \]
(5) Frequency B-Spline wavelet (Fbsp) is a wavelet based on the B-spline function [29], which has good smoothness and phase properties and handles signals contaminated by noise such as Gaussian noise very well. Its mathematical form is:
\[ \Psi(w) = \frac{1}{\sqrt{2}}\left(\frac{\sin\left(\frac{\pi w}{k}\right)}{\frac{\pi w}{k}}\right)^{k}, \tag{5} \]
where $k$ is a positive integer that controls the frequency resolution of the wavelet.
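As a small numerical illustration of Eq. (5), the sketch below evaluates the Fbsp frequency response for a few values of k. It reads Eq. (5) as a sinc function raised to the power k and scaled by 1/sqrt(2); the function name `fbsp_response` is hypothetical and is not taken from the paper or from any wavelet library.

```python
import math

def fbsp_response(w, k):
    """Evaluate Eq. (5): (1/sqrt(2)) * (sin(pi*w/k) / (pi*w/k))**k."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    x = math.pi * w / k
    # The sinc term tends to 1 as its argument tends to 0.
    sinc = 1.0 if x == 0.0 else math.sin(x) / x
    return sinc ** k / math.sqrt(2.0)

# Response at w = 1 for increasing k: the first zero of the response sits
# at w = k, so the main lobe widens as k grows.
responses = {k: fbsp_response(1.0, k) for k in (1, 2, 3)}
```

Under this reading, a larger k gives a wider, smoother main lobe in frequency, which is one way to see how k trades off frequency resolution.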
During data preprocessing, the ECG signal is converted to a 2D image using wavelet transform and fed into a deep convolutional neural network for classification. In the next section, the model used in this study is described.
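The conversion from a 1D signal to a 2D time-frequency image can be sketched with a direct, unoptimized continuous wavelet transform. The example below is a minimal illustration only: it uses the standard complete Morlet wavelet rather than the Fbsp wavelet selected in this work, a toy 5 Hz sine instead of an ECG segment, and hypothetical names (`morlet`, `cwt_magnitude`). A practical pipeline would normally use an optimized wavelet library.

```python
import cmath
import math

def morlet(t, omega0=6.0):
    """Complete Morlet wavelet: Gaussian envelope times a corrected complex sinusoid."""
    envelope = math.exp(-t * t / 2.0)
    correction = math.exp(-omega0 * omega0 / 2.0)
    return math.pi ** -0.25 * (cmath.exp(1j * omega0 * t) - correction) * envelope

def cwt_magnitude(signal, fs, freqs, omega0=6.0):
    """Return |CWT| as a list of rows (one per analysis frequency, one column per sample)."""
    n = len(signal)
    rows = []
    for f in freqs:
        scale = omega0 / (2.0 * math.pi * f)   # scale (s) whose centre frequency is f
        half = int(4.0 * scale * fs)           # truncate the Gaussian at +/- 4 scales
        kernel = [morlet(k / fs / scale, omega0).conjugate() / math.sqrt(scale)
                  for k in range(-half, half + 1)]
        row = []
        for b in range(n):
            acc = 0j
            for idx, w in enumerate(kernel):
                j = b + idx - half             # sample index covered by this kernel tap
                if 0 <= j < n:
                    acc += signal[j] * w
            row.append(abs(acc) / fs)          # divide by fs to approximate the integral
        rows.append(row)
    return rows

fs = 100.0                                     # sampling rate in Hz
signal = [math.sin(2.0 * math.pi * 5.0 * i / fs) for i in range(400)]
freqs = [2.0, 5.0, 10.0]
image = cwt_magnitude(signal, fs, freqs)       # 3 x 400 time-frequency "image"
peak_row = max(range(len(freqs)), key=lambda r: image[r][200])
```

Each row of `image` is the magnitude response at one analysis frequency; for the toy sine, the row at 5 Hz carries the most energy in the middle of the signal.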
Fig. 2: CA-EfficientNet.
Fig. 3: Coordinate attention structure diagram.
2.3.1 EfficientNet
EfficientNet [30] is a family of convolutional neural networks that achieves high accuracy in image classification tasks while being more computationally efficient than comparable models. To optimize accuracy and efficiency, the networks are based on a compound scaling method that balances network depth, width, and resolution through a compound coefficient $\phi$:
\[ \text{depth: } d = \alpha^{\phi}, \qquad \text{width: } w = \beta^{\phi}, \qquad \text{resolution: } r = \gamma^{\phi} \]
\[ \text{s.t. } \alpha \cdot \beta^{2} \cdot \gamma^{2} \approx 2, \qquad \alpha \geqslant 1,\ \beta \geqslant 1,\ \gamma \geqslant 1 \tag{6} \]
where $\alpha$, $\beta$ and $\gamma$ are constants that can be determined by a small grid search. $\phi$ is a compound coefficient that controls how many resources are available for model scaling, while $\alpha$, $\beta$ and $\gamma$ specify how these additional resources are allocated to network depth, width, and resolution, respectively.
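A short worked example may make the compound scaling rule of Eq. (6) concrete. The sketch below uses the constants reported in the original EfficientNet paper (alpha = 1.2, beta = 1.1, gamma = 1.15); these values are not stated in this work and are included only for illustration, and the function name `compound_scaling` is hypothetical.

```python
def compound_scaling(phi, alpha=1.2, beta=1.1, gamma=1.15):
    """Return the depth, width, and resolution multipliers of Eq. (6) for a given phi."""
    # Cost of one scaling step; the constraint in Eq. (6) keeps this close to 2.
    budget = alpha * beta ** 2 * gamma ** 2
    return {
        "depth": alpha ** phi,
        "width": beta ** phi,
        "resolution": gamma ** phi,
        "cost_factor": budget ** phi,
    }

baseline = compound_scaling(0)   # phi = 0 leaves the base network unchanged
scaled = compound_scaling(2)     # phi = 2 roughly quadruples the compute budget
```

With phi = 0 all multipliers equal 1, which corresponds to the unscaled base network (EfficientNet-B0 in this work).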
EfficientNet is built by stacking Mobile Inverted Bottleneck Convolution (MBConv) blocks. In order to achieve a good classification effect, EfficientNet-B0 is chosen as the base network in this paper.
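are given by
\[
\begin{aligned}
z_c^h(h) &= \frac{1}{W}\sum_{0 \le i < W} x_c(h, i) \\
z_c^w(w) &= \frac{1}{H}\sum_{0 \le j < H} x_c(j, w) \\
f &= \delta\left(F_1\left(\left[z^h, z^w\right]\right)\right) \\
g^h &= \sigma\left(F_h\left(f^h\right)\right) \\
g^w &= \sigma\left(F_w\left(f^w\right)\right) \\
y_c(i, j) &= x_c(i, j) \times g_c^h(i) \times g_c^w(j)
\end{aligned}
\tag{7}
\]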
The coordinate attention first encodes the input feature map of size $C \times H \times W$ using global average pooling (GAP) with a pooling kernel of size $(H, 1)$ or $(1, W)$ along the horizontal and vertical coordinates, respectively. This decomposes the channel attention into two one-dimensional feature encodings and generates feature maps of size $C \times H \times 1$ and $C \times 1 \times W$, respectively. The location information is thus embedded in the channel attention. The feature maps $z_c^h(h)$ and $z_c^w(w)$ are concatenated and passed through the $F_1$ operation (a $1 \times 1$ convolution for dimensionality reduction), followed by a nonlinear activation, to generate a feature map $f$ of size $C/r \times 1 \times (W + H)$. Along the spatial dimension, $f$ is then split into the feature maps $f^w$ of size $C/r \times 1 \times W$ and $f^h$ of size $C/r \times 1 \times H$. Both $f^w$ and $f^h$ are expanded back to the channel count of the input feature map by $1 \times 1$ convolutions. The sigmoid activation function then yields the attention weights in the two directions. Finally, the output $y_c(i, j)$ is obtained.
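The steps of Eq. (7) can be traced on a tiny feature map with plain Python lists. This is a minimal sketch under several assumptions: ReLU stands in for the nonlinearity delta, batch normalization is omitted, the 1 x 1 convolution weights (`w1`, `wh`, `ww`) are arbitrary hypothetical values rather than trained parameters, and the function names are illustrative only.

```python
import math

def conv1x1(weights, feats):
    """1x1 convolution: mix channels independently at every position.

    weights has shape [out_channels][in_channels]; feats has shape [in_channels][positions].
    """
    positions = len(feats[0])
    return [[sum(w_oc * feats[c][p] for c, w_oc in enumerate(row))
             for p in range(positions)] for row in weights]

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def coordinate_attention(x, w1, wh, ww):
    """Apply Eq. (7) to a feature map x of shape [C][H][W]."""
    C, H, W = len(x), len(x[0]), len(x[0][0])
    # Directional average pooling: z_h averages over width, z_w over height.
    z_h = [[sum(x[c][h]) / W for h in range(H)] for c in range(C)]
    z_w = [[sum(x[c][h][w] for h in range(H)) / H for w in range(W)] for c in range(C)]
    # Concatenate along the spatial axis, reduce channels with F1, apply ReLU.
    concat = [z_h[c] + z_w[c] for c in range(C)]
    f = [[max(0.0, v) for v in row] for row in conv1x1(w1, concat)]
    # Split back into the two directions and restore the channel count.
    f_h = [row[:H] for row in f]
    f_w = [row[H:] for row in f]
    g_h = [[sigmoid(v) for v in row] for row in conv1x1(wh, f_h)]
    g_w = [[sigmoid(v) for v in row] for row in conv1x1(ww, f_w)]
    # Reweight every position by its row gate and its column gate.
    return [[[x[c][i][j] * g_h[c][i] * g_w[c][j] for j in range(W)]
             for i in range(H)] for c in range(C)]

x = [[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
     [[0.5, 0.5, 0.5], [1.0, 1.0, 1.0]]]   # C = 2, H = 2, W = 3
w1 = [[0.5, 0.5]]                          # F1: 2 channels -> 1 (reduction ratio r = 2)
wh = [[1.0], [-1.0]]                       # F_h: 1 channel -> 2
ww = [[0.2], [0.3]]                        # F_w: 1 channel -> 2
y = coordinate_attention(x, w1, wh, ww)
```

Because both gates lie strictly between 0 and 1, each output value is a damped copy of the corresponding input value, with the damping determined jointly by its row and its column.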
where $t_i$ is the truth label and $p_i$ is the softmax probability for the $i$th class.
(2) Label smoothing cross entropy [32] is an extension of the standard cross-entropy loss that tries to enhance the generalization performance of the model by reducing overfitting and sensitivity to label noise. It is defined as:
\[
\begin{aligned}
q'(k) &= (1 - \epsilon)\,q(k) + \frac{\epsilon}{K} \\
L_{LSCE} &= -\sum_{k=1}^{K} \log p(k)\, q'(k)
\end{aligned}
\tag{9}
\]
where $\epsilon$ is a weight factor, $\epsilon \in [0, 1]$, and $K$ is the number of label categories.
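Eq. (9) can be checked numerically on a two-class example. The sketch below is illustrative only; the function names and the value eps = 0.1 are assumptions, not settings reported in this work.

```python
import math

def smooth_labels(one_hot, eps):
    """q'(k) = (1 - eps) * q(k) + eps / K, with K the number of classes."""
    k = len(one_hot)
    return [(1.0 - eps) * q + eps / k for q in one_hot]

def label_smoothing_ce(probs, one_hot, eps=0.1):
    """L_LSCE = -sum_k q'(k) * log p(k)."""
    q_prime = smooth_labels(one_hot, eps)
    return -sum(q * math.log(p) for q, p in zip(q_prime, probs))

target = [0.0, 1.0]                    # one-hot label: class 1 (for example, apnea)
probs = [0.2, 0.8]                     # softmax output of a hypothetical model
soft_target = smooth_labels(target, 0.1)
loss = label_smoothing_ce(probs, target, eps=0.1)
```

With eps = 0 the loss reduces to the standard cross entropy; with eps > 0 part of the target mass is moved to the other class, which penalizes overconfident predictions.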
(3) Focal loss [33] dynamically adjusts the cross-entropy loss according to the confidence level. On top of the balanced cross-entropy loss, it adds a modulating factor that reduces the weight of easy-to-classify samples and focuses training on difficult samples. It is given by:
\[ L_{Focal}(p_t) = -\alpha_t\,(1 - p_t)^{\gamma} \log(p_t) \tag{10} \]
where $p_t$ denotes the predicted probability of the target label, $\gamma$ is the exponent of the modulating factor, and $\alpha_t$ is a weight that handles the category imbalance.
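The effect of the modulating factor in Eq. (10) is easiest to see by comparing an easy and a hard sample. In the sketch below, the defaults alpha_t = 0.25 and gamma = 2 are the values suggested in the original focal loss paper, not settings reported in this work, and the function name is hypothetical.

```python
import math

def focal_loss(p_t, alpha_t=0.25, gamma=2.0):
    """Eq. (10): -alpha_t * (1 - p_t)**gamma * log(p_t)."""
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

def cross_entropy(p_t):
    return -math.log(p_t)

# Ratio of focal loss to cross entropy (with alpha_t = 1) for two samples.
easy_ratio = focal_loss(0.9, alpha_t=1.0) / cross_entropy(0.9)   # well-classified
hard_ratio = focal_loss(0.3, alpha_t=1.0) / cross_entropy(0.3)   # poorly classified
```

The ratio equals (1 - p_t)^gamma, so a well-classified sample with p_t = 0.9 contributes about 1% of its cross-entropy loss, while a hard sample with p_t = 0.3 keeps about 49%.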
(4) Dice loss [34] can also be utilised when the classes are unbalanced. It measures the dissimilarity between the predicted and ground truth labels using the Dice similarity coefficient, and is defined as follows:
\[ L_{Dice}(y_i, \hat{y}_i) = 1 - \frac{2\sum_{i=1}^{N} y_i \hat{y}_i}{\sum_{i=1}^{N} y_i + \sum_{i=1}^{N} \hat{y}_i} \tag{11} \]
where $y_i$ and $\hat{y}_i$ denote the labeled and predicted values of pixels, respectively, and $N$ is the total number of pixel points, which is equal to the number of pixels in a single image multiplied by the batch size.
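A toy batch shows why Eq. (11) is sensitive to the minority class. The sketch below treats each sample's label and predicted probability as one term of the sums; returning 0 when the denominator is zero is an assumption made here to avoid division by zero, and the function name is hypothetical.

```python
def dice_loss(y_true, y_pred):
    """Eq. (11): 1 - 2 * sum(y * y_hat) / (sum(y) + sum(y_hat))."""
    intersection = sum(y * p for y, p in zip(y_true, y_pred))
    denominator = sum(y_true) + sum(y_pred)
    if denominator == 0:
        return 0.0  # assumption: no positives in labels or predictions counts as a match
    return 1.0 - 2.0 * intersection / denominator

# One positive (apnea) sample among ten; the model predicts "normal" for everything.
labels = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
all_negative = [0.0] * 10
accuracy = sum(1 for y, p in zip(labels, all_negative) if y == round(p)) / len(labels)
loss = dice_loss(labels, all_negative)
```

Predicting the majority class everywhere reaches 90% accuracy on this batch, yet the Dice loss takes its maximum value of 1 because no positive sample is recovered.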
(5) Poly loss [35] allows the importance of different polynomial bases to be easily adjusted depending on the target tasks and datasets. The key to Poly loss is the decomposition of the loss function by a Taylor expansion into a series of weighted polynomial bases, which is given by:
\[ L_{Poly} = \sum_{j=1}^{\infty} \alpha_j (1 - p_t)^{j}, \tag{12} \]
where $\alpha_j$ denotes the weight of the $j$th polynomial term, $p_t$ denotes the predicted probability of the target label, and $\gamma$ is the modulation factor. Poly loss provides simple and effective Poly-1 loss functions to improve the commonly used cross-entropy loss and focal loss, respectively, which are given by:
\[ L_{Poly-1}^{CE} = L_{CE} + \epsilon_1 (1 - p_t), \tag{13} \]
\[ L_{Poly-1}^{Focal} = L_{Focal} + \epsilon_1 (1 - p_t)^{\gamma + 1}, \tag{14} \]
where $\epsilon_1$ is the perturbation coefficient of the first polynomial term.
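The relation between Eq. (12) and the Poly-1 variants in Eqs. (13) and (14) can be verified numerically. The sketch below relies on the Taylor expansion of -log(p_t) around p_t = 1, which corresponds to weights alpha_j = 1/j in Eq. (12). The value eps1 = 1.0, the focal defaults, and all function names are illustrative assumptions.

```python
import math

def poly_series(p_t, weights):
    """Truncated Eq. (12): sum_j alpha_j * (1 - p_t)**j for j = 1..len(weights)."""
    return sum(a * (1.0 - p_t) ** j for j, a in enumerate(weights, start=1))

def poly1_ce(p_t, eps1=1.0):
    """Eq. (13): cross entropy plus a perturbation of the first polynomial term."""
    return -math.log(p_t) + eps1 * (1.0 - p_t)

def poly1_focal(p_t, eps1=1.0, alpha_t=0.25, gamma=2.0):
    """Eq. (14): focal loss plus eps1 * (1 - p_t)**(gamma + 1)."""
    focal = -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return focal + eps1 * (1.0 - p_t) ** (gamma + 1.0)

# Cross entropy is recovered from Eq. (12) with alpha_j = 1/j.
ce_weights = [1.0 / j for j in range(1, 201)]
ce_from_series = poly_series(0.7, ce_weights)
```

Setting eps1 = 0 returns the unmodified cross-entropy or focal loss, so eps1 is the only additional hyperparameter that Poly-1 introduces.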
3 Results
In this section, we report the experiments and results used to evaluate the proposed approach on the Apnea-ECG dataset. The experimental dataset, training setup, and model evaluation metrics are first described, followed by the experimental results.
3.1 Apnea-ECG data set and pre-processing
The PhysioNet Apnea-ECG database [36] served as the source of the dataset used in this study. There are 70 records in the database, comprising a learning set of 35 records (a01 to a20, b01 to b05, and c01 to c10) and a test set of 35 records (x01 to x35). The records range in duration from less than 7 hours to nearly 10 hours. Each record contains a continuous single-lead ECG signal with a 100 Hz sampling rate. ECG signals often contain various types of noise that can affect the accuracy of signal analysis. Therefore, we first split the original continuous ECG signals into 1-minute segments and then filtered the noise using first-order Butterworth filtering and p-smooth wavelets. Finally, time-frequency images for 1-, 2-, and 3-minute windows were generated by wavelet transform. For the 2-minute ECG segments, the current 1-minute segment is combined with the following 1-minute segment to create a complete sample. For the 3-minute ECG segments, as in Liu et al. [22], the current segment is combined with the adjacent (previous and following) 1-minute segments to form a complete sample.
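The construction of the 1-, 2-, and 3-minute samples described above can be sketched as follows. The text does not say how the first and last minutes of a record are handled, so this sketch simply skips windows that would run past either end; that choice, along with the function names, is an assumption made for illustration.

```python
SAMPLES_PER_MINUTE = 60 * 100  # one minute of ECG at the 100 Hz sampling rate

def split_minutes(ecg):
    """Cut a continuous record into consecutive, complete 1-minute segments."""
    last_start = len(ecg) - SAMPLES_PER_MINUTE
    return [ecg[i:i + SAMPLES_PER_MINUTE]
            for i in range(0, last_start + 1, SAMPLES_PER_MINUTE)]

def context_windows(segments, minutes):
    """Return (minute_index, window) pairs for 1-, 2-, or 3-minute inputs."""
    offsets = {1: (0,), 2: (0, 1), 3: (-1, 0, 1)}[minutes]
    windows = []
    for i in range(len(segments)):
        idxs = [i + o for o in offsets]
        if idxs[0] < 0 or idxs[-1] >= len(segments):
            continue  # assumption: drop minutes without the required neighbours
        window = []
        for j in idxs:
            window.extend(segments[j])
        windows.append((i, window))
    return windows

record = list(range(5 * SAMPLES_PER_MINUTE))   # stand-in for a 5-minute ECG record
segments = split_minutes(record)
three_min = context_windows(segments, 3)       # labelled minute sits in the middle
two_min = context_windows(segments, 2)         # labelled minute comes first
```

Each window keeps the index of the minute whose apnea label it inherits, so the per-minute annotations of the database remain usable for all three window lengths.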
\[ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{15} \]
\[ \text{Specificity} = \frac{TN}{TN + FP} \tag{16} \]
\[ \text{Precision} = \frac{TP}{TP + FP} \tag{17} \]
\[ \text{Recall/Sensitivity} = \frac{TP}{TP + FN} \tag{18} \]
\[ F1 = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{19} \]
where $TP$, $TN$, $FP$, and $FN$ denote the numbers of true positives, true negatives, false positives, and false negatives, respectively.
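Eqs. (15) to (19) translate directly into code. The sketch below uses a made-up confusion matrix, not results from this work, and assumes that every denominator is nonzero.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute Eqs. (15)-(19) from the four confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "specificity": tn / (tn + fp),
        "precision": precision,
        "sensitivity": recall,
        "f1": 2.0 * precision * recall / (precision + recall),
    }

# Hypothetical counts: 100 apnea minutes and 100 normal minutes.
metrics = classification_metrics(tp=80, tn=90, fp=10, fn=20)
```

For these counts the accuracy is 0.85, the specificity 0.90, the sensitivity 0.80, and the F1 score 160/190, which is about 0.842.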
We demonstrate the apnea detection performance under varying classification thresholds using the Receiver Operating Characteristic (ROC) curve, which is a plot of the true positive rate (TPR) versus the false positive rate (FPR) at different classification thresholds. In addition, the area under the ROC curve (AUC) is calculated to evaluate the overall performance of the classifiers.
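The ROC curve and its AUC can be computed by sweeping the decision threshold over the predicted scores. The sketch below is a minimal pure-Python illustration on four made-up samples; the function names are hypothetical, and the trapezoidal rule is one common way to integrate the curve.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) pairs obtained by lowering the threshold step by step."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for l, s in zip(labels, scores) if s >= thr and l == 1)
        fp = sum(1 for l, s in zip(labels, scores) if s >= thr and l == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points

def auc_trapezoid(points):
    """Area under the piecewise-linear ROC curve."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

labels = [0, 0, 1, 1]               # 1 = apnea, 0 = normal
scores = [0.1, 0.4, 0.35, 0.8]      # predicted apnea probabilities
curve = roc_points(labels, scores)
auc = auc_trapezoid(curve)
```

Here three of the four positive-negative pairs are ranked correctly, so the AUC is 0.75.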
In this experiment, the same Apnea-ECG dataset and model were used. Figure 4 shows the classification results and ROC curves of the different mother wavelets at 1-minute, 2-minute, and 3-minute input time windows, while Table 1 compares the model performance when different wavelet basis functions are used to generate time-frequency maps of different durations.
Panels (a)-(f) of Figure 4 show that the model trained on the 3-minute segment data performs best overall, regardless of the wavelet basis function used.
Fig. 4: Classification metrics (panels include Accuracy, Precision, and Sensitivity) and ROC curves (axis: True Positive Rate) for the different mother wavelets.
From the ROC curves shown in (g)-(h) of Figure 4 and from Table 1, it can be seen that the Shan and Cmor wavelet bases perform relatively poorly, while the Fbsp wavelet base performs best on the 1-minute, 2-minute, and 3-minute test sets. Specifically, under the 1-minute and 2-minute input time windows, the Fbsp mother wavelet outperforms the other mother wavelets on four evaluation indexes: accuracy, sensitivity, F1 score, and AUC. When the model is trained on the 3-minute segmentation data with the Fbsp mother wavelet, the accuracy is 92.51%, the precision is 94.8%, the sensitivity is 84.5%, the specificity is 97.3%, the F1 score is 89.4%, and the AUC is 90.9%, which is significantly higher than with the 1-minute and 2-minute time windows.
These results suggest that, given the poor anti-interference capability and noise susceptibility of the ECG signal, the Fbsp mother wavelet, with its good smoothness and phase properties, adapts better to noise and generalizes better, which makes it more suitable for the sleep apnea classification task. In addition, the length of the input time window has a significant impact on classification performance: longer input windows allow the model to capture more classification features during training.
Focal loss, Poly-1 focal loss, and Dice loss are loss functions designed to reduce the model’s focus on easy samples. As shown in Table 3, Poly-1 focal loss achieved an accuracy of 93%, a precision of 94.7%, a specificity of 97.1%, an F1 score of 90.1%, and an AUC of 91.6%. These are 0.2%, 0.4%, 0.5%, 0.2%, and 0.1% higher than the original model, respectively, which slightly mitigates the effect of data imbalance. With Dice loss, the accuracy was 93.44%, the sensitivity 88.9%, the F1 score 91.0%, and the AUC 92.6%, improvements of 0.64%, 2.6%, 1.1%, and 1.1% over the original model, respectively. Compared with Poly-1 focal loss, Dice loss better mitigates the effect of class imbalance and yields a more robust model.
Fig. 5: Comparison of the loss functions on SEN, PRE, SPE, ACC, F1, and AUC (axis range 0.825 to 1.000).
Focal loss is designed for imbalanced datasets, but Figure 5 shows that it is the least effective of the loss functions at detecting sleep apnea, because its parameters are difficult to tune in practice. Label smoothing loss and Poly-1 cross-entropy loss are refinements of the cross-entropy loss, but their metrics are mostly lower than those of the original cross-entropy, so they are not well suited to the classification of sleep breathing disorders. The experimental results show that Dice loss is the most suitable for the apnea classification task and can greatly mitigate the effect of data imbalance.
To further verify that Dice loss improves the classification performance of CA-EfficientNet, 5-fold cross-validation was used. The box plots of each evaluation index are shown in Figure 8. The average accuracy was 92.87%, the average precision 93.18%, the average sensitivity 87.64%, the average specificity 96.18%, the average F1 score 90.3%, and the average AUC 91.94%. Across the five folds, the accuracy, F1 score, and AUC distributions are relatively concentrated, and the precision, sensitivity, and specificity remain stable within a reasonable range. These results indicate that Dice loss is effective and robust.
[Figure: box plots of the evaluation index under 5-fold cross-validation, with panels for Accuracy, Precision, Sensitivity, Specificity, F1 Score, and AUC.]
4 Discussion
Previous research has demonstrated that classifying sleep apnea is facilitated by converting one-dimensional signals to two-dimensional images using CWT. However, the optimal wavelet basis function for this task has not yet been determined. In addition, the relationship between data length and classification accuracy has not been thoroughly investigated. Moreover, the majority of studies on sleep apnea have neglected to account for data imbalance.
In this study, the influence of different wavelet basis functions, input time windows, and data imbalance loss functions was investigated. We propose an automatic classification model for sleep apnea based on EfficientNet and the CA attention mechanism, which is trained using Dice loss and 3-minute time-frequency maps of ECG signals as inputs.

Table 4: Comparison of performance between the proposed method and other methods.
Study                 Input type               Classifier        ACC(%)  SEN(%)  SPE(%)
Varon et al. [37]     QRS and HRV              LS-SVM            83.8    79.5    88.4
Li et al. [38]        RR interval              DNN+HMM           84.7    88.9    82.1
Singh et al. [18]     Time-frequency diagram   CNN               86.22   90      83.8
Pombo et al. [39]     HRV and EDR              ANN               82.1    88.4    72.3
Shen et al. [23]      RR interval              CNN               89.4    89.8    89.1
Feng et al. [40]      RR interval              FSSAE+HMM         85.1    86.2    84.4
Almutairi et al. [16] ECG                      CNN+LSTM          90.92   91.24   90.36
Bahrami et al. [41]   RR interval and R-peak   CNN+BiLSTM        88.1    81.5    92.3
Yeh et al. [42]       Raw ECG                  CNN               85.8    80.1    89.4
Liu et al. [22]       Raw ECG                  CNN+Transformer   88.2    78.5    94.1
Ours                  Time-frequency diagram   CNN               93.44   88.9    96.2
On the same dataset, we compare the performance of our model with that of other relevant works. As shown in Table 4, our proposed sleep apnea classifier outperforms the other related works on most metrics, with an accuracy of 93.44% and a specificity of 96.2%. Figure 7(a) shows the raw data, in which apnea samples are mixed with normal samples without clear distinction. Figure 7(b) shows the classification features extracted by the original EfficientNet model, in which a portion of the apnea samples are misclassified as normal samples. Figure 7(c) shows that Dice loss improves the classification performance of the CA-EfficientNet model on sleep apnea samples and enhances the robustness of the model.
The proposed method reduces complex feature processing and computation, takes full advantage of the automatic feature extraction and structural flexibility of neural networks, and effectively alleviates the problem of sleep data imbalance.
There is still room for improvement in this work. Only a single-lead ECG sleep apnea dataset was used, and the type and severity of sleep apnea were not classified in more detail. Future studies could therefore use multimodal physiological signals to assess apnea severity and combine them with sleep staging to further detect abnormal breathing events during sleep.
5 Conclusion
In this study, we propose an automatic sleep apnea classification model based on CA-EfficientNet, which is trained using Dice loss and uses a 3-minute time-frequency map of the ECG signal as input. Using a 3-minute input time window and the Fbsp wavelet basis function, the proposed method achieves an accuracy of 93.44%. This work takes into account the input time window, type of wavelet transform, and data balance. Experimental results indicate that the Fbsp wavelet basis function is the most appropriate for sleep apnea classification, that the best results are obtained for 3-minute ECG signals, and that the use of Dice loss can address the imbalance problem and significantly enhance the classification performance of the model.
References
[1] Khandoker, A.H., Palaniswami, M.: Modeling respiratory movement signals during central and obstructive sleep apnea events using electrocardiogram. Annals of Biomedical Engineering 39(2), 801–811 (2011) https://doi.org/10.1007/s10439-010-0189-x
[3] Moridani, M.K., Heydar, M., Jabbari Behnam, S.S.: A reliable algorithm based on combination of emg, ecg and eeg signals for sleep apnea detection: (a reliable algorithm for sleep apnea detection). In: 2019 5th Conference on Knowledge Based Engineering and Innovation (KBEI), pp. 256–262 (2019). https://doi.org/10.1109/KBEI.2019.8734992
[4] Guilleminault, C., Winkle, R., Connolly, S., Melvin, K., Tilkian, A.: Cyclical variation of the heart rate in sleep apnoea syndrome: Mechanisms, and usefulness of 24 h electrocardiography as a screening technique. The Lancet 323(8369), 126–131 (1984) https://doi.org/10.1016/S0140-6736(84)90062-X . Originally published as Volume 1, Issue 8369
[5] Bsoul, M., Minn, H., Tamil, L.: Apnea medassist: Real-time sleep apnea monitor using single-lead ecg. IEEE Transactions on Information Technology in Biomedicine 15(3), 416–427 (2011) https://doi.org/10.1109/TITB.2010.2087386
[6] Hassan, A.R., Haque, M.A.: Computer-aided obstructive sleep apnea screening from single-lead electrocardiogram using statistical and spectral features and bootstrap aggregating. Biocybernetics and Biomedical Engineering 36(1), 256–266 (2016) https://doi.org/10.1016/j.bbe.2015.11.003
[7] Song, C., Liu, K., Zhang, X., Chen, L., Xian, X.: An obstructive sleep apnea detection approach using a discriminative hidden markov model from ecg signals. IEEE Transactions on Biomedical Engineering 63(7), 1532–1542 (2016) https://doi.org/10.1109/TBME.2015.2498199
[8] Babaeizadeh, S., White, D.P., Pittman, S.D., Zhou, S.H.: Automatic detection and quantification of sleep apnea using heart rate variability. Journal of Electrocardiology 43(6), 535–541 (2010) https://doi.org/10.1016/j.jelectrocard.2010.07.003
19 [9] Rizal, A., Siregar, F.D.A.A., Fauzi, H.T.: Obstructive sleep apnea (osa) classifica-
20 tion based on heart rate variability (hrv) on electrocardiogram (ecg) signal using
21 support vector machine (svm). TRAITEMENT DU SIGNAL 39(2), 469–474
22 (2022) https://doi.org/10.18280/ts.390208
[10] Zarei, A., Asl, B.M.: Automatic classification of apnea and normal subjects using
new features extracted from HRV and ECG-derived respiration signals. Biomedical
Signal Processing and Control 59 (2020)
https://doi.org/10.1016/j.bspc.2020.101927
[11] Tripathy, R.K.: Application of intrinsic band function technique for automated
detection of sleep apnea using HRV and EDR signals. Biocybernetics and
Biomedical Engineering 38(1), 136–144 (2018)
https://doi.org/10.1016/j.bbe.2017.11.003
[12] Fatimah, B., Singh, P., Singhal, A., Pachori, R.B.: Detection of apnea events
from ECG segments using Fourier decomposition method. Biomedical Signal
Processing and Control 61 (2020) https://doi.org/10.1016/j.bspc.2020.102005
[13] Faal, M., Almasganj, F.: Obstructive sleep apnea screening from unprocessed ECG
signals using statistical modelling. Biomedical Signal Processing and
Control 68 (2021) https://doi.org/10.1016/j.bspc.2021.102685
[14] Tyagi, P.K., Agrawal, D.: Automatic detection of sleep apnea from single-lead
ECG signal using enhanced-deep belief network model. Biomedical Signal
Processing and Control 80(2) (2023) https://doi.org/10.1016/j.bspc.2022.104401
[15] Zarei, A., Beheshti, H., Asl, B.M.: Detection of sleep apnea using deep neural
networks and single-lead ECG signals. Biomedical Signal Processing
and Control 71(A) (2022) https://doi.org/10.1016/j.bspc.2021.103125
[16] Almutairi, H., Hassan, G.M., Datta, A.: Classification of obstructive sleep apnoea
from single-lead ECG signals using convolutional neural and long short term
memory networks. Biomedical Signal Processing and Control 69
(2021) https://doi.org/10.1016/j.bspc.2021.102906
[17] Jiang, Y., Peng, J., Zhang, X.: Automatic snoring sounds detection from
sleep sounds based on deep learning. Physical and Engineering
Sciences in Medicine 43(2), 679–689 (2020)
https://doi.org/10.1007/s13246-020-00876-1
[18] Singh, S.A., Majumder, S.: A novel approach OSA detection using single-lead
ECG scalogram based on deep neural network. Journal of Mechanics
in Medicine and Biology 19(4) (2019)
https://doi.org/10.1142/S021951941950026X
[19] Mashrur, F.R., Islam, M.S., Saha, D.K., Islam, S.M.R., Moni, M.A.: SCNN:
Scalogram-based convolutional neural network to detect obstructive sleep apnea
using single-lead electrocardiogram signals. Computers in Biology and
Medicine 134 (2021) https://doi.org/10.1016/j.compbiomed.2021.104532
[20] Niroshana, S.M.I., Zhu, X., Nakamura, K., Chen, W.: A fused-image-based
approach to detect obstructive sleep apnea using a single-lead ECG and a 2D
convolutional neural network. PLOS ONE 16(4) (2021)
https://doi.org/10.1371/journal.pone.0250618
[21] Ayatollahi, A., Afrakhteh, S., Soltani, F., Saleh, E.: Sleep apnea detection from
ECG signal using deep CNN-based structures. Evolving Systems 14(2), 191–206
(2023) https://doi.org/10.1007/s12530-022-09445-1
[22] Liu, H., Cui, S., Zhao, X., Cong, F.: Detection of obstructive sleep apnea from
single-channel ECG signals using a CNN-transformer architecture. Biomedical
Signal Processing and Control 82 (2023)
https://doi.org/10.1016/j.bspc.2023.104581
[23] Shen, Q., Qin, H., Wei, K., Liu, G.: Multiscale deep neural network for obstructive
sleep apnea detection using RR interval from single-lead ECG signal. IEEE
Transactions on Instrumentation and Measurement 70, 1–13 (2021)
https://doi.org/10.1109/TIM.2021.3062414
[24] Sharan, R.V., Berkovsky, S., Xiong, H., Coiera, E.: End-to-end sleep apnea
detection using single-lead ECG signal and 1-D residual neural networks. Journal
of Medical and Biological Engineering 41(5, SI), 758–766 (2021)
https://doi.org/10.1007/s40846-021-00646-8
[25] Morlet, J.: Wave propagation and sampling theory. Geophysics 47, 203–236
(1982)
[26] Flandrin, P., Rilling, G., Goncalves, P.: Empirical mode decomposition as a filter
bank. IEEE Signal Processing Letters 11(2), 112–114 (2004)
https://doi.org/10.1109/LSP.2003.821662
[27] Marr, D., Hildreth, E.: Theory of edge detection. Proceedings of the Royal Society
of London 207(1167), 187–217 (1980)
[28] Shannon, C.E.: Communication in the presence of noise. Proceedings of the IRE
37(1), 10–21 (1949)
[29] Unser, M., Aldroubi, A., Eden, M.: B-spline signal processing. Part I. Theory.
IEEE Transactions on Signal Processing 41(2), 821–833 (1993)
[30] Tan, M., Le, Q.V.: EfficientNet: Rethinking model scaling for convolutional neural
networks. CoRR abs/1905.11946 (2019)
[31] Hou, Q., Zhou, D., Feng, J.: Coordinate attention for efficient mobile network
design. CoRR abs/2103.02907 (2021)
[32] Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z.: Rethinking the
inception architecture for computer vision. CoRR abs/1512.00567 (2015)
[33] Lin, T., Goyal, P., Girshick, R.B., He, K., Dollár, P.: Focal loss for dense object
detection. CoRR abs/1708.02002 (2017)
[34] Li, X., Sun, X., Meng, Y., Liang, J., Wu, F., Li, J.: Dice loss for data-imbalanced
NLP tasks. CoRR abs/1911.02855 (2019)
[35] Leng, Z., Tan, M., Liu, C., Cubuk, E.D., Shi, X., Cheng, S., Anguelov, D.:
PolyLoss: A Polynomial Expansion Perspective of Classification Loss Functions
(2022)
[36] Penzel, T., Moody, G.B., Mark, R.G., Goldberger, A.L., Peter, J.H.: The
apnea-ECG database. In: Computers in Cardiology 2000. Vol. 27 (Cat. 00CH37163),
pp. 255–258 (2000). https://doi.org/10.1109/CIC.2000.898505
[37] Varon, C., Caicedo, A., Testelmans, D., Buyse, B., Van Huffel, S.: A novel
algorithm for the automatic detection of sleep apnea from single-lead ECG. IEEE
Transactions on Biomedical Engineering 62(9), 2269–2278 (2015)
https://doi.org/10.1109/TBME.2015.2422378
[38] Li, K., Pan, W., Li, Y., Jiang, Q., Liu, G.: A method to detect sleep apnea based
on deep neural network and hidden Markov model using single-lead ECG signal.
Neurocomputing 294, 94–101 (2018)
https://doi.org/10.1016/j.neucom.2018.03.011
[39] Pombo, N., Silva, B.M.C., Pinho, A.M., Garcia, N.: Classifier precision analysis
for sleep apnea detection using ECG signals. IEEE Access 8, 200477–200485 (2020)
https://doi.org/10.1109/ACCESS.2020.3036024
[40] Feng, K., Qin, H., Wu, S., Pan, W., Liu, G.: A sleep apnea detection method
based on unsupervised feature learning and single-lead electrocardiogram. IEEE
Transactions on Instrumentation and Measurement 70, 1–12 (2021)
https://doi.org/10.1109/TIM.2020.3017246
[41] Bahrami, M., Forouzanfar, M.: Sleep apnea detection from single-lead ECG: A
comprehensive analysis of machine learning and deep learning algorithms. IEEE
Transactions on Instrumentation and Measurement 71, 1–11 (2022)
https://doi.org/10.1109/TIM.2022.3151947
[42] Yeh, C.-Y., Chang, H.-Y., Hu, J.-Y., Lin, C.-C.: Contribution of different
subbands of ECG in sleep apnea detection evaluated using filter bank decomposition
and a convolutional neural network. Sensors 22(2) (2022)
https://doi.org/10.3390/s22020510