1.1 VMD algorithm principle
VMD is a non-recursive variational mode decomposition method proposed by Dragomiretskiy and Zosso in 2014. Its core idea is to decompose a signal into a discrete number of band-limited intrinsic mode functions (BLIMFs) with specific sparsity properties [18]. By constructing a variational problem and solving it iteratively, VMD decomposes complex vibration signals into intrinsic mode functions with physical meaning [19]. The specific steps are as follows:
(1) Construct the variational problem

The constrained variational problem decomposes the original input signal \(x\left(t\right)\) into K intrinsic mode function (IMF) components \({u}_{k}\left(t\right)\). The Hilbert transform is applied to each component \({u}_{k}\left(t\right)\) to obtain its analytic signal, which is then mixed with an exponential term at the estimated center frequency \({\omega }_{k}\) to shift its spectrum to baseband. The problem is subject to the constraint that the sum of the components \({u}_{k}\left(t\right)\) equals the original signal \(x\left(t\right)\), and is formulated as follows:
$$\left\{\begin{array}{c}\underset{\left\{{u}_{k}\right\},\left\{{\omega }_{k}\right\}}{min}\left\{\sum _{k=1}^{K}{‖{\partial }_{t}\left[\left(\delta \left(t\right)+\frac{j}{\pi t}\right)*{u}_{k}\left(t\right)\right]{e}^{-j{\omega }_{k}t}‖}_{2}^{2}\right\}\\ s.t. \sum _{k=1}^{K}{u}_{k}\left(t\right)=x\left(t\right)\end{array}\right. \left(1\right)$$
Where \({u}_{k}\) is the k-th mode, \({\omega }_{k}\) is the center frequency of each mode, K is the number of decomposed modes, \({\partial }_{t}\) denotes the partial derivative with respect to t, \(\delta \left(t\right)\) is the Dirac impulse function, and * denotes convolution.
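The bracketed term in Eq. (1) is simply the analytic signal of \({u}_{k}\left(t\right)\), so the Hilbert demodulation step can be sketched in a few lines. This is an illustrative sketch, not the paper's code; the 50 Hz test tone and the sampling rate are assumed.

```python
import numpy as np
from scipy.signal import hilbert

fs = 1000                                  # assumed sampling rate, Hz
t = np.arange(0, 1, 1 / fs)
f0 = 50.0                                  # assumed mode frequency, Hz
u_k = np.cos(2 * np.pi * f0 * t)           # a single band-limited mode

# (delta(t) + j/(pi*t)) * u_k(t): the analytic signal of u_k
analytic = hilbert(u_k)

# multiplying by e^{-j*omega_k*t} shifts the mode's spectrum to baseband
baseband = analytic * np.exp(-2j * np.pi * f0 * t)
```

For a pure on-bin cosine the analytic signal has unit envelope, and the mixed signal is constant at baseband, which is exactly the behavior the variational objective in Eq. (1) penalizes deviations from.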
(2) Solve the variational problem
A Lagrangian multiplier \(\lambda \left(t\right)\) and a quadratic penalty factor α are introduced into the constrained variational problem to transform it into an unconstrained one, whose expression is:
$$L\left(\left\{{u}_{k}\right\},\left\{{\omega }_{k}\right\},\lambda \right)=\alpha \sum _{k}{‖{\partial }_{t}\left[\left(\delta \left(t\right)+\frac{j}{\pi t}\right)*{u}_{k}\left(t\right)\right]{e}^{-j{\omega }_{k}t}‖}_{2}^{2}+{‖x\left(t\right)-\sum _{k=1}^{K}{u}_{k}\left(t\right)‖}_{2}^{2}+⟨\lambda \left(t\right),x\left(t\right)-\sum _{k=1}^{K}{u}_{k}\left(t\right)⟩ \left(2\right)$$
The saddle point of Eq. (2), which is the optimal solution of Eq. (1), is obtained with the Alternating Direction Method of Multipliers (ADMM). The solution steps are:
Step 1: Initialize \(\left\{{\widehat{u}}_{k}^{1}\right\}\), \(\left\{{\widehat{\omega }}_{k}^{1}\right\}\), \({\widehat{\lambda }}^{1}\), and set n = 0;
Step 2: Execute the loop: n = n + 1;
Step 3: For all \(\omega >0\), update \({\widehat{u}}_{k}\), \(k\in \left\{\text{1,2},\cdots ,K\right\}\);
$$\begin{array}{c}{\widehat{u}}_{k}^{n+1}\left(\omega \right)\leftarrow \frac{\widehat{x}\left(\omega \right)-\sum _{i<k}{\widehat{u}}_{i}^{n+1}\left(\omega \right)-\sum _{i>k}{\widehat{u}}_{i}^{n}\left(\omega \right)+\frac{{\widehat{\lambda }}^{n}\left(\omega \right)}{2}}{1+2\alpha {\left(\omega -{\omega }_{k}^{n}\right)}^{2}} \left(3\right)\end{array}$$
Step 4: Update \({\omega }_{k}\)
$$\begin{array}{c}{\omega }_{k}^{n+1}\leftarrow \frac{{\int }_{0}^{\infty }\omega {\left|{\widehat{u}}_{k}^{n+1}\left(\omega \right)\right|}^{2}d\omega }{{\int }_{0}^{\infty }{\left|{\widehat{u}}_{k}^{n+1}\left(\omega \right)\right|}^{2}d\omega },k\in \left\{\text{1,2},\cdots ,K\right\} \left(4\right)\end{array}$$
Step 5: Update \(\lambda\)
$$\begin{array}{c}{\widehat{\lambda }}^{n+1}\left(\omega \right)\leftarrow {\widehat{\lambda }}^{n}\left(\omega \right)+\tau \left[\widehat{x}\left(\omega \right)-\sum _{k}{\widehat{u}}_{k}^{n+1}\left(\omega \right)\right] \left(5\right)\end{array}$$
Step 6: Repeat Steps 2 to 5 until the convergence criterion in Eq. (6) is satisfied, then stop the iteration; the K IMF components are obtained.
$$\begin{array}{c}\sum _{k}\frac{{‖{\widehat{u}}_{k}^{n+1}-{\widehat{u}}_{k}^{n}‖}_{2}^{2}}{{‖{\widehat{u}}_{k}^{n}‖}_{2}^{2}}<\epsilon \left(6\right)\end{array}$$
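Steps 1 to 6 can be condensed into a short frequency-domain loop. This is a minimal sketch, not the authors' implementation: for simplicity it omits the customary signal mirroring, initializes center frequencies at the K largest spectral peaks, and sets τ = 0 (disabling the multiplier update of Eq. (5)); all three choices are assumptions.

```python
import numpy as np

def vmd_sketch(x, K, alpha, tau=0.0, tol=1e-7, max_iter=500):
    """Minimal VMD loop following Eqs. (3)-(6); simplified, see lead-in."""
    N = len(x)
    freqs = np.fft.fftfreq(N)                   # normalised frequency axis
    x_hat = np.fft.fft(x)
    x_hat[freqs < 0] = 0                        # keep the positive half-spectrum
    # initialise centre frequencies at the K largest well-separated peaks
    omega = []
    for idx in np.argsort(np.abs(x_hat))[::-1]:
        f = freqs[idx]
        if f > 0 and all(abs(f - w) > 0.005 for w in omega):
            omega.append(f)
        if len(omega) == K:
            break
    omega = np.array(omega)
    u_hat = np.zeros((K, N), dtype=complex)
    lam = np.zeros(N, dtype=complex)
    for _ in range(max_iter):
        u_prev = u_hat.copy()
        for k in range(K):
            others = u_hat.sum(axis=0) - u_hat[k]
            # Eq. (3): Wiener-filter update of mode k
            u_hat[k] = (x_hat - others + lam / 2) / (1 + 2 * alpha * (freqs - omega[k]) ** 2)
            # Eq. (4): centre of gravity of the mode's power spectrum
            power = np.abs(u_hat[k]) ** 2
            omega[k] = np.sum(freqs * power) / (np.sum(power) + 1e-300)
        lam = lam + tau * (x_hat - u_hat.sum(axis=0))         # Eq. (5)
        num = np.sum(np.abs(u_hat - u_prev) ** 2, axis=1)
        den = np.sum(np.abs(u_prev) ** 2, axis=1) + 1e-300
        if np.sum(num / den) < tol:                           # Eq. (6)
            break
    # factor 2 compensates for the discarded negative half-spectrum
    modes = 2 * np.real(np.fft.ifft(u_hat, axis=1))
    return modes, np.sort(omega)

# demo: two well-separated tones are recovered as two modes
n = np.arange(1000)
x = np.cos(2 * np.pi * 0.01 * n) + 0.5 * np.cos(2 * np.pi * 0.05 * n)
modes, omega = vmd_sketch(x, K=2, alpha=2000.0)
```

On this two-tone signal the recovered center frequencies settle near the true normalized frequencies 0.01 and 0.05, and the mode sum reconstructs the input.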
The primary merit of Variational Mode Decomposition (VMD) lies in its adaptive signal decomposition capability [20]. Given the number of modes K, the algorithm adaptively estimates the center frequency of each mode during the iteration. Remarkably robust against noise, VMD yields band-limited intrinsic mode functions (BLIMFs) with compact support in the spectral domain, effectively capturing the natural oscillation modes inherent in the signal. Unlike empirical mode decomposition (EMD), VMD obviates the need for complex envelope estimation and extremum tracking, rendering its calculation more straightforward [21].
In essence, the VMD algorithm solves the constrained variational problem through an iterative process based on the Alternating Direction Method of Multipliers (ADMM). The key operations within each iteration are updating each BLIMF via a Wiener filter centered at its current frequency, updating the estimated center frequencies, updating the Lagrange multiplier, and checking convergence. Iteration continues until the convergence criterion in Eq. (6) is satisfied, and the signal is ultimately decomposed into a series of BLIMFs and a residual.
1.2 Squirrel search algorithm model (SSA model)
In the Variational Mode Decomposition (VMD) computation, the decomposition number (K) and the penalty factor (α) are pivotal parameters influencing the performance and effectiveness of the decomposition [22]. The value of K directly dictates the quantity of decomposed modal components, whereas the value of α affects their bandwidth, significantly impacting decomposition performance. Currently, there is a lack of objective criteria for setting these parameters. To address this, the present study introduces the Squirrel Search Algorithm (SSA) for the adaptive optimization of VMD parameters [23].
The Squirrel Search Algorithm (SSA) exhibits notable search capability and accuracy when addressing complex problems within a search space. It is inspired by the gliding behavior of flying squirrels, which lets them move efficiently between trees and evade predators even though they cannot truly fly. The algorithm's search process mimics the squirrels' foraging behavior, in which they locate food by moving between different trees: the varying positions of the squirrels within the search space correspond to exploring different areas of a forest [24].
Assuming the number of squirrels is n, each squirrel's position is represented by a vector and is randomly initialized within the boundary range.
$$\begin{array}{c}FS=\left[\begin{array}{cccc}F{S}_{\text{1,1}}& F{S}_{\text{1,2}}& \cdots & F{S}_{1,d}\\ F{S}_{\text{2,1}}& F{S}_{\text{2,2}}& \cdots & F{S}_{2,d}\\ ⋮& ⋮& ⋮& ⋮\\ F{S}_{n,1}& F{S}_{n,2}& \cdots & F{S}_{n,d}\end{array}\right] \left(7\right)\end{array}$$
\(F{S}_{n,d}\) is the value of the n-th squirrel in the d-th dimension. The initial position of each squirrel in the forest is:
$$\begin{array}{c}F{S}_{i}=F{S}_{L}+U\left(\text{0,1}\right)\times \left(F{S}_{U}-F{S}_{L}\right) \left(8\right)\end{array}$$
\(F{S}_{U}\) and \(F{S}_{L}\) are the upper and lower bounds of squirrel movement, and \(U\left(\text{0,1}\right)\) is a random number uniformly distributed in [0,1].
The quality of a food source is represented by the fitness of each squirrel's location. The fitness values are calculated and sorted in ascending order. The position with the smallest fitness is the best food source, ① the hickory tree; the next three positions are normal food sources, ② oak trees; and the remaining positions have no food source, ③ common trees [25].
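The initialization of Eq. (8) and the food-source ranking can be sketched as follows. The bounds, population size, and the placeholder fitness function are illustrative assumptions; the paper's actual fitness for VMD parameter selection is not specified in this section.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 2                                   # population size and dimension (assumed)
FS_L = np.array([3.0, 100.0])                  # assumed lower bounds, e.g. (K, alpha)
FS_U = np.array([10.0, 3000.0])                # assumed upper bounds

# Eq. (8): random initial positions inside the bounds
FS = FS_L + rng.random((n, d)) * (FS_U - FS_L)

def fitness(pos):
    # placeholder objective for illustration only; in the paper the
    # fitness would evaluate a VMD decomposition at (K, alpha)
    return float(np.sum(((pos - FS_L) / (FS_U - FS_L)) ** 2))

order = np.argsort([fitness(p) for p in FS])   # ascending: smallest fitness first
hickory = FS[order[0]]                         # 1. best food source (hickory tree)
oaks = FS[order[1:4]]                          # 2. next three (oak trees)
common = FS[order[4:]]                         # 3. the rest (common trees, no food)
```

Sorting ascending means the hickory-tree squirrel always carries the smallest fitness value in the population.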
Squirrels update their positions according to the probability \({P}_{dp}\) that a natural enemy (predator) is present.
Glide Path 1: ②→①
$$\begin{array}{c}F{S}_{at}^{t+1}=\left\{\begin{array}{c}F{S}_{at}^{t}+{d}_{g}\times {G}_{c}\times \left(F{S}_{ht}^{t}-F{S}_{at}^{t}\right),{R}_{1}\ge {P}_{dp}\\ \text{Random location}, {R}_{1}<{P}_{dp}\end{array}\right. \left(9\right)\end{array}$$
Glide Path 2:③→②
$$\begin{array}{c}F{S}_{nt}^{t+1}=\left\{\begin{array}{c}F{S}_{nt}^{t}+{d}_{g}\times {G}_{c}\times \left(F{S}_{at}^{t}-F{S}_{nt}^{t}\right),{R}_{2}\ge {P}_{dp}\\ \text{Random location}, {R}_{2}<{P}_{dp}\end{array}\right. \left(10\right)\end{array}$$
Glide Path 3:③→①
$$\begin{array}{c}F{S}_{nt}^{t+1}=\left\{\begin{array}{c}F{S}_{nt}^{t}+{d}_{g}\times {G}_{c}\times \left(F{S}_{ht}^{t}-F{S}_{nt}^{t}\right),{R}_{3}\ge {P}_{dp}\\ \text{Random location}, {R}_{3}<{P}_{dp}\end{array}\right. \left(11\right)\end{array}$$
\({d}_{g}\) is the random gliding distance; \({R}_{1}\), \({R}_{2}\) and \({R}_{3}\) are random numbers in the range [0,1]; \(F{S}_{at}^{t}\) is a squirrel's position on an oak tree, \(F{S}_{ht}^{t}\) is its position on the hickory tree, and \(F{S}_{nt}^{t}\) is its position on a common tree; \({G}_{c}\) is the gliding coefficient.
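The three glide paths share one update rule, which can be sketched as below. The values of \({P}_{dp}\) and \({G}_{c}\) are typical choices assumed for illustration, and the random-location branch simply reuses the initialization of Eq. (8).

```python
import numpy as np

rng = np.random.default_rng(1)
P_dp = 0.1                         # predator presence probability (assumed value)
G_c = 1.9                          # gliding coefficient G_c (assumed value)
FS_L, FS_U = np.zeros(2), np.ones(2)

def glide(pos, target, d_g, R):
    """Eqs. (9)-(11): glide toward a better food source, or jump to a
    random location when a predator appears (R < P_dp)."""
    if R >= P_dp:
        return pos + d_g * G_c * (target - pos)
    return FS_L + rng.random(pos.shape) * (FS_U - FS_L)   # random location

hickory = np.array([0.8, 0.6])
oak = np.array([0.5, 0.3])
new_oak = glide(oak, hickory, d_g=0.1, R=0.9)   # path 1: oak -> hickory
```

With R ≥ P_dp the squirrel moves a fraction d_g·G_c of the way toward the target; otherwise it restarts uniformly inside the bounds.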
Seasonal changes affect squirrels' foraging activity, and a seasonal-change mechanism is used to prevent the algorithm from falling into local optima.
$$\begin{array}{c}{S}_{c}^{t}=\sqrt{{\sum }_{z=1}^{3}{\sum }_{k=1}^{d}{\left(F{S}_{at,k}^{t,z}-F{S}_{ht,k}^{t}\right)}^{2}} \left(12\right)\end{array}$$
$$\begin{array}{c}{S}_{min}=\frac{{10}^{-6}}{{365}^{2.5t/{t}_{m}}} \left(13\right)\end{array}$$
\(t\) and \({t}_{m}\) are the current iteration and the maximum number of iterations, respectively; \({S}_{min}\) is the minimum value of the seasonal constant and \({S}_{c}^{t}\) is the seasonal constant. The seasonal-change condition is \({S}_{c}^{t}<{S}_{min}\); if it is met, the positions of the squirrels on common trees change randomly.
$$\begin{array}{c}F{S}_{nt,i}^{t+1}=F{S}_{i,L}+Levy\left(F{S}_{i,U}-F{S}_{i,L}\right) \left(14\right)\end{array}$$
\(F{S}_{i,U}\) and \(F{S}_{i,L}\) are the upper and lower bounds of squirrel movement, and Levy denotes the Lévy distribution, whose long-tailed steps search globally for a new, potentially better location far from the current one.
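Eqs. (12)-(14) can be sketched as follows. The Lévy step uses Mantegna's algorithm with β = 1.5, a common choice that the paper does not specify; the shapes of the oak and hickory position arrays are illustrative.

```python
import numpy as np
from math import gamma, sin, pi

rng = np.random.default_rng(2)

def levy(size, beta=1.5):
    """Mantegna's algorithm for Levy-distributed steps (beta assumed 1.5)."""
    sigma = (gamma(1 + beta) * sin(pi * beta / 2)
             / (gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.normal(0.0, sigma, size)
    v = rng.normal(0.0, 1.0, size)
    return u / np.abs(v) ** (1 / beta)

def seasonal_update(FS_nt, FS_L, FS_U, FS_at, FS_ht, t, t_m):
    # Eq. (12): FS_at may hold all three oak positions; the sum runs over all
    S_c = np.sqrt(np.sum((FS_at - FS_ht) ** 2))
    S_min = 1e-6 / 365 ** (2.5 * t / t_m)            # Eq. (13)
    if S_c < S_min:                                  # season changed:
        return FS_L + levy(FS_nt.shape) * (FS_U - FS_L)   # Eq. (14)
    return FS_nt
```

While the oaks remain far from the hickory tree, \(S_c^t \ge S_{min}\) and the common-tree positions are left unchanged; only near convergence does the Lévy relocation fire.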
The process of optimizing the VMD parameters with the squirrel search algorithm is as follows [26].
Step 1: Set the SSA parameters, including the population size, number of iterations, and the upper and lower bounds of the optimization range, and initialize the population positions using Eq. (8).
Step 2: Perform VMD decomposition of the power sequence according to each squirrel's position (K, α), calculate the fitness of each individual, and sort the values in ascending order. Assign the squirrels to the hickory tree (optimal solution), oak trees (suboptimal solutions), and common trees (ordinary solutions) in order, so that the positions of the best individuals are preserved.
Step 3: Update individual squirrel locations.
Step 4: Update the seasonal constant; when \({S}_{c}^{t}<{S}_{min}\), relocate the individuals on common trees randomly.
Step 5: Repeat Step 2 for the newly generated location to update the optimal solution.
Step 6: Repeat steps 3 to 5 until the maximum number of iterations is reached to stop the optimization and output the optimal parameters and fitness values.
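Steps 1 to 6 can be condensed into a compact optimizer loop. This is a sketch under several assumptions: the objective `f` is a stand-in for the paper's VMD-based fitness at each candidate (K, α); the gliding-distance range, P_dp and G_c are assumed values; every non-hickory squirrel glides toward the hickory tree (the paper also routes some toward oaks, Eq. (10)); and the winter relocation uses a uniform draw in place of the Lévy flight of Eq. (14).

```python
import numpy as np

def ssa_minimize(f, lo, hi, n=30, t_m=100, P_dp=0.1, G_c=1.9, seed=0):
    """Compact SSA sketch following Steps 1-6 (simplified, see lead-in)."""
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(lo, float), np.asarray(hi, float)
    d = lo.size
    FS = lo + rng.random((n, d)) * (hi - lo)           # Step 1, Eq. (8)
    for t in range(1, t_m + 1):
        fit = np.array([f(p) for p in FS])
        FS = FS[np.argsort(fit)]                       # Step 2: rank ascending
        ht = FS[0].copy()                              # hickory tree (best, kept)
        for i in range(1, n):                          # Step 3: glide updates
            if rng.random() >= P_dp:
                d_g = 0.5 + rng.random()               # random glide distance (assumed)
                FS[i] = FS[i] + d_g * G_c * (ht - FS[i])
            else:                                      # predator: random restart
                FS[i] = lo + rng.random(d) * (hi - lo)
        # Step 4: seasonal check, Eqs. (12)-(13); uniform winter relocation
        S_c = np.sqrt(np.sum((FS[1:4] - ht) ** 2))
        if S_c < 1e-6 / 365 ** (2.5 * t / t_m):
            FS[4:] = lo + rng.random((n - 4, d)) * (hi - lo)
        FS = np.clip(FS, lo, hi)                       # Steps 5-6: iterate
    fit = np.array([f(p) for p in FS])
    best = FS[np.argmin(fit)]
    return best, float(f(best))

# demo on a toy objective that stands in for a VMD-based fitness
best, val = ssa_minimize(lambda p: float(np.sum(p ** 2)), [-5, -5], [5, 5])
```

Because the hickory-tree row is never moved within an iteration, the best solution found so far is always preserved, so the best fitness is non-increasing over iterations.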
1.3 CNN algorithm principle
CNN is a feedforward neural network with a deep structure, including input layer, convolutional layer, pooling layer, fully connected layer and output layer [27].
The convolution layer extracts local features from the fault data; the feature output of the convolution layer is obtained by applying an activation function after the convolution operation. The feature expression is:
$$\begin{array}{c}{x}_{j}^{l}=f\left(\sum _{i\in {M}_{j}}{x}_{i}^{l-1}*{k}_{ij}^{l}+{b}_{j}^{l}\right) \left(15\right)\end{array}$$
Where \({k}_{ij}^{l}\) is the convolution kernel weight matrix; \({b}_{j}^{l}\) is the bias; \(f\) is a nonlinear activation function; \(l\) denotes the l-th layer in the network; and \({M}_{j}\) is the set of input feature maps.
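Eq. (15) for a single feature map can be sketched directly. The ReLU activation and the edge-detecting kernel are assumptions for illustration, not the paper's architecture.

```python
import numpy as np

def conv_feature(x, k, b, f=lambda z: np.maximum(z, 0.0)):
    """One feature map of Eq. (15): valid-mode correlation of input x with
    kernel k, plus bias b, passed through the activation f (ReLU assumed)."""
    H, W = x.shape
    h, w = k.shape
    out = np.empty((H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + h, j:j + w] * k) + b
    return f(out)

# a vertical step edge is picked out by a [-1, 1] kernel
x = np.zeros((4, 4))
x[:, 2:] = 1.0
fmap = conv_feature(x, np.array([[-1.0, 1.0]]), b=0.0)
```

The feature map responds only where the local pattern matches the kernel, which is exactly the "local feature extraction" role described above.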
The pooling layer comes in two forms, maximum pooling and average pooling; it retains the main features while reducing the number of network parameters and the amount of computation, which helps avoid overfitting [28]. The fully connected layer further processes the output features of the pooling layer and feeds them into the classifier for classification.
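A non-overlapping max-pooling pass (window = stride = 2, an assumed configuration) can be sketched as:

```python
import numpy as np

def max_pool(x, s=2):
    """s x s max pooling with stride s: keeps the dominant activation in
    each window while shrinking the map by a factor of s per axis."""
    H, W = x.shape
    H2, W2 = H // s, W // s
    return x[:H2 * s, :W2 * s].reshape(H2, s, W2, s).max(axis=(1, 3))

pooled = max_pool(np.arange(16.0).reshape(4, 4))
```

A 4×4 map shrinks to 2×2, quartering the activations passed to the next layer while keeping each window's strongest response.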