Broken Tooth Gear Fault Detection Using Vibration Signals Based on Convolutional Neural Network

doi:10.21203/rs.3.rs-4949296/v1

Download PDF

Research Article

Broken Tooth Gear Fault Detection Using Vibration Signals Based on Convolutional Neural Network

https://doi.org/10.21203/rs.3.rs-4949296/v1

This work is licensed under a CC BY 4.0 License

You are reading this latest preprint version

Gear faults are a major concern in industrial settings, leading to performance degradation and potential system failures. This paper explores the use of Convolutional Neural Networks (CNNs) for broken tooth fault detection in gear systems. Traditional methods for fault detection rely on manual feature extraction from vibration signals, which can be time-consuming and may not capture all relevant information. CNNs, on the other hand, can automatically learn complex patterns from data, making them well-suited for this task. In this paper we develop a computationally tractable deep learning (DL) based CNN model that can be used for broken tooth fault diagnosis in various industrial settings, irrespective of the type of gearbox or gears being used. The authors further compare the performance of developed CNN model with traditional signal processing techniques and Support Vector Machine (SVM)-based classification. The CNN model achieved superior accuracy (98.6%) in distinguishing between broken and healthy teeth across various operating conditions for one experimental setup. For a second setup with a less severe broken tooth fault, the accuracy was 93%. In contrast, SVM models achieved a maximum accuracy of 90.7% using manually selected features. These findings underscore the superiority of the proposed deep learning (DL) based CNN model for broken teeth gear fault detection. Moreover, it exhibits greater resilience to fluctuations in operating conditions and fault types compared to conventional techniques. Comparisons with established deep learning models such as VGG16, AlexNet etc. demonstrate that the proposed model surpasses all others in terms of classification accuracy.

Gear Fault

Convolutional Neural Networks (CNN)

Support Vector Machine (SVM)

Wavelet Denoising

Broken Teeth Gear Fault Detection

Gear systems play a pivotal role in diverse industrial applications, spanning from machinery within manufacturing plants to propulsion systems in vehicles. The dependable performance of these systems is paramount for ensuring operational efficiency and safety (Loganathan et al. 2016). Nonetheless, gears are inherently susceptible to wear, defects and other types of faults over time, which may result in performance degradation and potential system failures (Goswami and Nandan Rai 2023). In recent years, there has been a burgeoning interest in utilising advanced technologies for condition monitoring and fault detection in gear systems. Among these technologies, Convolutional Neural Networks (CNNs) have emerged as potent tools for automated feature extraction and pattern recognition within intricate data sets (Liu et al. 2018; Hu et al. 2023; Vashishtha et al. 2023; Patil et al. 2024). CNNs, as a class of deep learning algorithms, have demonstrated remarkable efficacy in various image and signal processing tasks, rendering them well-suited for gear fault detection(Lupea and Lupea 2023).

The most common gear fault to occur in heavily loaded gear is the pitting fault, followed by wear, plastic deformation, and breakage (Engelhardt et al. 2017; Goswami and Nandan Rai 2023). Since it is last in the hierarchy of failure modes as per occurrence, most works on gears are supposedly for pitting and wear tooth faults, and few studies are readily available for broken tooth faults diagnosis (Goswami and Rai 2016; Kumar et al. 2020).(Yang and Wu 2015) introduced a methodology based on Hilbert-Huang Transform (HHT) to extract features for identifying gear faults, including wear, broken tooth, and unbalance, in spur gears. The features were extracted from time domain signals, IMF (Intrinsic Mode Functions) signals, envelope signals, and marginal envelope spectrum. The characteristics underwent Principal Component Analysis (PCA) before being utilized in an Artificial Neural Network (ANN) for flaw identification.(Feng et al. 2016) introduced non-zero ratios to examine the accumulation of frequencies. Subsequently, the Graph Fourier Transform (GFT) of a vibration signal from a gearbox is computed. At lower frequencies, an improvement in frequency accumulation was observed. Subsequently, a signal reconstruction was performed for Hilbert demodulation analysis, starting from lower orders, to identify problems in the gear and bearing. Utilizing their approach, they successfully detected gear teeth damage and outer race abnormalities in the gear.(Li et al. 2019) used data-driven time-frequency analysis on broken teeth gear and cracked teeth gear to evaluate fault in a variable speed gear setup.(Jiang et al. 2019) employed CEEMDAN and Permutation Entropy (PE) along with SVM for broken tooth fault detection in helical gears. The authors' proposed method consisting of three steps: the signals are demodulated using CEEMDAN, evaluated for PE of the demodulated signals, and finally applied SVM to classify broken teeth. In all the above-mentioned research works based on vibration signal processing and SVM-based machine learning approach, the author employed a minimum of the two-stage process for gear broken teeth fault diagnosis. Moreover, the selected feature may work in some cases and may not work for other gear faut cases. In(LI et al. 2020) authors proposed an augmented deep sparse autoencoder (ADSAE), for pitting fault severity detection. Sparse Autoencoders rely on unsupervised learning for compact representation, potentially limiting their ability to capture intricate hierarchical structures, especially in image data. Thus, in tasks requiring complex pattern recognition and spatial hierarchies, such as image classification, CNNs outperform Sparse Autoencoders. You et al.(You et al. 2020) enhanced the rectified linear unit (RELU) activation function in a convolutional neural network (CNN) to improve defect identification. The efficacy of the technique was demonstrated on a car transmission gearbox exhibiting three distinct levels of tooth wear and an instance of damaged teeth. Amin et al.(Amin et al. 2024) used cyclostationary with interpretable CNN analysis of sensor data to achieve early damage detection in wind turbine gearboxes, facilitating high accuracy fault identification even for small magnitude faults under diverse operating conditions. However, reliance on cyclostationary analysis may introduce computational complexities, potentially impeding real-time monitoring and necessitating robust computational resources. All the work as depicted above on gear broken fault diagnosis, relies on the complicated process of denoising vibration signals, deconvolving the vibration signals, selecting appropriate features for machine learning models, selecting appropriate hyperparameters for tuning deep learning models, etc. Moreover, most of the works on gear broken teeth fault detection are based on single operating conditions of gears, which might not be the case in real-life scenarios. To overcome these limitations, in this work, a simple yet effective CNN-based model is designed for broken teeth fault diagnosis. To assess the network's performance across different gear designs and operating conditions, two gear fault datasets were employed for training and testing. The first dataset included data from nine distinct operating conditions, featuring spur gears in a multi-stage gearbox subjected to various speeds and loads. In contrast, the second dataset focused solely on helical gears in a single stage gearbox operating under a single fixed condition with one specific speed and load setting. The more details about the datasets can be found in Section 3 of the manuscript.

Traditional methods like vibration signal analysis(Zhang et al. 2019; Goswami and Rai 2024a) and machine learning-based algorithm fault detection often necessitate manual feature engineering, where domain experts meticulously extract specific features from raw signals to represent fault-related characteristics (Goswami et al. 2022). This manual process can be time-consuming and may not comprehensively capture all pertinent information. The automated feature learning capability of CNNs enable them to discern complex and nuanced fault patterns that may prove challenging to identify through manual feature selection. Vibration signals in industrial contexts often manifest hierarchical and multi-scale patterns in fault signatures. Traditional signal processing methods encounter difficulties in capturing these intricate relationships effectively until subjected to further processing with appropriate denoising algorithms. In contrast, CNN architectures, equipped with convolutional layers, are expressly designed to learn hierarchical representations from data. These convolutional layers facilitate the identification of local patterns, progressively combining them to recognize higher-level features. Furthermore, the adaptability of CNNs is noteworthy, as vibration signals can vary significantly based on the type of fault, its severity, and the operational conditions of the machinery. Once trained on diverse datasets encompassing various fault conditions, CNNs exhibit a high degree of generalization, rendering them suitable for applications where different fault types may occur under varying operating conditions. The robustness of CNNs to noise is a significant advantage, as they demonstrate the capacity to focus on relevant fault patterns while filtering out irrelevant fluctuations in vibration signals. The research contribution of the paper lies in developing a computationally tractable deep learning (DL) based CNN model that can be used for training the vibration data for fault detection in various industries where gear fault detection can be done without the computational constraints and also where the data availability is sparse. The input to the CNN will be the spectrogram images generated using short time Fourier transforms (STFT). Further ,the proposed model is compared extensively with traditional machine learning (ML) models as well as DL models to substantiate the superiority of the model.

The paper is structured into five additional sections. Section 2 provides a concise overview of the theoretical foundation of the algorithms employed in the study. Section 3 explains the experimental setups and conditions of data collection. The methodology utilized in the paper is expounded in Section 4, while Section 0 is dedicated to presenting results and discussions. Lastly, Section 6 serves as the conclusion of the paper.

This section is dedicated to explaining the theoretical foundations of the models and concepts employed in this paper. The investigation incorporates three distinct facets for diagnosing broken teeth faults: vibration signal analysis, a machine learning approach based on Support Vector Machines (SVM), and a deep learning approach based on Convolutional Neural Networks (CNN). The subsequent elucidation provides a brief overview of the detailed methodology, encompassing all utilized methods, with the exception of CNN, which has already been discussed in section 1. This section also provides a concise overview of the spectrogram image data used as input for the CNN model.

2.1. Vibration Signal Analysis:

Vibration signal analysis for gears involves studying oscillations in signals generated by gear systems, with a focus on critical factors such as Gear Mesh Frequency (GMF). The analysis includes time-domain and frequency-domain techniques to extract information about gear conditions. Time domain features such as Kurtosis, Crest Factor, and RMS provide crucial insights into the shape and magnitude of vibration signals, aiding in the identification of abnormalities in machinery. These features capture variations in amplitude and waveform characteristics. Additionally, frequency domain features like Gear Mesh Frequency (GMF) pinpoint specific frequencies related to gear interactions, helping to detect faults associated with gear teeth. Integrating both time and frequency domain features enhances the effectiveness of fault diagnosis, enabling a comprehensive assessment of machinery health based on vibration analysis. Techniques like wavelet denoising are often employed to enhance fault detection accuracy, aiming to reduce noise and highlight relevant features in the vibration signals. This approach is essential for diagnosing gear faults, ensuring optimal performance, and preventing potential machinery failures through proactive maintenance.

Gear Meshing Frequency

Gear Mesh Frequency (GMF) is the fundamental frequency at which gear teeth engage and disengage in rotating machinery, representing the cyclic pattern of gear interactions. The Eq. (1) represents the GMF, where is the number of teeth on the gear and is the rotating frequency of the shaft.

$$GMF=n \times {f_s}$$

Kurtosis

Kurtosis, in the context of gear fault diagnosis, quantifies the sharpness and peakedness of the vibration signal's distribution. Higher kurtosis values indicate increased impulsiveness in the signal, aiding in the identification of abrupt changes associated with gear faults. Eq. (2) is used to calculate the Kurtosis value of a signal, where represents the number of data points in the signal, ${x_i}$represent each individual data point and $\bar {x}$ is the mean of the signal.

$$Kurtosis(K)=\frac{{\frac{1}{n}\sum\limits_{{i=1}}^{n} {{{({x_i} - \bar {x})}^4}} }}{{{{\left( {\frac{1}{n}\sum\limits_{{i=1}}^{n} {{{({x_i} - \bar {x})}^2}} } \right)}^2}}}$$

Root Mean Square (RMS)

RMS in the context of gear fault diagnosis, quantifies the square root of the average of the squared amplitudes of a vibration signal. RMS provides a measure of the signal's overall magnitude, aiding in assessing baseline vibration levels and detecting deviations associated with gear faults. Eq. (3) is used to calculate the RMS value, where represents the total number of data points and represent each individual data point

$$RMS=\sqrt {\frac{1}{n}\sum\limits_{{i=1}}^{n} {{{({x_i})}^2}} }$$

Crest Factor

Crest Factor, in the context of gear fault diagnosis, measures the ratio of the peak amplitude of a vibration signal to its root mean square (RMS) value. A higher Crest Factor suggests sharper and more impulsive characteristics in the signal, aiding in the detection of sudden changes or anomalies associated with gear faults. Eq. (4) is used to calculate the crest factor of a signal.

$$CrestFactor(CF)=\frac{{PeakValue}}{{RMSvalue}}$$

2.2. Support Vector Machine

The Support Vector Machine (SVM) models utilized in this study encompass a range of configurations designed to optimize fault identification in gear systems. SVMs are a class of supervised machine learning algorithms known for their efficacy in classification tasks. The models employed in this investigation include linear SVM, quadratic SVM, cubic SVM, fine Gaussian SVM, medium Gaussian SVM, and coarse Gaussian SVM. Each variant represents a distinct kernel function, reflecting different strategies for transforming input data into higher-dimensional spaces to enhance the separability of classes. The linear SVM employs a linear kernel, effectively creating linear decision boundaries, while the quadratic, cubic, fine Gaussian, medium Gaussian, and coarse Gaussian SVMs leverage polynomial and radial basis function (RBF) kernels to capture non-linear relationships within the data. These SVM models are systematically applied to the dataset, and their comparative performance in fault identification is scrutinized. The selection of diverse SVM configurations aims to clarify the impact of kernel functions on classification accuracy, thereby providing valuable insights into the nuanced behaviour of SVMs in the context of gear fault detection. This exploration contributes to the ongoing refinement of machine learning methodologies for robust and accurate fault diagnosis in complex engineering systems. More about SVM can be found in (Goswami et al. 2022).

2.3. Spectrograms

In gear fault diagnosis, spectrograms are crucial for visualizing the time-frequency content of vibration signals. They reveal how different frequencies evolve over time, allowing user to identify specific fault signatures like mesh frequencies and harmonics. This frequency analysis allows for the identification of various types of gear faults, including pitting, chipping, breakage and misalignment, each of which generates distinct frequency patterns in the vibration signal, which generally is missed in regular FFT analysis. Window length and overlap percentage are critical parameters during spectrogram generation. Window length influences the trade-off between frequency resolution and temporal resolution. A longer window offers better frequency resolution but reduces temporal detail. Overlap percentage between consecutive windows affects spectral smearing. Higher overlap reduces smearing but increases computational cost (Manhertz and Bereczky 2021). Selecting appropriate window length and overlap is essential for capturing characteristic fault signatures with sufficient clarity for accurate diagnosis.

This study utilizes data from two experimental setups sourced from different institutions. The first dataset originates from an in-house experimental setup developed at the Indian Institute of Technology, Kharagpur. This setup allowed the authors to generate their own data. The second dataset leverages an online repository developed by Amirkabir University of Technology.

3.1. Experimental Setup 1

Figure 1 illustrates the setup of the gearbox test rig utilized for experiments conducted at the Indian Institute of Technology, Kharagpur.

Figure 1: The Experimental Setup 1

Figure 2: The gears used in the experiment: (i) good gear (B) (ii) broken teeth gear(B)

The experimental configuration comprises a 15-kilowatt three-phase induction motor with a maximum power output of 20 horsepower and a maximum rated current of 26A. To regulate the motor's speed within the 0 to 50 Hertz (Hz) range with a precision of 0.1 Hz, a Variable Frequency Drive (VFD) is employed, given the motor's constant rotational speed of 3000 rpm. The motor is mechanically coupled to an industrial gearbox featuring a 2-stage and 4-speed design that employs spur gears. The output shaft of the gearbox interfaces with an eddy current dynamometer capable of generating a braking torque of 125Nm. The accelerometers operate at a sampling frequency of 20kHz, and data acquisition is facilitated through the National Instruments (NI-9252) Data Acquisition (DAQ) system. Figure 2 provides a visual representation of the two gears; one is in good condition, and the other has a broken tooth configuration. Figure 3 shows the connected gears for the experiment with the number of teeth as B = 23, B’=29, C = 25 and C’=20. The effective selection of sensor is done based on (Goswami and Rai 2024b).

3.2. Experimental Setup 2

Figure 4 illustrates the experimental setup devised at Amirkabir University of Technology (Zamanian and Ohadi 2011, 2016). This setup comprises an electromechanical unit featuring a gearbox connected to a 3-phase electrical motor at the input and a disk brake loading system at the output. The motor maintains a constant rotational speed of 1420rpm. Vibration data was collected using an accelerometer over a 10-second period, sampled at 10kHz.

The gearbox configuration employs a single-stage helical gearbox with a gear ratio of 7.33:1. To simulate a fault in the gear, a deliberate defect was introduced wherein 50% of the area from the top land of the pinion tooth profile to the pitch surface was removed in a straight line with a decreasing slope as shown in Fig. 5. This fault is referred to as a 'chipped tooth fault' by the authors of the experimental setup (Zamanian and Ohadi 2011, 2016). However, it could also be characterized as a 'broken teeth fault' of lesser severity, involving the chipping off of a small section of the tooth from the parent structure.

At first the experimental dataset 1 was used for analysis. Vibration signals were generated under diverse operating conditions for the experimental setup 1, as shown in section 3.1. The operating conditions are represented as nine different cases appended in Table 1. Data collection was conducted for a duration of 10 seconds for each condition, encompassing both the fault conditions of the gears—namely, good teeth condition and broken teeth condition. As the accelerometers are sampled at 20kHz, for every 10 seconds of data being collected, 0.2 million data points are generated for each of the nine cases.

Table 1

The experiment conditions used for this experiment
Cases	Speed (rpm)	Torque (Nm)	Cases	Speed (rpm)	Torque (Nm)	Cases	Speed (rpm)	Torque (Nm)
1	1000	0	4	1500	0	7	3000	0
2	1000	20	5	1500	20	8	3000	20
3	1000	40	6	1500	40	9	3000	40

The obtained vibration signals undergo conversion into Time-Frequency Resolution (TFR) maps utilizing the Short Time Fourier Transform (STFT). The selection for the window length in the STFT analysis is established at 0.1 seconds, enabling the capture of 2000 data points for the generation of each STFT image (Default 50% overlap in STFT image generation was selected). The selection of 2000 data points for analysis is motivated by the need to capture at least one gear meshing frequency (GMF) from each operating condition. Higher rotational speeds (rpm) during the experiment will result in a greater number of GMFs being recorded within the 2000 datapoint window. Additionally, each mesh between a faulty tooth and its healthy counterpart is expected to generate a transient signal with higher intensity and that will be captured by the spectrogram for faulty gear case. Importantly, meshing gear teeth are prime-numbered, every tooth on the faulty gear will mesh with a healthy tooth during each shaft rotation, ensuring consistent transient signatures within the data window. Consequently, 100 TFR images are generated for each gear condition. These images from nine different operating cases and two different sets of gear conditions add to a total of eighteen hundred (1800) images. Nine different operating conditions are used to make the CNN model robust against load and speed fluctuation, which is generally seen in gearboxes used in the industries. The methodology is shown in Fig. 6. The images are then fed to a deep learning-based CNN model to classify the faults. The proposed CNN model has seven layers: an image input layer, a convolution-2D layer, a batch normalization layer, a RELU layer, a fully connected layer, a SoftMax layer and classification layers, as shown in Fig. 7. The Fig. 8 shows the details of parameters at each level.

The same segment (2000 data points) of vibration signals is being used to evaluate kurtosis, crest factor and Root Mean Square (RMS) values and used as input features for the SVM model for gear fault classification. These three parameters are selected based on (Sharma and Parey 2016), where authors have recommended using these three parameters for gear broken teeth condition. Moreover, the vibration signal processing technique with raw vibration signal and using wavelet denoising is further analysed for visual fault detection in the frequency spectrum. All three different methods are utilized for broken tooth fault detection in gear vibration signals.

After finalising the parameters for fault detection using CNN is achieved, the same was implemented in experimental setup 2 as depicted in Section 3.2 to validate the efficacy of the developed CNN model.

As discussed, the first experimental setup data is being used for the analysis. Figure 9 presents the time domain plot and frequency domain plot of the raw vibration signals for case 7 under both gear conditions. The analysis reveals that both the time domain (TD) plot and the frequency domain (FD) spectrum plot fail to provide information pertaining to Gear Meshing Frequencies (GMF) and associated fault frequencies. Notably, the plots exhibit striking similarities between the broken tooth scenario and the configuration with intact teeth. This consistent pattern was also observed across all other cases; however, due to space constraints, not all figures are presented herein. In the experiment, the input shaft meshing gear has 23 teeth, and it is meshing with a 29 teeth gear at the intermediary shaft. Another gear at the intermediary shaft with 25 teeth is connected to the output shaft gear with 20 teeth, as seen in Fig. 3. So, considering case 7, i.e., the input speed of 3000rpm and no-load condition, the frequency spectrum should show peaks at gear meshing frequencies at 1150Hz and 991Hz, whereas the maximum peak is seen at 200 Hz and 1202Hz, 2343Hz from Fig. 9 which cannot be considered as fault frequencies for the broken teeth condition. Since no GMF frequencies are evident in the frequency spectrum plot, the subsequent action involves denoising the signal to enhance its information content by separating it from noise.

The initial analysis of the raw vibration signal proved inefficient in detecting fault information. Consequently, a wavelet denoising approach was employed to enhance the vibration signal. The process commenced with detrending the signals through a linear filter, followed by the application of the Symlet-4 wavelet at an 8th level. The selected denoising method was Bayes with the median rule. The resulting denoised vibration signal as depicted in Fig. 10, exhibited a notable reduction in noise compared to its pre-denoised state (Fig. 9). The Signal to Noise Ratio (SNR) for case 7 with good teeth was found to be 5.86 and for the broken teeth was 6.61. Examination of the power spectrum revealed elevated amplitudes at higher frequency levels for the broken tooth, contrasting with the predominant lower frequency amplitudes observed in the case of intact teeth. It is noteworthy that the energy levels in the power spectrum significantly moderated for the broken tooth scenario compared to the healthy teeth conditions.

To provide a more nuanced representation of the broken tooth fault, spectrogram plots were generated. Despite the effort, the spectrogram images failed to yield clear indications, as the broken teeth did not exhibit discernible effects on specific frequency ranges, as seen in (Vernekar et al. 2014). The high-amplitude frequencies associated with the broken teeth were dispersed throughout the spectrum, hindering the identification of a distinct fault frequency, as anticipated in Fig. 10.

Since traditional signal processing techniques based on raw vibration signals and then denoised vibration signals using wavelets failed to detect broken tooth faults, SVM-based and CNN-based fault detection methods, as proposed, were employed. The segments of 2000 data points, as used for STFT image generation, the same segment is being used for time domain statistical parameter evaluation. Kurtosis, Crest factor, and RMS are the three-time domain statistical parameters used as input features to the SVM models for fault identification of the gears. In this paper, we utilized linear SVM, quadratic SVM, cubic SVM, fine Gaussian SVM, medium Gaussian SVM and coarse Gaussian SVM on the dataset. The SVM model with the best classification accuracy is shown in Fig. 11. It can be seen from Fig. 8 that using all three parameters as input features, i.e., Kurtosis, Crest Factor and RMS, the Fine Gaussian model gave the best classification accuracy at 87.3%, and the Linear SVM gave the worst classification accuracy with 56.6% accuracy. For individual cases, i.e., only using Kurtosis as the input feature to the SVM model, Coarse Gaussian SVM gave the best result with 57.2% accuracy, while using Crest factor as the input feature, Coarse Gaussian SVM produced the best accuracy in result with 56.1% and by using RMS as input feature to SVM model, fine Gaussian SVM gave the best accuracy with 90.7% accuracy. The confusion matrix for all the four features is shown in Fig. 11. From Fig. 8, it can be concluded that RMS has proved to be the best-suited feature for broken tooth fault detection while employing the Fine Gaussian SVM model.

As the SVM-based classification result produced the best classification accuracy of 90.7%, the next proposed CNN model is tested on the segmented vibration signals transformed into spectrogram images. The input to the proposed CNN model was such that 80% of the total dataset (i.e., 720 images from each case, i.e. broken tooth and good teeth) are being used for training the CNN model in a randomized order. It is to be noted here that the input size of the images was 656×875 RGB. The image size was selected as bigger than usual due to the fact that the spectrogram images for both the broken tooth case and the good teeth case look similar, as shown in Fig. 12, so using smaller size images leads to poor training of the CNN model.

The proposed deep learning-based CNN model was able to classify the broken tooth fault with an overall accuracy of 98.6%; the same can be seen in Fig. 14. The confusion matrix for the classification accuracy can be seen in Fig. 15, where a total of 360 images (the remaining 20% of the dataset of 1800 images) are put to the test, out of which the trained model classified five good teeth images as broken teeth and was able to classify the broken teeth with 100% accuracy. Figure 13 represents accuracy of all the tested SVM models with different input features and also the proposed CNN model.

For experimental setup 1, the model performed splendidly, but its efficacy needs to be checked for the other experimental setup data, i.e., experimental setup 2 where the fault severity is very less and data availability is also sparse. The data generated for the experimental setup 2 for binary fault class, where chipped off broken tooth is one class and the normal teeth gear data is the other class. As data was collected for just ten seconds with 10kHz sampling rate, 0.1 million data points are being generated at once which makes the data points considerably low to generate spectrogram images. To overcome this issue, overlapping of the vibration signal is used for generating spectrogram images. With low data points available, spectrogram image generation with 2000 data points with no overlapping led to generation of only 50 images and with such less images the CNN model couldn’t learn the intricate patterns leading to an accuracy of 40.8%. In order to increase the number of images being generated, the signal is segmented into three smaller segment series of 2000, 1000 and 500 data points with overlapping percentages of 50 and 80. The details of number of images being generated by segmenting into smaller length with overlap can be seen in Table 2. Reducing the number of data points below 500 leads to poor training of the CNN model, as not a single GMF will be recorded below 423 data points.

Table 2

Signal segmentation details with STFT image parameters and it’s impact on accuracy
Fault Case	Segment Length	Overlap Percentage	STFT Window Size	STFT Window Overlapping	Generated Images	Training Set	Testing Set	Accuracy (%)
Good Teeth	2000	0	200	50	50	40	10	40.8
	2000	50	1000	50	99	79	20	53.2
	2000	80	200	80	246	196	100	84.6
	1000	80	100	80	496	396	100	91
	500	80	50	80	996	796	200	93
Broken Teeth	2000	0	200	50	50	40	10	40.8
	2000	50	100	80	99	79	20	53.2
	2000	80	200	80	246	196	100	84.6
	1000	80	100	80	496	396	100	91
	500	80	50	80	996	796	200	93

The importance of segmentation size with overlap plays a crucial role in the CNN model efficacy where data availability is low. Higher the number of images, better is the training of CNN model and hence better classification accuracy. Figure 17 shows the confusion matrix where maximum classification accuracy of 93% is attained. Observations from Fig. 16 suggest that the spectrogram images for good and broken teeth exhibit high visual similarity. This can be attributed to the minimal severity of the seeded damage in the broken tooth, resulting in its spectral characteristics closely resembling those of healthy teeth.

Figure 18 illustrates the potential similarity between spectrogram images from different clusters using k-Nearest Neighbours (k-NN) clustering with a minimum Euclidean distance metric. Here, PCA dimensionality reduction was applied to the original high-dimensional features (3792 x 574000) to a 3D space for visualization purposes.

Figure 18 presents the results of principal component analysis (PCA) applied to the extracted features, visualized as clusters in a 3D space. PCA dimensionality reduction to normalized data was employed to facilitate visualization in this three-dimensional plot. The figure reveals a distinct separation between Cluster 2, corresponding to broken teeth images from experimental setup 1, and the remaining clusters. Clusters 1, 3, and 4 (good teeth of experimental setup 1, good teeth and broken teeth of experimental setup 2 respectively) and appear more closely grouped. This suggests that the features of good teeth from both experimental setups (1 & 2) and chipped broken teeth from setup 2 are similar. Moreover, the cluster 3 and cluster 4 almost overlap each other leading to difficulty in classification for the experimental setup 2 data.

Apart from the number of images being used for training, the batch size parameter of CNN also plays a very crucial role in the training of the model. If the segmented vibration signal has smaller datapoints as in experimental setup 2, a lower batch size of images is required per epoch, conversely if the datapoints are enough as for the case of experimental setup 1 higher batch size could be used. The maximum classification accuracy for experimental setup 1 is achieved using a batch size of sixty-four, whereas for experimental setup 2 a batch size of fifteen is used per epoch.

The proposed CNN model is evaluated against existing deep learning models, such as VGG16, VGG19, AlexNet, and ResNet, which have been trained on extensive image datasets. The results of this comparison are presented in Table 3. The lower data training and testing accuracy achieved by these established models can be attributed to their reliance on transfer learning for feature extraction. These deep networks were originally trained on real-world image datasets for object recognition tasks, and their architectures may not be well-suited for processing spectrogram images. Applying transfer learning to spectrogram images may not enable them to fully capture the relevant features, leading to diminished accuracy. Additionally, training these pre-trained deep networks necessitates resizing the input spectrogram images. This resizing process can significantly reduce the information content within the images, further hindering their performance. Moreover, the number of images required as input to these pre-trained CNN models are quite high in contrast to the images generated for our case, resulting again in lower accuracy. The comparative analysis graph for all the models are shown in Fig. 19.

Table 3: Comparison Table of CNNs

CNN models	Accuracy (dataset 1)	Accuracy (dataset 2)
Proposed CNN	98.6	93
VGG16	74.2	61.8
VGG19	72.8	64.4
Alexnet	82.4	79.2
ResNet	75.6	66.6

This paper explores Convolutional Neural Networks (CNNs) as potent tools for gear fault detection, addressing the limitations of traditional methods like vibration signal processing and Support Vector Machine (SVM) classification. The initial visual analysis of the raw vibration signal proved ineffective in detecting fault information related to Gear Meshing Frequencies (GMF) and associated fault frequencies across different gear conditions. As a result of consistent similarity in plots between the broken tooth and intact teeth scenarios, the visual classification through signal processing methods, including wavelet denoising and spectrogram generation, failed to yield clear indications. SVM-based and CNN-based fault detection methods were employed in response to this limitation. For SVM models, Kurtosis, Crest Factor, and RMS are utilized as input features. The Fine Gaussian SVM model demonstrated the highest classification accuracy at 87.3%, while the Linear SVM model exhibited the lowest accuracy at 56.6%. Individual feature analysis revealed that RMS proved to be the most compelling feature for broken tooth fault detection, achieving 90.7% accuracy with the Fine Gaussian SVM model. Subsequently, the proposed CNN model, trained on segmented vibration signals transformed into spectrogram images, exhibited superior performance. With an input size of 656×875 RGB, the CNN model achieved an impressive classification efficiency of 98.6% in distinguishing between broken tooth and intact teeth cases for experimental setup 1 and 93% for experimental setup 2. The significant advantages of CNN based gear fault detection lie in:

Fine-tuning and adapting signal processing methods to different fault conditions or operating environments often require manual adjustments, like denoising the signal of noise, selecting the right tools for demodulation effect, etc. This can be time-consuming and may not guarantee optimal performance across various scenarios. Vibration signal processing struggles with generalization across diverse operating conditions, different types of faults, or variations in equipment, as seen in our case.
For machine learning-based SVM models for fault classification, the most important deciding factor is the selection of features. As seen in our case, the accuracy of fault detection ranges from 56.1–90.7% depending on the feature used and the model of SVM being selected for fault classification.
To the contrary, the CNN model demonstrated a fault detection accuracy of 98.6% across nine distinct operating cases without any preselected features for experimental setup 1 and 93% for the experimental setup 2. This implies that the CNN model maintains robust performance consistently, making it suitable for application across diverse operational conditions with high accuracy all the time.

This underscores the effectiveness of deep learning approaches, particularly CNNs, in fault detection for gear systems, surpassing traditional signal processing methods in accuracy and reliability. The only bottleneck is the data availability for training the CNN model, which can be taken care of by vibration signal segmentation with overlap as demonstrated in the case of experimental dataset 2. Once trained, the model can be successfully implemented in any industry for broken tooth fault detection as demonstrated in the paper using two different experimental setups with different operating condition and level of severity. The future scope of CNN-based fault detection lies in hyperparameter tuning of the CNN model to further improve the model's accuracy.

Declaration of Interest

All authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest or non-financial interest in the subject matter or materials discussed in this manuscript.

Funding

This research received no specific grant from any funding agency in public, commercial or not-for-profit sector.

Amin A, Bibo A, Panyam M, Tallapragada P (2024) Wind Turbine Gearbox Fault Diagnosis Using Cyclostationary Analysis and Interpretable CNN. Journal of Vibration Engineering & Technologies 12:1695–1705. https://doi.org/10.1007/s42417-023-00937-1
Engelhardt C, Witzig J, Tobie T, Stahl K (2017) Influence of Defined Water Contents in Gear Lubricants on the Pitting Performance of Case-Carburized Gears. In: Volume 10: 2017 ASME International Power Transmission and Gearing Conference. American Society of Mechanical Engineers
Feng Z, Lin X, Zuo MJ (2016) Joint amplitude and frequency demodulation analysis based on intrinsic time-scale decomposition for planetary gearbox fault diagnosis. Mech Syst Signal Process 72–73:223–240. https://doi.org/10.1016/j.ymssp.2015.11.024
Goswami P, Nandan Rai R (2023) A systematic review on failure modes and proposed methodology to artificially seed faults for promoting PHM studies in laboratory environment for an industrial gearbox. Eng Fail Anal 146:107076. https://doi.org/10.1016/j.engfailanal.2023.107076
Goswami P, Rai RN (2016) A Novel Methodology for Chipped Teeth and Worn Tooth Gear Fault Detection Using Minimum Entropy Deconvolution and Ceemdan. Available at SSRN 4556868
Goswami P, Rai RN (2024a) Efficient Vibration Signal Denoising Techniques for Effective Condition Monitoring of Gearbox. In: 2024 11th International Conference on Signal Processing and Integrated Networks (SPIN). IEEE, pp 427–432
Goswami P, Rai RN (2024b) Principal Component Analysis Based Vibration Sensor selection for Fault Diagnosis of an Industrial Gearbox. In: 2024 IEEE International Instrumentation and Measurement Technology Conference (I2MTC). IEEE, pp 1–6
Goswami P, Sahu PK, Rai RN (2022) An Optimum Segmentation of Gear Vibration Signals for an Effective Fault Classification Using Time-Domain Feature and Multi-class Support Vector Machines. pp 299–312
Hu P, Zhao C, Huang J, Song T (2023) Intelligent and Small Samples Gear Fault Detection Based on Wavelet Analysis and Improved CNN. Processes 11:2969. https://doi.org/10.3390/pr11102969
Jiang L, Tan H, Li X, et al (2019) CEEMDAN-Based Permutation Entropy: A Suitable Feature for the Fault Identification of Spiral-Bevel Gears. Shock and Vibration 2019:1–13. https://doi.org/10.1155/2019/7806015
Kumar A, Gandhi CP, Zhou Y, et al (2020) Latest developments in gear defect diagnosis and prognosis: A review. Measurement 158:107735. https://doi.org/10.1016/j.measurement.2020.107735
Li F, Li R, Tian L, et al (2019) Data-driven time-frequency analysis method based on variational mode decomposition and its application to gear fault diagnosis in variable working conditions. Mech Syst Signal Process 116:462–479. https://doi.org/10.1016/j.ymssp.2018.06.055
LI X, LI J, QU Y, HE D (2020) Semi-supervised gear fault diagnosis using raw vibration signal based on deep learning. Chinese Journal of Aeronautics 33:418–426. https://doi.org/10.1016/j.cja.2019.04.018
Liu C, Cheng G, Chen X, Pang Y (2018) Planetary Gears Feature Extraction and Fault Diagnosis Method Based on VMD and CNN. Sensors 18:1523. https://doi.org/10.3390/s18051523
Loganathan MK, Goswami P, Bhagawati B (2016) Failure Evaluation and Analysis of Mechatronics-Based Production Systems during Design Stage Using Structural Modeling. Applied Mechanics and Materials 852:799–805. https://doi.org/10.4028/www.scientific.net/AMM.852.799
Lupea I, Lupea M (2023) Detecting Helical Gearbox Defects from Raw Vibration Signal Using Convolutional Neural Networks. Sensors 23:8769. https://doi.org/10.3390/s23218769
Manhertz G, Bereczky A (2021) STFT spectrogram based hybrid evaluation method for rotating machine transient vibration analysis. Mech Syst Signal Process 154:107583. https://doi.org/10.1016/j.ymssp.2020.107583
Patil A, Soni G, Prakash A (2024) Data-driven approaches for impending fault detection of industrial systems: a review. International Journal of System Assurance Engineering and Management 15:1326–1344. https://doi.org/10.1007/s13198-022-01841-9
Sharma V, Parey A (2016) A Review of Gear Fault Diagnosis Using Various Condition Indicators. Procedia Eng 144:253–263. https://doi.org/10.1016/j.proeng.2016.05.131
Vashishtha G, Chauhan S, Kumar S, et al (2023) Intelligent fault diagnosis of worm gearbox based on adaptive CNN using amended gorilla troop optimization with quantum gate mutation strategy. Knowl Based Syst 280:110984. https://doi.org/10.1016/j.knosys.2023.110984
Vernekar K, Kumar H, Gangadharan KV (2014) Gear Fault Detection Using Vibration Analysis and Continuous Wavelet Transform. Procedia Materials Science 5:1846–1852. https://doi.org/10.1016/j.mspro.2014.07.492
Yang CY, Wu TY (2015) Diagnostics of gear deterioration using EEMD approach and PCA process. Measurement 61:75–87. https://doi.org/10.1016/j.measurement.2014.10.026
You W, Shen C, Wang D, et al (2020) An Intelligent Deep Feature Learning Method With Improved Activation Functions for Machine Fault Diagnosis. IEEE Access 8:1975–1985. https://doi.org/10.1109/ACCESS.2019.2962734
Zamanian AH, Ohadi A (2016) Gearbox Fault Detection through PSO Exact Wavelet Analysis and SVM Classifier. https://doi.org/10.13140/RG.2.1.4983.3442
Zamanian AH, Ohadi A (2011) Gear fault diagnosis based on Gaussian correlation of vibrations signals and wavelet coefficients. Appl Soft Comput 11:4807–4819. https://doi.org/10.1016/j.asoc.2011.06.020
Zhang X, Zhao J, Ni X, et al (2019) Fault diagnosis for gearbox based on EMD-MOMEDA. International Journal of System Assurance Engineering and Management 10:836–847. https://doi.org/10.1007/s13198-019-00818-5

Download PDF

Reviewers agreed at journal
26 Aug, 2024
Reviewers invited by journal
25 Aug, 2024
Editor invited by journal
21 Aug, 2024
First submitted to journal
20 Aug, 2024

You are reading this latest preprint version

Broken Tooth Gear Fault Detection Using Vibration Signals Based on Convolutional Neural Network

Status:

Version 1

Abstract

Figures

1. Introduction

2. Theoretical Background

2.1. Vibration Signal Analysis:

2.2. Support Vector Machine

2.3. Spectrograms

3. Experimental Setup

3.1. Experimental Setup 1

3.2. Experimental Setup 2

4. Methodology

5. Results and Discussion

6. Conclusion

Declarations

References

Status:

Version 1