Bearings are indispensable supporting elements in mechanical systems, and their running state directly affects the overall operational performance of the equipment [1]. However, harsh operating environments significantly increase the risk of bearing failure. Such faults reduce operational reliability and can even cause catastrophic accidents [2–4]. It is therefore essential to develop a high-performance fault diagnosis approach that identifies bearing faults promptly and accurately to ensure smooth operation of mechanical equipment [5, 6]. With the rapid advancement of information technology, deep learning-based approaches have risen sharply in popularity and are increasingly used for industrial process fault diagnosis. However, many conventional fault diagnosis approaches assume that the training and test sets are independent and identically distributed, that is, that both consist of vibration data measured under identical working conditions. In practical engineering applications, mechanical equipment generally runs with different operating parameters to adapt to varying task requirements. Bearings therefore experience variable operating conditions (including radial force, speed, load torque, etc.), which limits the applicability of conventional deep learning-based methods [7–9]. It is thus necessary to develop a diagnosis method with strong generalization ability, one that still performs well under one working condition after being trained under another.
The adversarial-based domain adaptation strategy, also known as the domain adversarial neural network (DANN) approach, can classify fault types accurately under a variety of operating conditions. It employs an adversarial learning mechanism to reduce the discrepancy between the source-domain and target-domain distributions and thereby extract domain-independent features, i.e., features that remain invariant across domains [20–23]. Ganin et al. [24] first introduced the adversarial mechanism into domain adaptation to propose the DANN method, which has been successfully applied to sentiment analysis and image recognition tasks. Mao et al. [25] combined the strong adaptation ability of DANN with structured correlation information between multiple fault types to construct a new loss function with discriminative regularizers, enhancing the effectiveness of transfer learning. Chen et al. [26] developed a collaborative diagnosis framework that integrates domain knowledge from multiple sources by combining the edge confrontation module and an internal confrontation mechanism. Unlike metric-based methods, DANN uses neural networks and adversarial training to achieve adaptive feature alignment. Nevertheless, in fault diagnosis tasks the effectiveness of the DANN method depends heavily on the quality of the raw data and on the size of the distribution discrepancy between domains. For some complex diagnosis tasks, the DANN method needs further improvement to extract domain-independent features with rich discriminative information.
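For reference, the adversarial mechanism described above is commonly written as a saddle-point problem. The block below is a sketch in notation that is not defined elsewhere in this section and is assumed here for illustration: \theta_f, \theta_y and \theta_d are the parameters of the feature extractor, label (fault) classifier and domain discriminator; L_y and L_d are the classification and domain losses; n and n' are the numbers of source and target samples with N = n + n'; and \lambda is a trade-off weight.

```latex
E(\theta_f, \theta_y, \theta_d)
  = \frac{1}{n} \sum_{i=1}^{n} L_y^{i}(\theta_f, \theta_y)
  - \lambda \left(
      \frac{1}{n} \sum_{i=1}^{n} L_d^{i}(\theta_f, \theta_d)
    + \frac{1}{n'} \sum_{i=n+1}^{N} L_d^{i}(\theta_f, \theta_d)
    \right),
\qquad
(\hat{\theta}_f, \hat{\theta}_y) = \arg\min_{\theta_f, \theta_y} E(\theta_f, \theta_y, \hat{\theta}_d),
\qquad
\hat{\theta}_d = \arg\max_{\theta_d} E(\hat{\theta}_f, \hat{\theta}_y, \theta_d).
```

Under this reading, the discriminator is trained to tell the domains apart, while the feature extractor is trained to make that task harder and, at the same time, to keep the labeled source samples classifiable.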
In the field of fault diagnosis, numerous studies have demonstrated the outstanding performance of the convolutional neural network (CNN) in extracting local or spatial features from vibration data. The feature pyramid network (FPN) [27], which is built on the CNN, is essentially a hierarchical neural network. Its top-down architecture with lateral connections significantly enhances the model's ability to mine abstract features. However, a CNN cannot fully exploit the temporal characteristics of vibration data, which are what most distinguish vibration data from other data types such as images [28]. Long short-term memory (LSTM) and its improved versions are variants of the recurrent neural network (RNN) that address the gradient explosion/vanishing problem encountered by RNNs and extract hidden temporal features from the input data [29, 30]. LSTM and its improved versions are therefore well suited to data with pronounced temporal characteristics. Hence, a hybrid network combining a CNN and a bidirectional LSTM (BiLSTM) is constructed to form the CNN-BiLSTM module, which allows comprehensive extraction of temporal and spatial features.
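The data flow of such a hybrid can be illustrated with a minimal pure-Python sketch: a 1-D convolution extracts local features, and two LSTM passes (one forward, one backward in time) are concatenated at every step. This is not the configuration used in this paper; the names `conv1d`, `TinyLSTM` and `cnn_bilstm`, the kernel, the stride, the hidden size and the random untrained weights are all assumptions chosen only to show the shapes involved.

```python
import math
import random


def conv1d(signal, kernel, stride=1):
    """Valid 1-D cross-correlation with a single kernel, followed by ReLU."""
    k = len(kernel)
    out = []
    for start in range(0, len(signal) - k + 1, stride):
        s = sum(signal[start + j] * kernel[j] for j in range(k))
        out.append(max(0.0, s))
    return out


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


class TinyLSTM:
    """Single-layer LSTM with one scalar input per step (random, untrained weights)."""

    def __init__(self, hidden, rng):
        self.hidden = hidden
        # For each of the 4 gates and each unit: [w_x, w_h[0..hidden-1], bias].
        self.w = [[[rng.uniform(-0.5, 0.5) for _ in range(hidden + 2)]
                   for _ in range(hidden)] for _ in range(4)]

    def _pre(self, gate, unit, x, h):
        row = self.w[gate][unit]
        return row[0] * x + sum(row[1 + j] * h[j] for j in range(self.hidden)) + row[-1]

    def run(self, seq):
        n = self.hidden
        h, c = [0.0] * n, [0.0] * n
        states = []
        for x in seq:
            i = [sigmoid(self._pre(0, u, x, h)) for u in range(n)]    # input gate
            f = [sigmoid(self._pre(1, u, x, h)) for u in range(n)]    # forget gate
            g = [math.tanh(self._pre(2, u, x, h)) for u in range(n)]  # candidate cell
            o = [sigmoid(self._pre(3, u, x, h)) for u in range(n)]    # output gate
            c = [f[u] * c[u] + i[u] * g[u] for u in range(n)]
            h = [o[u] * math.tanh(c[u]) for u in range(n)]
            states.append(h)
        return states


def cnn_bilstm(signal, kernel, hidden=3, seed=0):
    """Convolve, then concatenate forward and backward LSTM states per time step."""
    rng = random.Random(seed)
    feats = conv1d(signal, kernel, stride=2)
    fwd = TinyLSTM(hidden, rng).run(feats)
    # Run on the reversed sequence, then flip back so indices align in time.
    bwd = TinyLSTM(hidden, rng).run(feats[::-1])[::-1]
    return [f + b for f, b in zip(fwd, bwd)]
```

Each output step then carries `2 * hidden` values: the forward half summarizes the signal up to that step and the backward half summarizes what follows it, which is what lets the module use temporal context in both directions.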
In summary, this paper proposes a new hierarchy-based domain adversarial neural network (H-DANN) method to meet the challenge of bearing fault diagnosis under variable operating conditions. The H-DANN model comprises three modules: a hierarchy-based feature extractor, a fault classifier and a domain discriminator. The hierarchy-based feature extractor is built mainly on the FPN model and uses its hierarchical network to extract multi-scale fault characteristics. A CNN-BiLSTM module that extracts spatio-temporal features is integrated into the FPN model to enhance its feature extraction capability. The fault classifier processes the features obtained by the feature extractor to identify the different fault types. The contributions of this article are as follows:
1) The CNN-BiLSTM network is constructed and added to the FPN model to form a hierarchy-based feature extractor, which can mine the multi-scale spatio-temporal features from the raw vibration data.
2) A novel H-DANN method, based on a modified FPN model and the DANN model, is developed to capture rich domain-independent features and thereby realize bearing fault diagnosis under different operating conditions.
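How a feature extractor, a fault classifier and a domain discriminator interact during adversarial training can be sketched with a toy scalar model. The code below is only an illustration under strong assumptions, not the H-DANN implementation: each module is reduced to a single weight (`w`, `a`, `d`), the losses are binary cross-entropy, and the function name `dann_step`, the learning rate `lr` and the trade-off weight `lam` are hypothetical.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def dann_step(params, source, target, lam=1.0, lr=0.1):
    """One gradient update of a toy scalar DANN.

    params: {'w': feature extractor, 'a': fault classifier, 'd': domain discriminator}
    source: list of (x, fault_label) pairs with labels in {0.0, 1.0}
    target: list of unlabeled x values
    """
    w, a, d = params["w"], params["a"], params["d"]

    # Fault-classification gradients, computed on labeled source samples only.
    g_a = g_w_cls = 0.0
    for x, y in source:
        z = w * x                      # extracted feature
        err = sigmoid(a * z) - y       # d(BCE)/d(logit)
        g_a += err * z / len(source)
        g_w_cls += err * a * x / len(source)

    # Domain-discrimination gradients (source labeled 0, target labeled 1).
    domain_data = [(x, 0.0) for x, _ in source] + [(x, 1.0) for x in target]
    g_d = g_w_dom = 0.0
    for x, s in domain_data:
        z = w * x
        err = sigmoid(d * z) - s
        g_d += err * z / len(domain_data)
        g_w_dom += err * d * x / len(domain_data)

    return {
        "a": a - lr * g_a,             # classifier minimizes the fault loss
        "d": d - lr * g_d,             # discriminator minimizes the domain loss
        # Gradient reversal: the extractor descends on the fault loss but
        # ascends on the domain loss, scaled by lam.
        "w": w - lr * (g_w_cls - lam * g_w_dom),
    }
```

The only place where the adversarial coupling appears is the sign flip on the domain gradient in the `w` update; with `lam=0` the step reduces to ordinary supervised training on the source domain.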
The rest of this article is structured as follows. Section 2 briefly introduces the relevant theories. Section 3 details the fault diagnosis architecture of the proposed H-DANN model. Section 4 presents case studies on the Case Western Reserve University (CWRU) and Paderborn University (PU) datasets. Section 5 concludes the paper.