Fault identification plays a vital role in improving the safety and reliability of industrial machinery. Deep learning has emerged as a promising approach to fault detection and has shown strong performance on this task. However, noise and variable working conditions often limit the effectiveness of these approaches. This study addresses these limitations by combining signal processing methods with neural networks. Specifically, the proposed methodology uses maximum overlapping discrete wavelet packet decomposition (MODWPD) to decompose the raw vibration signal, Mel-frequency cepstral coefficient (MFCC) mapping for time-frequency feature extraction, and a fusion of a convolutional neural network with a bidirectional long short-term memory network (CNN-BiLSTM) to capture local features and temporal dependencies in sequential data. The method is evaluated on two diverse experimental datasets under unexpected operating conditions: PHM2009 for mixed defects and Case Western Reserve University (CWRU) for bearing faults. Tested with stratified K-fold cross-validation, the proposed method outperforms a leading state-of-the-art model.
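To make the evaluation protocol concrete, the following is a minimal sketch of stratified K-fold splitting, which keeps the proportion of each fault class roughly equal across folds. It is an illustration only, not the authors' implementation: the function name `stratified_kfold`, the parameters `k` and `seed`, and the toy labels are all assumptions introduced here, and the sketch uses only the Python standard library.

```python
import random
from collections import defaultdict


def stratified_kfold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs whose folds roughly preserve class proportions.

    Hypothetical helper for illustration; labels must be sortable (e.g. ints or strings).
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Shuffle within each class, then deal indices round-robin into k folds
    # so every fold receives a near-equal share of each class.
    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        idxs = by_class[label]
        rng.shuffle(idxs)
        for pos, idx in enumerate(idxs):
            folds[pos % k].append(idx)

    # Each fold serves once as the test set; the remaining folds form the training set.
    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        yield train_idx, test_idx


if __name__ == "__main__":
    # Toy, made-up labels: three fault classes with unequal sizes.
    toy_labels = [0] * 10 + [1] * 5 + [2] * 5
    for fold, (train_idx, test_idx) in enumerate(stratified_kfold(toy_labels, k=5)):
        print(fold, len(train_idx), [toy_labels[i] for i in test_idx])
```

In practice each training split would be used to fit the model and each test split to score it, with the per-fold scores averaged. Stratification matters for fault datasets because some fault classes may have far fewer samples than others, and an unstratified split could leave a fold with no examples of a rare class.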