Automated fault diagnosis algorithms based on vibration sensor recordings play an important role in determining the state of health of machines. Data-driven approaches require a large number of labelled samples to build reliable models. When such lab-trained models are deployed in practical use cases, their performance largely depends on how well they generalise to target-domain datasets whose distributions differ from that of the source domain. Conventional parameter-based transfer learning techniques address these challenges to some extent by fine-tuning the parameters of the deeper (dense) layers of pre-trained networks, even though the lower (convolutional) layers can capture more domain-specific information about the changing target-domain distribution. In this work, we present a novel deep transfer learning strategy that fine-tunes the parameters of the convolutional layers and transfers the knowledge in the dense layers from the source domain for efficient domain generalisation and fault classification. We evaluate the strategy on two different target-domain datasets by studying the sensitivity of fine-tuning individual layers of the networks, using time-frequency representations of the vibration signals (scalograms) as inputs. We observe that this strategy yields near-perfect accuracy, even in use cases where low-precision sensors are used for data collection, and where unlabelled run-to-failure data with a limited number of training samples are available.
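The layer-selection idea can be illustrated with a minimal, framework-free sketch. It only models which layers are marked trainable under the conventional scheme, under the strategy described above, and in a per-layer sensitivity study. The network layout (three convolutional and two dense layers), the layer names, and all function names are hypothetical and chosen for illustration; they are not taken from the work itself.

```python
from dataclasses import dataclass, replace
from typing import Dict, List


@dataclass(frozen=True)
class Layer:
    name: str
    kind: str               # "conv" or "dense"
    trainable: bool = True  # whether target-domain training updates this layer


def conventional_plan(layers: List[Layer]) -> List[Layer]:
    """Conventional fine-tuning: freeze conv layers, update dense layers."""
    return [replace(l, trainable=(l.kind == "dense")) for l in layers]


def proposed_plan(layers: List[Layer]) -> List[Layer]:
    """Strategy sketched here: update conv layers, keep source-domain dense weights."""
    return [replace(l, trainable=(l.kind == "conv")) for l in layers]


def single_layer_plans(layers: List[Layer]) -> Dict[str, List[Layer]]:
    """Sensitivity study: one plan per conv layer, with only that layer trainable."""
    plans = {}
    for target in layers:
        if target.kind != "conv":
            continue
        plans[target.name] = [
            replace(l, trainable=(l.name == target.name)) for l in layers
        ]
    return plans


def trainable_names(plan: List[Layer]) -> List[str]:
    """Names of the layers that a plan marks as trainable."""
    return [l.name for l in plan if l.trainable]


# Hypothetical pre-trained source-domain network (assumed layout).
PRETRAINED = [
    Layer("conv1", "conv"),
    Layer("conv2", "conv"),
    Layer("conv3", "conv"),
    Layer("fc1", "dense"),
    Layer("fc2", "dense"),
]

if __name__ == "__main__":
    print("conventional:", trainable_names(conventional_plan(PRETRAINED)))
    print("proposed:    ", trainable_names(proposed_plan(PRETRAINED)))
    for name, plan in single_layer_plans(PRETRAINED).items():
        print(f"only {name}:", trainable_names(plan))
```

In a deep learning framework, the `trainable` flag would typically map onto the mechanism that excludes parameters from gradient updates, for example the `requires_grad` attribute of parameters in PyTorch.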