Sensor fusion is becoming increasingly popular in condition monitoring. Many studies adopt a fusion strategy to improve decision-making and classification accuracy, and most rely on feature-level fusion with a custom-built deep learning architecture. However, this may limit the ability to exploit the pre-trained deep learning architectures widely available to users today. This study proposes a new sensor fusion method based on concepts inspired by image fusion. The method fuses spectrogram images, enabling multiple heterogeneous sensors to be combined in the time-frequency domain. Its effectiveness is tested with transfer learning (TL) techniques on four pre-trained convolutional neural network (CNN) based model architectures, using an original test environment and data acquisition system. The results show that the proposed sensor fusion technique classifies device faults effectively and that the pre-trained TL models enrich the model training capabilities.
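The core idea of fusing heterogeneous sensors as spectrogram images can be sketched as follows. This is a minimal illustration, not the paper's exact pipeline: it assumes all sensors share one sampling rate and signal length, and uses a simple channel-stacking fusion rule (the paper's actual fusion scheme may differ); the function names and parameters are hypothetical.

```python
import numpy as np
from scipy.signal import spectrogram

def spectrogram_image(x, fs):
    """Log-magnitude spectrogram, min-max normalized to [0, 1]."""
    _, _, sxx = spectrogram(x, fs=fs, nperseg=128, noverlap=64)
    img = np.log1p(sxx)
    return (img - img.min()) / (np.ptp(img) + 1e-12)

def fuse_sensors(signals, fs):
    """Stack each sensor's spectrogram as one channel of a fused image.

    The resulting (freq, time, n_sensors) array can be fed to a
    pre-trained CNN after resizing to the network's input shape.
    """
    channels = [spectrogram_image(s, fs) for s in signals]
    return np.stack(channels, axis=-1)

# Example: fuse two synthetic sensor signals (e.g. vibration and acoustic).
fs = 1024
t = np.arange(fs) / fs
rng = np.random.default_rng(0)
vib = np.sin(2 * np.pi * 60 * t) + 0.1 * rng.standard_normal(fs)
mic = np.sin(2 * np.pi * 200 * t) + 0.1 * rng.standard_normal(fs)
fused = fuse_sensors([vib, mic], fs)
print(fused.shape)  # (freq_bins, time_frames, 2)
```

Stacking sensors as image channels lets a standard pre-trained CNN ingest the fused representation without architectural changes, which is what makes the TL step in the abstract possible.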