Heart sound auscultation plays a crucial role in the early diagnosis of cardiovascular diseases. In recent years, considerable progress has been made in automatic heart sound classification; however, most methods rely on segmentation-based features and traditional classifiers and do not fully exploit existing deep networks. This paper proposes a cardiac audio classification method based on the image expression of multidimensional features (CACIEMDF). First, a 102-dimensional feature vector is designed by combining characteristics of the heart sound data in the time, frequency and statistical domains. A two-dimensional projection space is then constructed from this feature vector via PCA dimensionality reduction and the convex hull algorithm, yielding 102 pairs of 2D coordinates, one pair for each component of the feature vector. Finally, each feature component's value and its divergence with respect to the two classes are used to fill the three channels of a color image, and a Gaussian model is used to dye the image to enrich its content. The resulting color image is then fed into a deep network such as ResNet50 for classification. In this paper, three public heart sound datasets are fused, and experiments are conducted using the above method. The results show that, combined with a current deep network, the proposed method achieves a classification accuracy of 95.72% on the binary heart sound classification task.
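The feature-to-image pipeline described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes PCA alone (via SVD) assigns each of the 102 feature components a 2D pixel coordinate, omits the convex-hull refinement and Gaussian dyeing, and fills only the value channel of the image; the function and parameter names are hypothetical.

```python
import numpy as np

def feature_coordinates(X, img_size=32):
    """Assign each feature column a 2D pixel coordinate via PCA.

    X: (n_samples, n_features) data matrix. Each feature is treated as a
    point described by its values across samples, and the points are
    projected onto the first two principal axes (the paper additionally
    refines the layout with a convex hull algorithm, omitted here).
    """
    F = X.T                                    # (n_features, n_samples)
    F = F - F.mean(axis=0, keepdims=True)      # center the feature points
    # First two right-singular directions span the 2D projection plane.
    _, _, Vt = np.linalg.svd(F, full_matrices=False)
    coords = F @ Vt[:2].T                      # (n_features, 2)
    # Rescale coordinates into pixel indices of an img_size x img_size grid.
    lo, hi = coords.min(axis=0), coords.max(axis=0)
    pix = ((coords - lo) / (hi - lo + 1e-12) * (img_size - 1)).astype(int)
    return pix

def features_to_image(x, pix, img_size=32):
    """Fill one channel of a color image with one sample's feature values.

    The paper fills the remaining channels with class-divergence scores and
    applies Gaussian "dyeing"; only the value channel is shown here, and
    features mapped to the same pixel simply overwrite each other.
    """
    img = np.zeros((img_size, img_size, 3), dtype=np.float32)
    img[pix[:, 1], pix[:, 0], 0] = x           # channel 0 = feature values
    return img

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 102))                # 200 samples, 102 features
pix = feature_coordinates(X)
img = features_to_image(X[0], pix)             # one 32x32x3 image per sample
```

An image produced this way can be passed directly to a standard CNN backbone (e.g., ResNet50) after the usual resizing and normalization.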