3.1. PTZC and BPH could be classified with Alex-Net DCNN
Of the 106 BPH patients, 30 were randomly selected as the counterparts of the 30 PTZC patients. Images of these 60 patients were defined as the target dataset used to train and test an Alex-Net DCNN. The 60 patients in the target dataset were randomly divided into a test set (20 patients, 110 images) and a training set (40 patients, 2320 images after data augmentation) (Fig. 4a-iii). Five Alex-Net models were then trained using 5-fold cross-validation. In each fold, four-fifths of the training set was used to train the model, while the remaining one-fifth served as the validation set for selecting the optimal model. The test set was then used to evaluate the five models, yielding five probabilities of malignancy for each image. Finally, the five probabilities of each image were averaged to obtain the final output value.
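For illustration only, the fold-wise training and per-image probability averaging described above could be organized as in the following Python sketch. The functions train_model and predict_proba are hypothetical stand-ins for the actual Alex-Net training and inference code (which is not specified here), and grouping the folds by patient is an assumption made to keep augmented slices of one patient within a single fold.

```python
# Minimal sketch of 5-fold cross-validation with test-set probability averaging.
# train_model / predict_proba are hypothetical placeholders, not the authors' code.
import numpy as np
from sklearn.model_selection import GroupKFold

def cross_validated_ensemble(train_images, train_labels, train_patient_ids,
                             test_images, train_model, predict_proba, n_folds=5):
    """Train one model per fold and average the per-image malignancy
    probabilities predicted on the held-out test set."""
    # Assumption: split by patient so slices of one patient stay in one fold.
    folds = GroupKFold(n_splits=n_folds).split(train_images, train_labels,
                                               groups=train_patient_ids)
    test_probs = []
    for train_idx, val_idx in folds:
        # 4/5 of the training data trains the model; the remaining 1/5 is the
        # validation set used to select the optimal model in that fold.
        model = train_model(train_images[train_idx], train_labels[train_idx],
                            val_images=train_images[val_idx],
                            val_labels=train_labels[val_idx])
        test_probs.append(predict_proba(model, test_images))
    # Final output value per test image = mean of the fold-wise probabilities.
    return np.mean(test_probs, axis=0)
```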
Even with the small sample size, PTZC and BPH could be distinguished by the DCNN model (Fig. 4b, Without TL model). Using only the PTZC and BPH target data, T2WIs yielded an AUC of 0.73 (95% CI = 0.63–0.83), with a sensitivity, specificity and accuracy of 69%, 75% and 81%, respectively. ADC images yielded an AUC of 0.94 (95% CI = 0.90–0.99), with a sensitivity, specificity and accuracy of 84%, 97% and 89%, respectively. The diagnostic efficacy of the Alex-Net DCNN model with ADC images was satisfactory, whereas that with T2WIs required further improvement.
3.2. The performance of TL from natural images (ImageNet) is limited by the small data size
An Alex-Net DCNN model was pre-trained with 1.2 million natural color images from ImageNet (Fig. 4a-ii) [28] and then fine-tuned with the aforementioned target dataset (the 60 PTZC and BPH patients).
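The framework used for pre-training and fine-tuning is not specified here; purely as an illustration, a minimal PyTorch/torchvision sketch of the described procedure (load ImageNet-pre-trained Alex-Net weights, replace the classification head for the two-class PTZC-vs-BPH task, then fine-tune on the target data) could look as follows. The learning rate and optimizer are assumptions, not the authors' settings, and a recent torchvision version is assumed.

```python
# Illustrative sketch of TL from ImageNet with PyTorch/torchvision
# (not the authors' implementation).
import torch
import torch.nn as nn
from torchvision import models

# Load Alex-Net with weights pre-trained on ~1.2 million ImageNet images.
model = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1)

# Replace the 1000-class ImageNet head with a 2-class head (PTZC vs. BPH).
model.classifier[6] = nn.Linear(4096, 2)

# Fine-tune all layers on the target dataset; hyperparameters are assumptions.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)
```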
Using the model pre-trained on ImageNet (Fig. 4b, TL-ImageNet), T2WIs yielded an AUC of 0.75 (95% CI = 0.65–0.84), with a sensitivity, specificity and accuracy of 76%, 73% and 75%, respectively. ADC images yielded an AUC of 0.96 (95% CI = 0.90–0.99), with a sensitivity, specificity and accuracy of 84%, 97% and 89%, respectively. Compared with the models without TL, TL from ImageNet produced only a slight improvement for both T2WIs (P = 0.17) and ADC images (P = 0.07).
3.3. TL from disease-related images (PTZC images) improved the diagnostic efficacy of the DCNN model
Another TL model was pre-trained with images of the remaining 76 BPH and 81 PTZC patients (Fig. 4a-i). This pre-trained model was then fine-tuned with the aforementioned target dataset (the 60 PTZC and BPH patients).
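For illustration, the only difference from the ImageNet sketch above is the source of the initial weights: the network starts from a checkpoint pre-trained on the disease-related images rather than from ImageNet weights. The checkpoint path below is hypothetical and the code is again only a sketch, not the authors' implementation.

```python
# Illustrative sketch of TL from the disease-related dataset:
# initialise Alex-Net from a checkpoint pre-trained on the remaining
# 76 BPH / 81 PTZC patients, then fine-tune on the target dataset.
import torch
import torch.nn as nn
from torchvision import models

model = models.alexnet(weights=None)          # no ImageNet weights
model.classifier[6] = nn.Linear(4096, 2)      # 2-class head (PTZC vs. BPH)

# "related_pretrained.pt" is a hypothetical checkpoint path.
model.load_state_dict(torch.load("related_pretrained.pt"))

# Fine-tune on the target dataset; hyperparameters are assumptions.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)
```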
Using the model pre-trained on the disease-related dataset (Fig. 4b, TL-Related dataset), T2WIs yielded an AUC of 0.86 (95% CI = 0.79–0.93), with a sensitivity, specificity and accuracy of 90%, 69% and 80%, respectively. The diagnostic efficacy of the TL-Related dataset model was significantly higher than that of the Without TL model (P = 0.00014) or the TL-ImageNet model (P = 0.00046).
ADC images yielded an AUC of 0.97 (95% CI = 0.90–0.99), with a sensitivity, specificity and accuracy of 90%, 94% and 92%, respectively. However, there was no significant difference in AUC between the TL-Related dataset model and the TL-ImageNet model (P = 0.88), or between the TL-Related dataset model and the Without TL model (P = 0.29).
3.4. The transductive method is a novel and effective approach to TL
A transductive Google-Net model and a transductive Alex-Net model were trained separately with the TL-Related dataset (images of the 81 PTZC and 76 BPH patients) and then used directly to classify all PTZC and BPH images (Fig. 3c). ROC curves and AUCs were then obtained (Fig. 5).
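For illustration, the ROC curves and AUCs could be computed from the models' per-image outputs as in the following scikit-learn sketch; the label and score arrays below are dummy placeholders, not study data.

```python
# Sketch of ROC/AUC computation from per-image model outputs (placeholder data).
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

# Placeholder inputs: ground truth (1 = PTZC, 0 = BPH) and the model's
# predicted probability of malignancy for each image. Values are dummies.
y_true  = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_score = np.array([0.91, 0.12, 0.78, 0.66, 0.40, 0.08, 0.85, 0.55])

fpr, tpr, thresholds = roc_curve(y_true, y_score)   # ROC curve points
auc = roc_auc_score(y_true, y_score)                 # area under the curve
```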
Using the transductive Google-Net model, T2WIs yielded an AUC of 0.86 (95% CI = 0.83–0.89), with a sensitivity, specificity and accuracy of 84%, 73% and 79%, respectively. ADC images yielded an AUC of 0.98 (95% CI = 0.97–0.99), with a sensitivity, specificity and accuracy of 94%, 92% and 93%, respectively.
Using the transductive Alex-Net model, T2WIs yielded an AUC of 0.89 (95% CI = 0.86–0.91), with a sensitivity, specificity and accuracy of 81%, 82% and 81%, respectively. ADC images yielded an AUC of 0.98 (95% CI = 0.97–0.99), with a sensitivity, specificity and accuracy of 97%, 93% and 95%, respectively.
Because each lesion may appear on multiple planes, ensembling was performed by averaging the output values of these planes to obtain a stable prediction for each patient. Of the 20 patients in the test set, two were misdiagnosed on T2WIs, whereas only one was misdiagnosed on ADC images (Fig. 6).
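A minimal sketch of this per-patient plane averaging is given below, assuming a pandas table of per-plane outputs; the column names, patient IDs, probability values and the 0.5 decision threshold are placeholders and assumptions, not the authors' actual data or settings.

```python
# Sketch of plane-level ensembling: average the per-plane output values of each
# patient to obtain one stable patient-level prediction (placeholder data).
import pandas as pd

df = pd.DataFrame({
    "patient_id": ["p01", "p01", "p01", "p02", "p02"],   # dummy patient IDs
    "plane_prob": [0.82, 0.75, 0.90, 0.21, 0.35],        # per-plane outputs
})

# One averaged prediction per patient; 1 = PTZC, 0 = BPH (assumed 0.5 cutoff).
patient_prob = df.groupby("patient_id")["plane_prob"].mean()
patient_pred = (patient_prob >= 0.5).astype(int)
```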