In this study, a novel neural network model, CaHDC-RGA, was designed by introducing ROI-guided attention into a deep learning framework. The developed DL model automatically assesses the gray level and sharpness of chest radiographs. The results showed that the model's performance is comparable to the subjective perception of expert radiologists, while its assessment time is significantly shorter than that of radiologists.
Image quality assessment (IQA) has long been an important topic in radiology because it can identify and address the causes of poor image quality[23, 24]. In 1996, the European Union published radiological image quality criteria to unify practice across Europe. However, many studies have found that these criteria are not objective and that interobserver variation remains a significant problem in IQA, with Cohen's weighted kappa values between observers ranging from 0.32 to 0.46[25–27]. In routine clinical IQA, the observer often relies only on individual experience and subjective perception of the image, with no standard reference information such as quality criteria; this approach is more operational and efficient for clinical practice[28]. In this study, to reduce the potential impact of interobserver variation, 10 experienced experts participated in the evaluation, including 5 radiologists and 5 radiographers. The mean opinion score (MOS) was used as the final label for image quality. In addition, unlike most previous studies that used a 5-point preference scale (qualitative assessment), we used a scoring system ranging from 1 to 10 points because it allows a quantitative assessment of image quality[29–31]. The results showed that the experts' subjective perception of image quality could be learned and quantified by deep learning algorithms, which can accurately output quantitative scores for images.
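As a concrete illustration, the MOS label for one image is simply the average of the ten expert ratings on the 1–10 scale. The following minimal sketch shows this labeling step; the function name and example ratings are ours, not from the study:

```python
def mean_opinion_score(ratings):
    """Average expert ratings on the 1-10 scale into a single MOS label."""
    if not all(1 <= r <= 10 for r in ratings):
        raise ValueError("each rating must lie on the 1-10 scale")
    return sum(ratings) / len(ratings)

# Ten hypothetical ratings (5 radiologists + 5 radiographers) for one image:
score = mean_opinion_score([8, 9, 7, 8, 8, 9, 8, 7, 8, 8])  # -> 8.0
```

Averaging across both radiologists and radiographers is what dampens individual observer bias in the final label.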
Previous studies have attempted to assess image quality with physical metrics such as the DQE or the MTF[32]. Although these metrics are objective, they measure only system performance rather than the radiologist's perception of clinical image quality. Recently, several CNN-based IQA methods have been proposed for natural images, but they did not take full advantage of the perceptual properties of the human visual system[33–38]. In 2020, Wu et al.[16] proposed an end-to-end cascaded framework, called CaHDC, which jointly optimizes feature extraction, hierarchical degradation concatenation, and quality prediction. In this study, we designed CaHDC-RGA by introducing ROI (region of interest)-guided attention to focus on low-quality regions. Our network can evaluate hierarchical quality degradation; because the number of network parameters is significantly decreased, optimization is also accelerated while overfitting is alleviated. Features are downsampled to the same scale and integrated with convolutional operations, which reduces the number of features while preserving spatial information.
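The intuition behind ROI-guided attention can be illustrated with a toy aggregation step: per-patch quality estimates are pooled with larger weights assigned to patches inside the region of interest, so a degraded ROI pulls the overall score down more strongly. This is a simplified sketch of the idea, not the network's actual attention layer; the function name and `roi_weight` value are assumptions:

```python
def roi_weighted_quality(patch_scores, in_roi, roi_weight=2.0):
    """Pool per-patch quality scores, up-weighting patches inside the ROI.

    patch_scores: predicted quality per patch (1-10 scale).
    in_roi:       booleans marking whether each patch lies in the ROI.
    roi_weight:   attention weight given to ROI patches (assumed value).
    """
    weights = [roi_weight if r else 1.0 for r in in_roi]
    return sum(w * s for w, s in zip(weights, patch_scores)) / sum(weights)

# Two background patches score 9, one ROI patch scores 3: the pooled
# score is pulled toward the low-quality region of interest.
roi_weighted_quality([9.0, 9.0, 3.0], [False, False, True])  # -> 6.0
```

With uniform pooling the same patches would average 7.0; the ROI weighting makes the low-quality region dominate, which mirrors how the attention mechanism emphasizes clinically relevant areas.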
Chest radiographs may need to be retaken because of poor image quality, so it is necessary to determine whether image quality is acceptable after acquisition[2]. However, such a decision may be difficult for radiologists and radiographers with little experience. An automated assessment tool would serve as an advisor and improve the efficiency of clinical work. Several CNN-based models have been proposed to evaluate positioning in chest radiographs. Berg et al.[39] used a CNN to detect lung borders, spines, medial clavicular margins, ribs, and diaphragms with a typical error of 2.5-2.8 mm. Nousiainen et al.[14] trained ResNet50 and DenseNet121 networks to assess lung inclusion, rotation, and inspiration. To our knowledge, there is no automated tool for evaluating the degree of visualization in chest radiographs. In this study, a deep learning model was trained to assess the gray level and sharpness of chest radiographs. The results showed that the DL model could accurately simulate the subjective perception of radiologists. When the model was set to the optimal cutoff, it showed high sensitivity and specificity for classifying image quality as acceptable or non-acceptable. In future work, we aim to identify the specific causes of non-acceptable quality to support better quality analysis.
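The acceptable/non-acceptable decision described above amounts to thresholding the predicted quality score at the chosen cutoff and scoring the result against expert labels. A minimal sketch, treating "non-acceptable" as the positive class; all names, scores, and labels here are illustrative, not the study's data:

```python
def sensitivity_specificity(scores, nonacceptable, cutoff):
    """Flag images as non-acceptable when score < cutoff, then compute
    sensitivity and specificity against expert labels."""
    pred = [s < cutoff for s in scores]
    tp = sum(p and t for p, t in zip(pred, nonacceptable))
    fn = sum((not p) and t for p, t in zip(pred, nonacceptable))
    tn = sum((not p) and (not t) for p, t in zip(pred, nonacceptable))
    fp = sum(p and (not t) for p, t in zip(pred, nonacceptable))
    return tp / (tp + fn), tn / (tn + fp)

# Four hypothetical images: predicted scores and expert labels.
sens, spec = sensitivity_specificity(
    scores=[3.1, 5.8, 7.2, 9.0],
    nonacceptable=[True, True, False, False],
    cutoff=6.0,
)  # -> (1.0, 1.0)
```

In practice the optimal cutoff would be chosen (e.g., from an ROC curve on a validation set) to balance catching retake candidates against flagging acceptable images unnecessarily.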
The increasing clinical demands on radiographers and radiologists make it imperative to operate as efficiently as possible while providing the highest quality of patient care[1]. A robust automated image quality assessment algorithm would therefore be helpful in the clinic. For example, computer-based algorithms can analyze image data and provide technicians with instant quality assurance (QA) feedback[40]. Automatic QA in teleradiology would assist in reaching a diagnosis and deciding the best clinical management of the patient[41]. Big data on image quality can be automatically compiled into a QA database for trend analysis, education and training, and radiation dose reduction[4]. Our study demonstrated that deep learning algorithms can assess the image quality of chest radiographs rapidly and automatically. These algorithms may be used for full-sample audits in QC programs and applied to other anatomical regions or modalities in the future.
Our study had several limitations. First, although the datasets were selected from three medical institutions and the sample size was larger than in previous studies, it is still limited for a deep learning algorithm. In addition to training DL models with real clinical images, future studies could artificially add noise or use phantoms to improve model performance. Second, because posteroanterior chest radiographs in adults are the most common clinical examination, our study did not include lateral, bedside, or pediatric chest radiographs; future studies covering more scenarios are necessary. Finally, like traditional image assessment methods, our system does not provide information beyond chest radiograph quality parameters, such as the presence or absence of disease in the images. In a follow-up study, we will incorporate chest radiographs spanning a wider range of image quality to improve the robustness and generalizability of the model. We will also apply the developed DL model to clinical examinations, analyze its effectiveness in providing timely quality feedback to radiographers, and test its performance in a large-scale image quality audit.
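One way to realize the noise-augmentation idea mentioned above is to degrade clean radiograph pixels with additive Gaussian noise, roughly simulating lower-dose acquisitions. The following toy sketch illustrates this under assumed parameters (the sigma value and function name are ours, not from the study):

```python
import random

def add_gaussian_noise(pixels, sigma=8.0, seed=0):
    """Degrade 8-bit pixel values with additive Gaussian noise,
    clipping results back into the valid [0, 255] range."""
    rng = random.Random(seed)
    return [min(255.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in pixels]

# Degrade three hypothetical pixel values; output stays within [0, 255].
noisy = add_gaussian_noise([0.0, 128.0, 255.0])
```

Pairing such synthetically degraded images with lowered quality labels is one plausible route to enlarging the training set when genuinely low-quality clinical images are scarce.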