This retrospective study was approved by the ethics committee of our hospital. The requirement for written informed consent was waived by the Ethics Commission of the designated hospital for emerging infectious diseases.
2.1 Patients and Data Collection
Patients diagnosed with COVID-19 pneumonia who had CT follow-up between 15 January 2020 and 15 March 2020 were retrospectively identified. The inclusion criteria were as follows: 1) an epidemiological history; 2) a positive real-time reverse-transcriptase polymerase chain reaction test for SARS-CoV-2 nucleic acid; 3) thin-section CT with abnormal findings. According to the diagnosis and treatment guidelines of the National Health Commission of the People’s Republic of China [7], moderate patients present with fever, respiratory symptoms, and radiographic features, whereas severe/critical (hereinafter, severe) patients meet at least one of the following criteria: 1) dyspnea, respiratory rate >30 breaths/min; 2) standard oxygen saturation <93%; 3) PaO2/FiO2 <300 mmHg; 4) respiratory failure; 5) septic shock; 6) multiple organ failure.
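The severity rule above amounts to an "any one of six" check. A minimal sketch is given below for illustration only. The class `ClinicalStatus`, its field names, and the function `is_severe` are hypothetical and are not taken from the study's software. Criterion 1 is simplified here to the respiratory-rate cut-off.

```python
from dataclasses import dataclass


@dataclass
class ClinicalStatus:
    """Hypothetical container for the findings named in the severity criteria."""
    respiratory_rate: float          # breaths/min
    oxygen_saturation: float         # percent
    pao2_fio2: float                 # mmHg
    respiratory_failure: bool = False
    septic_shock: bool = False
    multiple_organ_failure: bool = False


def is_severe(status: ClinicalStatus) -> bool:
    """Return True if at least one of the six listed criteria is met."""
    return any([
        status.respiratory_rate > 30,    # criterion 1 (rate cut-off only)
        status.oxygen_saturation < 93,   # criterion 2
        status.pao2_fio2 < 300,          # criterion 3
        status.respiratory_failure,      # criterion 4
        status.septic_shock,             # criterion 5
        status.multiple_organ_failure,   # criterion 6
    ])
```

Under this sketch, a patient with a respiratory rate of 22 breaths/min, oxygen saturation of 97%, and PaO2/FiO2 of 400 mmHg would be grouped as moderate unless one of the three boolean findings is present.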
Demographic, epidemiological, and clinical characteristics and laboratory findings were collected, including a complete blood count, coagulation profile, and serum biochemical tests (renal and liver function, electrolytes, erythrocyte sedimentation rate (ESR), C-reactive protein (CRP), lactate dehydrogenase (LDH), D-dimer, ferritin-1 (FER-1), interleukin (IL), etc.). Two researchers independently reviewed the collected data to verify their accuracy.
2.2 Pulmonary CT Scans
CT scans were performed using 64-row multi-detector scanners (Siemens Definition AS+, Siemens Healthcare) with the following parameters: tube voltage, 120 kVp; effective tube current-time product, 155 mAs; collimation, 0.6 × 128; pitch, 1.2; rotation time, 0.5 s; kernel, B60f; matrix, 512 × 512; slice thickness, 1.0 mm.
2.3 Deep Learning Model
Quantification parameters were extracted from chest CT images using a combination of traditional image-processing algorithms and a deep learning (DL) network.
The DL network was developed for lung lesion detection and segmentation and was designed as a combination of U-net and a fully convolutional network [8, 9] (Supplementary Fig. 1). Like U-net, it has a contracting path (Supplementary Fig. 1) and an expansive path (lower part of Supplementary Fig. 1). The network consists of three types of components: 1) convolutional segments, each comprising a convolutional layer, a batch normalization layer, and an activation layer; 2) max-pooling layers; 3) transpose convolutional layers. Feature maps were first extracted from the input CT images by passing them through the convolutional segments. Max-pooling layers were used for down-sampling and transpose convolutional layers for up-sampling. In addition, concatenation operations between convolutional segments served as bridges between the contracting and expansive paths.
CT images from 730 COVID-19-positive patients and 1013 subjects with normal chest CT were retrospectively collected as the training dataset, comprising more than 200,000 CT slices in total. The test dataset included images from 212 COVID-19 patients and 327 subjects with normal chest CT. The ground-truth region of interest (GT-ROI) for each lung lesion was first drawn by an experienced radiologist (~5 years of experience) and then reviewed by a senior radiologist (~10 years of experience), who modified any ROI that was not acceptable. After training, the Dice coefficient was used to evaluate the performance of this in-house network.
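The Dice coefficient was defined as:
$$\mathrm{Dice} = \frac{2\,\lvert \text{PR-ROI} \cap \text{GT-ROI} \rvert}{\lvert \text{PR-ROI} \rvert + \lvert \text{GT-ROI} \rvert}$$
where PR-ROI is the ROI predicted by the DL network and GT-ROI is the ROI drawn by the radiologists.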
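As an illustration, the overlap between PR-ROI and GT-ROI can be computed as sketched below. This assumes the standard Dice definition, 2|A ∩ B| / (|A| + |B|), applied to flattened binary masks. The function name `dice_coefficient` and the toy masks are hypothetical. The sketch is not the evaluation code used for the in-house network.

```python
def dice_coefficient(pred_mask, gt_mask):
    """Dice overlap of two equally sized, flattened binary masks (0/1 values)."""
    if len(pred_mask) != len(gt_mask):
        raise ValueError("masks must contain the same number of voxels")
    # Voxels labelled as lesion in both the prediction and the ground truth.
    intersection = sum(1 for p, g in zip(pred_mask, gt_mask) if p and g)
    # Sum of the two mask sizes (denominator of the Dice formula).
    total = sum(1 for p in pred_mask if p) + sum(1 for g in gt_mask if g)
    # Convention assumed here: two empty masks are treated as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy example: 3 predicted voxels, 3 ground-truth voxels, 2 shared.
pr_roi = [0, 1, 1, 1, 0, 0]
gt_roi = [0, 0, 1, 1, 1, 0]
dice = dice_coefficient(pr_roi, gt_roi)
```

In the toy example the two masks share two voxels out of three each, so the Dice value is 2 × 2 / (3 + 3), or about 0.67.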
2.4 Image Quantification
After lung lesions were detected by the DL network and confirmed or adjusted by two radiologists, quantification parameters related to the lung lesions were calculated, including lesion, GGO, and consolidation volumes. GGO and consolidation within a lesion were separated by a CT value threshold [10]. Furthermore, the whole-lesion rate (that is, the abnormal rate) and the GGO and consolidation rates were defined as: corresponding volume / whole lung volume × 100%. The total CT score was the sum of the lung involvement scores (five lobes, each scored 0-4; total range, 0-20).
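The rate and score definitions above can be sketched as follows. The function names are hypothetical. Voxels are assumed to have uniform size, so voxel counts stand in for volumes. The threshold of -300 HU in the example is a placeholder only and is not the value from [10].

```python
def lesion_rates(lesion_hu, lung_voxel_count, threshold_hu):
    """Return lesion, GGO and consolidation rates as percentages of lung volume.

    lesion_hu        -- CT values (HU) of the voxels inside the lesion mask
    lung_voxel_count -- number of voxels inside the whole-lung mask
    threshold_hu     -- assumed cut-off: voxels at or above it count as consolidation
    """
    if lung_voxel_count <= 0:
        raise ValueError("lung_voxel_count must be positive")
    consolidation = sum(1 for hu in lesion_hu if hu >= threshold_hu)
    ggo = len(lesion_hu) - consolidation

    def to_percent(voxels):
        # corresponding volume / whole lung volume * 100%
        return 100.0 * voxels / lung_voxel_count

    return {
        "lesion": to_percent(len(lesion_hu)),
        "ggo": to_percent(ggo),
        "consolidation": to_percent(consolidation),
    }


def total_ct_score(lobe_scores):
    """Sum five per-lobe involvement scores (each 0-4), giving a total of 0-20."""
    if len(lobe_scores) != 5 or any(s not in range(5) for s in lobe_scores):
        raise ValueError("expected five lobe scores, each between 0 and 4")
    return sum(lobe_scores)


# Toy example with a placeholder threshold (not the value used in the study).
rates = lesion_rates([-650, -500, -420, -100, 20], lung_voxel_count=50, threshold_hu=-300)
score = total_ct_score([1, 0, 2, 3, 1])
```

In this toy case, five lesion voxels in a 50-voxel lung give a lesion rate of 10%, split into 6% GGO and 4% consolidation, and the five lobe scores sum to 7.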
The lungs were also segmented by adaptive thresholding and morphological operations. Based on this segmentation, the CT value distribution of the whole lung was calculated, together with its mean, median, and peak values [11]. The Hellinger distance (hereinafter, hellinger) and the intersection over union (IOU) of the lung CT value distributions were calculated between chest CT images of pneumonia patients and those of normal subjects [12, 13].
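A minimal sketch of the two distribution measures is given below, assuming both CT value distributions are histograms over the same HU bins. The Hellinger distance follows its standard discrete form. The IOU is computed here as the sum of bin-wise minima divided by the sum of bin-wise maxima, which is one common reading and an assumption on our part, since the exact formulation is given in [12, 13]. All function names are hypothetical.

```python
import math


def normalize(hist):
    """Scale a histogram of non-negative counts so that it sums to 1."""
    total = sum(hist)
    if total <= 0:
        raise ValueError("histogram must have a positive total count")
    return [h / total for h in hist]


def hellinger(p_hist, q_hist):
    """Hellinger distance between two histograms (0 = identical, 1 = disjoint)."""
    if len(p_hist) != len(q_hist):
        raise ValueError("histograms must use the same bins")
    p, q = normalize(p_hist), normalize(q_hist)
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared / 2.0)


def histogram_iou(p_hist, q_hist):
    """Assumed IOU of two histograms: sum of bin-wise minima over sum of maxima."""
    if len(p_hist) != len(q_hist):
        raise ValueError("histograms must use the same bins")
    p, q = normalize(p_hist), normalize(q_hist)
    intersection = sum(min(a, b) for a, b in zip(p, q))
    union = sum(max(a, b) for a, b in zip(p, q))
    return intersection / union
```

With these definitions, identical distributions give a Hellinger distance of 0 and an IOU of 1, while distributions with no shared bins give 1 and 0, respectively.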
2.5 Statistical Analysis
Statistical analyses were performed using GraphPad Prism 8.0 (GraphPad Software) and IBM SPSS Statistics version 22 (IBM Corp.). Categorical variables and count data were summarized as frequencies and proportions, and differences were analysed with the Chi-square test. Normally distributed quantitative data were presented as mean ± standard deviation, whereas non-normally distributed data were expressed as median and quartiles unless otherwise specified. The independent-samples t-test or the Wilcoxon rank test was used to compare differences between groups. Receiver operating characteristic (ROC) analyses were used to evaluate diagnostic performance. Spearman rank correlation or Pearson correlation was used to assess correlations between laboratory findings and CT parameters for non-normally or normally distributed data, respectively. P < 0.05 was considered statistically significant.
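For illustration, the area under the ROC curve (AUC) for a single CT parameter can be sketched through its rank interpretation: the probability that a randomly chosen severe patient has a higher value than a randomly chosen moderate patient, with ties counted as one half. The analyses in this study were run in GraphPad Prism and SPSS. The function `roc_auc` and the values below are hypothetical and are not study data.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties = 0.5)."""
    if not scores_pos or not scores_neg:
        raise ValueError("both groups must contain at least one value")
    wins = 0.0
    for pos in scores_pos:
        for neg in scores_neg:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Made-up lesion rates (%) for illustration only.
severe = [30.0, 12.5, 45.0]
moderate = [5.0, 12.5, 8.0, 2.0]
auc = roc_auc(severe, moderate)
```

In the made-up example, 11 of the 12 severe/moderate pairs are ordered correctly and one pair is tied, so the AUC is 11.5 / 12, or about 0.96.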