Data Sources and Patient Demographics. In this study, the dataset comprised 191 abdominal CT scans from 191 adult liver transplant donors (mean age, 31 years ± 11 years; 58% male; mean weight, 65.9 ± 11.4 kg), acquired from 2005 to 2017 at Gachon Gil Medical Center. This study was conducted in accordance with the Declaration of Helsinki and approved by the Institutional Review Board of the Gil Medical Center (IRB No. GBIRB2021-229), and written informed consent was obtained from all participants. All images were deidentified before inclusion in this study. The demographics and other characteristics of each cohort are summarized in Table 1. The imaging system is summarized in Table 2.
Datasets & Data Pre-processing. In this study, we set the Hounsfield unit (HU) window to the range from −130 to 230 HU23. Within this window, irrelevant organs were mostly removed. Figure 1 shows the result of the window setting. All CT data had the same in-plane resolution of 512 × 512, but owing to the computational limitations of the graphics card, each image was resized to 256 × 256. Finally, the image and ground truth data had a shape of 64 × 256 × 256 × 1 and were divided into training, validation, and test sets in a ratio of 70:15:15.
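The pre-processing steps above can be sketched in NumPy as follows. This is a minimal illustration, not the authors' pipeline: the function names are hypothetical, and the rescaling to [0, 1], the use of 2 × 2 average pooling for resizing, the random shuffling, and the rounding of the split sizes are all assumptions that the text does not specify.

```python
import numpy as np

HU_MIN, HU_MAX = -130.0, 230.0  # HU window reported in the text


def window_hu(volume_hu):
    """Clip to the HU window, then rescale to [0, 1] (rescaling is an assumption)."""
    clipped = np.clip(volume_hu, HU_MIN, HU_MAX)
    return (clipped - HU_MIN) / (HU_MAX - HU_MIN)


def downsample_2x(volume):
    """Halve the in-plane resolution by 2x2 average pooling (assumed resize method)."""
    d, h, w = volume.shape
    return volume.reshape(d, h // 2, 2, w // 2, 2).mean(axis=(2, 4))


def split_indices(n_cases, seed=0):
    """Shuffle case indices and split them roughly 70:15:15 (rounding is an assumption)."""
    order = np.random.default_rng(seed).permutation(n_cases)
    n_train = int(round(0.70 * n_cases))
    n_val = int(round(0.15 * n_cases))
    return order[:n_train], order[n_train:n_train + n_val], order[n_train + n_val:]


# Usage: one synthetic 64-slice volume at 512 x 512, reshaped to 64 x 256 x 256 x 1.
volume = np.random.default_rng(1).uniform(-1000.0, 1000.0, size=(64, 512, 512))
model_input = downsample_2x(window_hu(volume))[..., np.newaxis]
train_idx, val_idx, test_idx = split_indices(191)
```

Splitting by case rather than by slice keeps all images from one donor in the same subset, which avoids leakage between the training and test sets.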
Liver Segmentation Using DALU-Net. The proposed model, Deep Attention LSTM U-Net (DALU-Net), has an architecture similar to the standard U-Net, consisting of an encoder and a decoder10. The encoder extracts increasingly complex hierarchical features and captures contextual information. The decoder deconvolves the features extracted by the encoder to restore the volume size reduced by the convolution operations. In addition, skip connections concatenate the hierarchical features of the encoder and the decoder at every level. In this way, localization information lost in the convolution and pooling layers of the encoder can be recovered, and the network can segment objects more accurately. Figure 2 illustrates the DALU-Net architecture.
DALU-Net combines the AM, DS, and CLSTM techniques. AM applies a module called the attention gate (AG) to the skip connections between the up-sampling layer and the encoder. The CLSTM was applied to the feature map produced by the AG and was used only in the decoder. DS was used to speed up convergence of the model: the loss was calculated at every level of the decoder, and the final loss was the sum of these losses.
AM is commonly used for machine translation24 and classification in natural language processing and in graph neural networks25,26. Recently, AM has been widely used in semantic segmentation and classification tasks on images27, and it has demonstrated improved accuracy in medical imaging11,13,28,29. In image segmentation and classification tasks, the AM is designed to generate an attention map by analyzing the gradient of the output class score with respect to the input image; multiplying this map with the input image reduces the weight of the background and focuses the network on the object. The details of the AM are shown in Fig. 3.
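To make the idea of weighting features by an attention map concrete, the sketch below implements a commonly used additive attention gate in NumPy. It is an assumption that this formulation resembles the gate in Fig. 3; the function name, the weight shapes, and the omission of bias terms are illustrative choices, and the 1 × 1 convolutions are written as matrix products over the channel axis.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def attention_gate(x, g, w_x, w_g, psi):
    """Additive attention gate on channel-last feature maps (hypothetical helper).

    x:   skip-connection features, shape (..., c_x)
    g:   gating features, assumed already resampled to x's spatial size, shape (..., c_g)
    w_x: (c_x, c_int), w_g: (c_g, c_int), psi: (c_int, 1); these act as 1x1 convolutions
    """
    q = np.maximum(x @ w_x + g @ w_g, 0.0)  # ReLU of the summed projections
    alpha = sigmoid(q @ psi)                # attention map, one coefficient per voxel
    return x * alpha, alpha                 # low-attention (background) voxels are down-weighted


# Usage with random weights; in a trained network these would be learned.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8, 8, 16))
g = rng.normal(size=(4, 8, 8, 32))
gated, alpha = attention_gate(
    x, g, rng.normal(size=(16, 8)), rng.normal(size=(32, 8)), rng.normal(size=(8, 1))
)
```

Because every coefficient lies between 0 and 1, the gate can only attenuate the skip-connection features; it never amplifies them.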
DS supervises the hidden layers of the CNN in the segmentation task to speed up convergence and mitigate the vanishing-gradient problem12,17,30. As indicated by the red arrow in Fig. 2, the loss was calculated using a sigmoid function at every level of the decoder, and the final loss was determined by summing these losses. Optimizing at each level has the advantage of allowing the final loss to converge rapidly. In this study, DS was applied to the output values from the attention gate of each level.
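The summed per-level loss can be sketched as below. This is a minimal illustration under stated assumptions: the per-level loss is taken to be the Dice loss named in Implementation Details, each level's output is assumed to have already been brought to the ground-truth resolution, and the function names and the smoothing term `eps` are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def dice_loss(prob, target, eps=1e-6):
    """Soft Dice loss, 1 - 2*sum(P*T) / (sum(P) + sum(T)); eps is an assumed smoothing term."""
    intersection = np.sum(prob * target)
    return 1.0 - (2.0 * intersection + eps) / (np.sum(prob) + np.sum(target) + eps)


def deep_supervision_loss(level_logits, target):
    """Sum the per-level losses; every level is assumed to match target's shape."""
    return sum(dice_loss(sigmoid(logits), target) for logits in level_logits)


# Usage: three decoder levels scored against one binary mask.
target = np.zeros((8, 8))
target[2:6, 2:6] = 1.0
uninformative = np.zeros((8, 8))          # sigmoid gives 0.5 everywhere
confident = 20.0 * (2.0 * target - 1.0)   # large logits that agree with the mask
total = deep_supervision_loss([uninformative, uninformative, confident], target)
```

Each level contributes its own gradient signal, so shallow decoder levels are pushed toward the target directly instead of only through the final output.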
Liver Volumetry. The segmented liver area on each axial image was calculated using the proposed model. The liver volume for each image was then calculated by multiplying the liver area by the section thickness, and the whole liver volume was estimated by summing the liver volumes of all images.
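The calculation described above amounts to the following sketch. The function name and the pixel-spacing argument are hypothetical; the pixel spacing and section thickness would come from the scan metadata, and the reported unit (millilitres) is an illustrative choice.

```python
import numpy as np


def liver_volume_ml(mask, pixel_spacing_mm, section_thickness_mm):
    """Volume of a binary mask of shape (slices, rows, cols), in millilitres."""
    pixel_area_mm2 = pixel_spacing_mm[0] * pixel_spacing_mm[1]
    # Liver area on each axial image: segmented pixel count times the area of one pixel.
    area_per_slice_mm2 = mask.reshape(mask.shape[0], -1).sum(axis=1) * pixel_area_mm2
    # Per-image volume: area times section thickness; the whole volume is their sum.
    volume_per_slice_mm3 = area_per_slice_mm2 * section_thickness_mm
    return volume_per_slice_mm3.sum() / 1000.0  # 1 mL = 1000 mm^3


# Usage: a 5 x 5 pixel region on two slices, 2 mm pixels, 5 mm sections.
mask = np.zeros((4, 10, 10), dtype=np.uint8)
mask[1:3, 2:7, 2:7] = 1
volume = liver_volume_ml(mask, pixel_spacing_mm=(2.0, 2.0), section_thickness_mm=5.0)
```

If the masks are produced at a reduced in-plane resolution, the pixel spacing passed in must be scaled by the same factor so that the physical area per pixel stays correct.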
Ground Truth for Liver Segmentation and Volumetry. The liver was manually labeled on all CT images under the supervision of a liver transplant surgeon with more than 5 years of experience in CT analysis related to liver transplantation. ImageJ software (NIH, Bethesda, MD, USA) was used for manual segmentation. For liver volumetry, the liver volume calculated from the manual segmentation was used as the reference standard (as described in the Liver Volumetry section).
Ground Truth for Left, Right, and Caudate Lobes. In this study, a comparative analysis of segmentation and volume measurements was performed for the left, right, and caudate lobe regions, according to the anatomical structure of the liver31,32. The left lobe was defined as the upper region, above the middle hepatic vein, and the right lobe as the lower region. The caudate lobe was located to the left of the inferior vena cava (IVC), without overlapping the left lobe in the coronal view31,32. Figure 4 schematically illustrates the ground truth for the left, right, and caudate lobes.
Implementation Details. The DALU-Net was implemented in Python 3.7, TensorFlow 2.1.3, and Keras 2.3.1, and was run on four NVIDIA Tesla V100 GPUs, each with 5120 cores and 32 GB of memory. Our networks were trained using the Adam optimizer to minimize the Dice loss33,34. The initial learning rate was 0.01; when the loss did not decrease for 10 epochs, the learning rate was multiplied by 0.1. Training was terminated early when the loss did not improve for 30 epochs. In practice, training terminated early after 90–150 epochs.
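The learning-rate reduction and early-stopping rules can be illustrated by replaying a loss history in plain Python. This is a sketch, not the training code: the function name is hypothetical, the text does not state which loss (training or validation) was monitored, and it is assumed here that each reduction multiplies the current learning rate by 0.1 and that the 10-epoch counter restarts after a reduction.

```python
def plateau_schedule(losses, initial_lr=0.01, factor=0.1, lr_patience=10, stop_patience=30):
    """Replay a loss history; return the per-epoch learning rates and the stopping epoch."""
    lr, best = initial_lr, float("inf")
    since_best = 0       # epochs since the last improvement (drives early stopping)
    since_lr_event = 0   # epochs since the last improvement or learning-rate reduction
    lrs = []
    for epoch, loss in enumerate(losses, start=1):
        lrs.append(lr)   # learning rate used during this epoch
        if loss < best:
            best, since_best, since_lr_event = loss, 0, 0
        else:
            since_best += 1
            since_lr_event += 1
        if since_best >= stop_patience:
            return lrs, epoch            # terminate early
        if since_lr_event >= lr_patience:
            lr *= factor                 # assumed: scale the current rate
            since_lr_event = 0
    return lrs, len(losses)


# Usage: the loss improves for 51 epochs and then stays flat.
history = [1.0 - 0.01 * i for i in range(50)] + [0.5] * 40
learning_rates, stop_epoch = plateau_schedule(history)
```

In Keras, comparable behaviour is usually obtained with the `ReduceLROnPlateau` and `EarlyStopping` callbacks; whether those were used here is not stated in the text.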
Evaluation Metrics. We evaluated the performance of the proposed approach using the dice similarity coefficient (DSC), intersection over union (IOU), and Hausdorff distance35. The DSC was defined as the volume of overlap between the CNN and manual segmentations divided by the average of the two segmentation volumes. The IOU was defined as the volume of overlap between the two segmentations divided by the volume of their union, which indicates how closely the positions of the two objects coincide. The Hausdorff distance measures how far apart two subsets of a metric space are; it was defined as the greatest distance from a point in one segmentation to the closest point in the other.
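The three metrics can be written compactly for binary masks, as in the sketch below. The function names and the `spacing` argument are hypothetical, both masks are assumed to be non-empty, and the Hausdorff distance is computed by brute force over all foreground voxels, which is only practical for small masks.

```python
import numpy as np


def dice_coefficient(pred, truth):
    """Overlap divided by the average of the two volumes, i.e. 2*overlap / (|A| + |B|)."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    overlap = np.logical_and(pred, truth).sum()
    return 2.0 * overlap / (pred.sum() + truth.sum())


def iou(pred, truth):
    """Overlap divided by the union of the two masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    return np.logical_and(pred, truth).sum() / np.logical_or(pred, truth).sum()


def hausdorff_distance(pred, truth, spacing=(1.0, 1.0, 1.0)):
    """Symmetric Hausdorff distance between the foreground voxels of two 3D masks."""
    p = np.argwhere(pred) * np.asarray(spacing)
    t = np.argwhere(truth) * np.asarray(spacing)
    d = np.linalg.norm(p[:, None, :] - t[None, :, :], axis=-1)  # all pairwise distances
    # Largest nearest-neighbour distance, taken in both directions.
    return max(d.min(axis=1).max(), d.min(axis=0).max())


# Usage: two 2 x 2 squares offset by one pixel on a single slice.
pred = np.zeros((1, 4, 4), dtype=np.uint8)
truth = np.zeros((1, 4, 4), dtype=np.uint8)
pred[0, 0:2, 0:2] = 1
truth[0, 0:2, 1:3] = 1
scores = (dice_coefficient(pred, truth), iou(pred, truth), hausdorff_distance(pred, truth))
```

For full-resolution volumes, restricting the point sets to surface voxels or using `scipy.spatial.distance.directed_hausdorff` in both directions avoids building the full pairwise distance matrix.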