Subjects
For the training and validation data of the automatic brain extractor, five hundred whole-body [18F]FDG PET scans were retrospectively collected. These scans were performed from June to July 2020 in a single center (age: 66.7 ± 3.4 years; M : F = 194 : 306). Scans explicitly prescribed by the oncologic clinicians to include the brain were excluded from analysis, to remove bias in evaluating the accuracy of detecting brain presence. Among the 500 cases, the primary site of malignancy was breast (19.8%), lung (18.6%), hematologic (14.2%), colorectal (9.4%), biliary (6.8%), ovary (5.4%), pancreas (5.2%), liver (5.0%), stomach (4.2%), thymus (3.6%), urinary tract (3.4%), soft tissue (3.0%), thyroid (0.6%), or unknown (0.8%).
For the quantitative assessment of the extracted brain as an independent test, FDG PET images acquired from small-cell lung cancer (SCLC) patients were retrospectively collected. These scans were acquired from January 2014 to December 2017 at the same institution. To test whether our automated brain analysis pipeline could identify brain metastases in SCLC patients, groups were defined according to the presence of brain metastasis. Four patients had brain metastases confirmed by brain MRI at baseline and follow-up (age: 66.8 ± 6.5 years; M : F = 4 : 0). Twenty PET scans without brain metastasis on baseline brain MRI were regarded as controls (age: 71.2 ± 6.1 years; M : F = 17 : 3).
Image acquisition
Following the routine FDG PET protocol, patients fasted for more than 4 hours and were then intravenously injected with 5.18 MBq/kg of FDG. One hour later, PET images were acquired from the skull base to the proximal thigh using dedicated PET/CT scanners (Biograph mCT 40 or mCT 64, Siemens, Erlangen, Germany) at 1 minute per bed position. A Gaussian filter (FWHM 5 mm) was applied to reduce noise, and images were reconstructed using an ordered-subset expectation maximization algorithm (2 iterations, 21 subsets).
Deep learning model and training data for the brain extraction
We devised an automatic brain extractor with two objectives: 1) to evaluate whether a scan included the entire brain and 2) to establish a 3-dimensional bounding box enclosing the brain volume. To achieve these goals, we implemented a CNN-based deep-learning model according to the following procedure. A brief outline of the study is shown in Fig. 1.
For training of the model, maximum intensity projection (MIP) images were generated for each PET scan. For each of the 500 MIP images, 2-dimensional bounding boxes were manually drawn on the anterior and lateral views. The coordinates of the two bounding boxes were merged to obtain the coordinates of a 3-dimensional bounding box for each PET image, as sketched below. Images that did not contain the full extent of the brain were instead labeled as “not containing entire brain”.
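To make the projection and box-merging step concrete, the following minimal sketch shows how two MIP views can be generated from a volume and how two 2-D boxes can be combined into one 3-D box. The axis ordering, function names, and box encoding are illustrative assumptions, not taken from our implementation.

```python
import numpy as np

def make_mips(pet_volume: np.ndarray):
    """Project a 3-D PET volume (assumed ordered x, y, z) into two MIP views."""
    anterior = pet_volume.max(axis=1)  # collapse the anteroposterior axis -> (x, z)
    lateral = pet_volume.max(axis=0)   # collapse the left-right axis -> (y, z)
    return anterior, lateral

def merge_boxes(anterior_box, lateral_box):
    """Merge two 2-D boxes, each ((min, max), (min, max)), into a 3-D box.

    anterior_box gives the x- and z-extent; lateral_box gives the y- and
    z-extent. The z-extent appears in both views and is taken as their union.
    """
    (x_min, x_max), (za_min, za_max) = anterior_box
    (y_min, y_max), (zl_min, zl_max) = lateral_box
    z_min, z_max = min(za_min, zl_min), max(za_max, zl_max)
    return (x_min, x_max), (y_min, y_max), (z_min, z_max)
```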
The two MIP images, anterior and lateral views, were zero-padded to square matrices and resampled to 224 x 224 using bilinear interpolation. Pixel values represented the standardized uptake value (SUV). To serve as inputs to a CNN model, pixel values were divided by 30, as most voxel values of a PET volume lie below SUV 30 except for urine, and then multiplied by 255 to yield a range of approximately 0 to 255.
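A minimal sketch of this preprocessing, assuming a MIP stored as a 2-D NumPy array of SUV values; the function name and the corner placement within the padded square are illustrative choices.

```python
import numpy as np
from PIL import Image

def preprocess_mip(mip_suv: np.ndarray, size: int = 224) -> np.ndarray:
    """Zero-pad a MIP to a square and rescale it for the CNN input."""
    h, w = mip_suv.shape
    side = max(h, w)
    square = np.zeros((side, side), dtype=np.float32)
    square[:h, :w] = mip_suv  # zero-padding to a square matrix
    # Bilinear interpolation to the 224 x 224 network input size.
    resized = np.asarray(Image.fromarray(square).resize((size, size), Image.BILINEAR))
    # SUV / 30 keeps most physiologic uptake within [0, 1]; x 255 gives an
    # approximate 0-255 range (values above SUV 30, e.g. urine, exceed it).
    return resized / 30.0 * 255.0
```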
We utilized ResNet-50 [17, 18] as the learning model, a convolutional neural network pre-trained on images from the ImageNet database [19]. ResNet-50 was implemented to process the input data and predict the coordinates of the 3-D bounding boxes from the MIP images. The pre-trained ResNet-50 separately extracted feature vectors from the two MIP views, and the extracted features were concatenated. An additional fully connected layer with 4096 dimensions was connected to the concatenated feature vector and then to two separate outputs. One output represented the coordinates of the bounding box of the brain as a 6-dimensional vector (coordinates for the 3 axes plus the width, length, and depth of the box). The other output was a one-dimensional vector representing whether a given PET volume included the entire brain. Image augmentation was applied to the training dataset: MIP images were randomly augmented by multiplying voxel values, changing contrast, and scaling and translating the images.
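The following PyTorch sketch illustrates this two-view architecture under stated assumptions: the class and head names are hypothetical, and the grayscale MIPs are assumed to be replicated across the three input channels expected by the ImageNet-pretrained weights.

```python
import torch
import torch.nn as nn
from torchvision import models

class BrainExtractor(nn.Module):
    def __init__(self):
        super().__init__()
        backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
        backbone.fc = nn.Identity()  # keep the 2048-d pooled feature vector
        self.backbone = backbone
        self.fc = nn.Sequential(nn.Linear(2 * 2048, 4096), nn.ReLU())
        self.bbox_head = nn.Linear(4096, 6)  # corner coordinates + width/length/depth
        self.cls_head = nn.Linear(4096, 1)   # "contains the entire brain?" logit

    def forward(self, anterior, lateral):
        # Each view is an (N, 3, 224, 224) tensor; both views share the backbone.
        feats = torch.cat([self.backbone(anterior), self.backbone(lateral)], dim=1)
        h = self.fc(feats)
        return self.bbox_head(h), self.cls_head(h)
```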
We performed internal validation by randomly selecting 10% of the data as a validation set. The loss function was defined as the weighted sum of two terms: 1) the binary cross-entropy of the output representing whether a given PET volume included the entire brain and 2) the mean squared error of the 6-dimensional vector representing the coordinates of the bounding box (Fig. 1). The weighting of the loss terms was tuned over the course of training; for the weighted sum we set alpha = 10 and beta = 1. We measured the intersection over union (IOU) between the predicted and labeled bounding boxes. From the predicted bounding-box coordinates, we extracted brain images from the whole-body PET and spatially normalized them to the template space, as described below.
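A sketch of the weighted loss and the IOU metric under the stated weights; assigning alpha to the classification term and encoding the box as a corner plus extents are assumptions made for illustration.

```python
import torch
import torch.nn.functional as F

ALPHA, BETA = 10.0, 1.0  # weights for the two loss terms, per the text

def combined_loss(cls_logit, cls_target, bbox_pred, bbox_target):
    """Weighted sum of classification BCE and bounding-box MSE."""
    bce = F.binary_cross_entropy_with_logits(cls_logit, cls_target.float())
    mse = F.mse_loss(bbox_pred, bbox_target)
    return ALPHA * bce + BETA * mse

def iou_3d(box_a, box_b):
    """IOU of two boxes encoded as (x, y, z, width, length, depth) tensors."""
    a_min, a_max = box_a[:3], box_a[:3] + box_a[3:]
    b_min, b_max = box_b[:3], box_b[:3] + box_b[3:]
    overlap = torch.clamp(torch.minimum(a_max, b_max) - torch.maximum(a_min, b_min), min=0)
    inter = overlap.prod()
    union = box_a[3:].prod() + box_b[3:].prod() - inter
    return inter / union
```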
Processing of the extracted brain
The trained model was applied to whole-body PET images, and the brain was extracted whenever the model predicted that the image contained the whole brain volume. FDG PET volumes were resliced to a voxel size of 2 x 2 x 2 mm3. We segmented the brain using the coordinates of the 3-D bounding box predicted by the model, applying a padding of 10 voxels along each axis to determine the brain volume. The extracted brain volumes were spatially normalized onto the Montreal Neurological Institute (McGill University, Montreal, Quebec, Canada) standard template. The spatial normalization was performed by symmetric normalization (SyN) with a cross-correlation loss function implemented in the DIPY package [20]. More specifically, a given extracted brain volume was first linearly registered to the template PET image with an affine transform, and the warping was then performed by the symmetric diffeomorphic registration algorithm. The spatially normalized PET volume was saved for further quantitative imaging analysis.
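A sketch of this two-stage registration following DIPY's documented affine-then-SyN workflow; the iteration schedule and the mutual-information metric for the affine stage are assumptions, and the array/affine inputs follow NiBabel conventions.

```python
from dipy.align.imaffine import AffineRegistration, MutualInformationMetric
from dipy.align.transforms import AffineTransform3D
from dipy.align.imwarp import SymmetricDiffeomorphicRegistration
from dipy.align.metrics import CCMetric

def normalize_to_template(brain, brain_affine, template, template_affine):
    """Warp an extracted brain volume onto the template space."""
    # Stage 1: linear (affine) registration of the brain to the template.
    affreg = AffineRegistration(metric=MutualInformationMetric(nbins=32))
    affine_map = affreg.optimize(template, brain, AffineTransform3D(), None,
                                 template_affine, brain_affine)
    # Stage 2: SyN nonlinear warping with a cross-correlation metric,
    # initialized from the affine result.
    sdr = SymmetricDiffeomorphicRegistration(metric=CCMetric(3),
                                             level_iters=[10, 10, 5])
    mapping = sdr.optimize(template, brain, template_affine, brain_affine,
                           prealign=affine_map.affine)
    return mapping.transform(brain)  # brain resampled in template space
```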
Quantitative analysis of the extracted brain
The extracted and spatially normalized brain volumes were analyzed with the quantitative software SPM12 (Institute of Neurology, University College London, London, U.K.) implemented in MATLAB 2019b (The MathWorks, Inc., Natick, MA, U.S.). The normalized brain images were smoothed by convolution with an isotropic Gaussian kernel of 10 mm full width at half maximum to increase the signal-to-noise ratio.
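The smoothing itself was performed in SPM12; the short sketch below only makes the FWHM-to-sigma conversion explicit for the 2 mm isotropic voxels produced by the reslicing step, using a placeholder volume on the 2 mm MNI grid.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

FWHM_MM, VOXEL_MM = 10.0, 2.0
# FWHM = sigma * sqrt(8 * ln 2), so the kernel width in voxel units is:
sigma_vox = FWHM_MM / (2.0 * np.sqrt(2.0 * np.log(2.0))) / VOXEL_MM  # ~2.12 voxels

volume = np.random.rand(91, 109, 91).astype(np.float32)  # placeholder, 2 mm MNI grid
smoothed = gaussian_filter(volume, sigma=sigma_vox)
```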
For the twenty-four subjects with SCLC, we performed a voxelwise two-sample T-test comparing each of the four normalized brain volumes with metastatic lesions against the whole set of images from the twenty control subjects. An uncorrected P < 0.001 threshold was applied to identify patient-wise metabolically abnormal regions.
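With a single scan in one group, the two-sample T-test reduces to comparing the patient's value against the control distribution at each voxel, with n - 1 degrees of freedom for n controls. The NumPy sketch below illustrates this computation; the one-sided (hypermetabolic) direction and the variable names are assumptions for illustration.

```python
import numpy as np
from scipy import stats

def single_case_tmap(patient, controls):
    """Voxelwise T-map for one patient volume vs. a (n, X, Y, Z) control stack."""
    n = controls.shape[0]
    mean_c = controls.mean(axis=0)
    sd_c = controls.std(axis=0, ddof=1)
    # Two-sample t with group sizes 1 and n: df = 1 + n - 2 = n - 1.
    t_map = (patient - mean_c) / (sd_c * np.sqrt(1.0 + 1.0 / n))
    t_crit = stats.t.ppf(1.0 - 0.001, df=n - 1)  # uncorrected P < 0.001
    return t_map, t_map > t_crit  # T-statistics and suprathreshold mask
```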
For each of the four comparisons, we also constructed a map of T-statistics and extracted the peak T values. As a proof of concept, we investigated whether the statistical analysis successfully revealed the metastatic lesions previously confirmed by brain MRI.