Resizing the training and validation images is essential for several reasons. Machine learning models, particularly convolutional neural networks (CNNs), require inputs of identical dimensions for the architecture to function correctly, and resizing to a standard size such as 128x128 ensures this uniformity across the dataset, leading to more consistent and reliable training outcomes. It also avoids the bias that could be introduced if images of varying sizes were used. In addition, smaller, uniformly sized images reduce the computational load and memory usage during training, allowing the model to process images more quickly, which is particularly important when working with large datasets or complex models.
Standardizing image sizes also prevents inconsistencies that could affect the model's learning process and, for CNNs, ensures that relevant features are extracted consistently across all images. Furthermore, resizing to the input size expected by pretrained models (e.g., 224x224) ensures compatibility, allowing effective transfer learning that leverages pretrained weights. Overall, resizing is a crucial preprocessing step that optimizes the dataset for better model performance and efficient use of resources.
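A minimal sketch of this preprocessing step is given below, assuming a PyTorch/torchvision pipeline; the dataset path is a placeholder, and the 128x128 and 224x224 target sizes follow the discussion above.

```python
from torchvision import transforms
from torchvision.datasets import ImageFolder

# Resize every radiograph to a fixed 128x128 input before tensor conversion,
# so that all images entering the CNN share the same spatial dimensions.
train_transforms = transforms.Compose([
    transforms.Resize((128, 128)),   # uniform size expected by the custom network
    transforms.ToTensor(),           # scales pixel values to [0, 1]
])

# For transfer learning, match the input size expected by the pretrained
# backbone instead (e.g., 224x224 for many ImageNet-trained models).
pretrained_transforms = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

# "data/train" is a placeholder path; the actual directory layout of the
# radiographs is not specified here.
train_dataset = ImageFolder("data/train", transform=train_transforms)
```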
Developing a network from scratch without annotations offers flexibility, deeper understanding, and optimization opportunities, enabling architectures tailored to specific tasks. It fosters skill development, allows for unsupervised or self-supervised learning, encourages innovative approaches, and provides full control over the data processing and training pipeline, making it well suited to research and experimentation despite requiring significant effort and expertise. Radwan et al. (34) proposed an automated unsupervised learning approach to explore additional CNN layers, but their results were hindered by the cervical vertebrae annotation tool used to trace the CVM stages, resulting in poor staging of CVM 4 and CVM 5.
ROS aims to balance the class distribution by randomly duplicating samples from the minority classes, producing a more evenly represented dataset. This adjustment is intended to improve the model's generalization and validation performance, thereby reducing the gap between training and validation accuracy and leading to more reliable classification across all classes. When performing ROS, however, classification performance on the majority class can sometimes be quite low, for several reasons. First, random oversampling can lead to overfitting on the minority classes, as the model may memorize the duplicated instances rather than learning generalizable patterns, resulting in poor performance on unseen data that particularly affects the majority class. Second, although oversampling balances the class distribution, it does not create new information, and the oversampled data may not represent the underlying class distributions well, making generalization difficult. Oversampling can also amplify noise present in the minority classes, since duplicating noisy instances makes the model more prone to misclassifying the majority class. Finally, by balancing the classes, the model may shift its focus towards the minority classes, improving their performance but potentially lowering performance on the majority class.
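As a rough illustration of how this step could be implemented, the sketch below uses RandomOverSampler from the imbalanced-learn library to duplicate minority-class image indices until the label distribution is balanced; the label counts are placeholders and do not reproduce the study's class sizes.

```python
import numpy as np
from imblearn.over_sampling import RandomOverSampler

# Toy label array standing in for the CVMS stage of each training image;
# the study's actual per-image labels are not reproduced here.
labels = np.array([1] * 40 + [2] * 30 + [3] * 25 + [4] * 20 + [5] * 60 + [6] * 55)
indices = np.arange(len(labels)).reshape(-1, 1)   # fit_resample expects a 2-D X

# ROS duplicates indices of under-represented stages until every class matches
# the size of the largest class; no new images are synthesized.
ros = RandomOverSampler(random_state=42)
resampled_indices, resampled_labels = ros.fit_resample(indices, labels)

# Building the training set from resampled_indices repeats minority-class
# images, yielding a balanced label distribution.
print(np.bincount(resampled_labels))
```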
Initially, the model exhibited a training accuracy of 100% but a substantially lower validation accuracy of 57%, a gap consistent with overfitting to an imbalanced training set. To address this, ROS was implemented. This technique increased the number of images in the under-represented classes to balance the dataset, enhancing its representation and improving model training. As a result, the final dataset consisted of 1,420 images for training and 356 for validation. The application of ROS led to a notable improvement in the model’s performance metrics, addressing the previously observed misclassifications, particularly in classes CVMS 2 through CVMS 6. The confusion matrix and performance metrics post-ROS demonstrated a more balanced and effective classification across all classes, reducing the number of misclassifications and improving overall model robustness.
Li et al. (35) assembled an impressive collection of 10,200 radiographs, roughly nine times more than any other study, and utilized YOLOv3 as the core detector for the regions of interest. However, they achieved an overall accuracy of only 70%, with particular difficulty identifying specific CVM stages such as CVS2 and CVS3, and suggested incorporating additional factors such as intervertebral disc space and dental age. In contrast, our custom network with ROS achieved superior accuracy, with an overall testing performance of 88%, although it had slight difficulty accurately identifying CVM stages 5 and 6. This difficulty arises because these stages are the majority classes in the dataset, and ROS, which balances the dataset by oversampling the minority classes, may not address the inherent complexity or characteristics of the majority classes. This can yield a model that is better at identifying minority classes but less effective at distinguishing between the more frequent majority classes.
The classification metrics provide a detailed overview of the model's performance across the different classes. For CVMS 1, the model achieved perfect precision and recall, indicating flawless identification of this class with no false positives or false negatives. CVMS 2 also performed exceptionally well, with a high precision of 95.8% and a recall of 98.6%, suggesting strong predictive capability and reliable classification. CVMS 3 demonstrated balanced performance with a precision and recall of 96.0%, reflecting an effective and consistent ability to identify this class.
However, CVMS 4 showed slightly lower precision at 85.7% and recall at 87.5%, pointing to some challenges in minimizing false positives and false negatives. The performance of CVMS 5 was notably weaker, with a precision of 77.0% and a recall of 71.2%, indicating difficulties in accurately classifying this class, possibly due to complex feature differentiation. CVMS 6 had moderate performance, with precision and recall of approximately 79.0% and 80.0%, respectively, suggesting reasonable effectiveness but room for improvement.
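To make the relationship between these per-class figures and the confusion matrix explicit, the short sketch below derives precision and recall for each class directly from a confusion matrix; the matrix shown is illustrative and does not reproduce the study's results.

```python
import numpy as np

def per_class_metrics(cm):
    """Return per-class precision and recall from a confusion matrix
    whose rows are true classes and columns are predicted classes."""
    tp = np.diag(cm).astype(float)
    precision = tp / cm.sum(axis=0)   # TP / (TP + FP), computed per column
    recall = tp / cm.sum(axis=1)      # TP / (TP + FN), computed per row
    return precision, recall

# Illustrative two-class matrix only, not the study's actual confusion matrix.
cm = np.array([[50,  5],
               [10, 35]])
precision, recall = per_class_metrics(cm)
print(precision, recall)
```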
Misclassification occurred most notably in CVMS 6 due to the large age range among the subjects. This age disparity led to high morphological variation within the dataset. As individuals grow, their morphological features can change considerably, especially in the developmental stages covered by CVMS 6. These variations make it difficult for classification algorithms to categorize the subjects consistently and accurately, and the high degree of morphological diversity introduces complexity into pattern recognition and model training, resulting in inconsistent classification results.
To mitigate these issues, alternative techniques or combinations of techniques can be considered. Methods such as SMOTE (Synthetic Minority Over-sampling Technique) generate synthetic instances rather than duplicating existing ones, helping to create more diverse samples for the minority class. ADASYN (Adaptive Synthetic Sampling), a variant of SMOTE, focuses on generating synthetic samples for minority-class instances that are harder to classify. Ensemble methods, which combine the results of multiple models, can often reduce the bias towards any particular class.
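A brief sketch of how these alternatives could be applied with the imbalanced-learn library is given below; the feature matrix and labels are synthetic placeholders, and in practice the inputs would be flattened images or CNN-derived embeddings rather than random values.

```python
import numpy as np
from imblearn.over_sampling import SMOTE, ADASYN

# Placeholder feature matrix and labels standing in for flattened images or
# CNN embeddings; the study's actual features are not reproduced here.
rng = np.random.default_rng(0)
X_features = rng.normal(size=(200, 64))
y = np.array([0] * 150 + [1] * 50)   # imbalanced two-class toy labels

# SMOTE interpolates between neighbouring minority samples instead of
# duplicating them, producing new synthetic points.
X_smote, y_smote = SMOTE(random_state=42).fit_resample(X_features, y)

# ADASYN biases generation towards minority samples that are harder to learn.
X_adasyn, y_adasyn = ADASYN(random_state=42).fit_resample(X_features, y)

print(np.bincount(y_smote), np.bincount(y_adasyn))
```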
The overall accuracy of the model is 88.2%, reflecting a generally strong performance. The macro and weighted averages further confirm that the model is performing well across different classes, with the macro average indicating balanced performance across all classes and the weighted average accounting for class imbalances. The classification report highlights that while the model excels in several classes, attention should be given to improving the classification of CVMS 5 to enhance overall robustness.
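For reference, the overall accuracy together with the macro and weighted averages corresponds to the output of a standard classification report; the sketch below shows how these figures are computed with scikit-learn on illustrative labels rather than the study's actual predictions.

```python
from sklearn.metrics import accuracy_score, classification_report

# Illustrative true and predicted CVMS stage labels; not the study's outputs.
y_true = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6]
y_pred = [1, 1, 2, 2, 3, 3, 4, 4, 5, 4, 6, 5]

# The macro average weights every class equally, while the weighted average
# scales each class's score by its support, reflecting class imbalance.
print("Accuracy:", accuracy_score(y_true, y_pred))
print(classification_report(y_true, y_pred, digits=3))
```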