Improved Prediction of Clinical Pregnancy Using Artificial Intelligence with Enhanced Inner Cell Mass and Trophectoderm Images

doi:10.21203/rs.3.rs-3204889/v1

This work is licensed under a CC BY 4.0 License

Journal Publication

published 08 Feb, 2024

Read the published version in Scientific Reports →

You are reading this latest preprint version

This study aimed to assess the performance of an artificial intelligence (AI) model for predicting clinical pregnancy using enhanced inner cell mass (ICM) and trophectoderm (TE) images. In this retrospective study, we included static images of 2,555 day-5-blastocysts from seven in vitro fertilization centers in South Korea. The main outcome of the study was the predictive capability of the model to detect clinical pregnancies (gestational sac). Compared to the original embryo images, the use of enhanced ICM and TE images improved the average area under the receiver operating characteristic curve for the AI model from 0.716 to 0.741. Additionally, a gradient-weighted class activation mapping analysis demonstrated that the AI model utilizing the enhanced ICM and TE images was able to extract features from crucial areas of the embryo, including the ICM and TE, in 99% (506/512) of the cases. In contrast, the AI model trained on the original images focused on the main areas in only 86% (438/512) of the cases. Our results highlight the potential efficacy of utilizing ICM- and TE-enhanced embryo images in AI models for the prediction of clinical pregnancy.

Health sciences/Health care/Medical imaging

Health sciences/Diseases/Reproductive disorders/Infertility

Globally, one in six couples suffers from infertility, but the success rate of in vitro fertilization (IVF) remains low at 20–30% ^1–3. Embryo quality is one of the most important factors contributing to a successful IVF cycle. Because single-embryo transfer (SET) is frequently performed to avoid the complications of multiple pregnancies, selecting the best-quality embryo has become essential.

Currently, the most widely adopted method for embryo assessment is morphological evaluation by embryologists, as it is non-invasive, does not require additional instruments, and is strongly correlated with reproductive outcomes ⁴. Several guidelines have been proposed to assess cleavage-stage embryos, morulae, and blastocysts, and the Gardner scale is typically used to evaluate blastocysts in IVF laboratories ^5,6. This scale is based on the morphology of the inner cell mass (ICM), trophectoderm (TE), and developmental stage of blastocysts. In particular, cell structures such as the ICM and TE determine embryonic cell fate, as the ICM gives rise to the fetus and the TE forms the placenta ⁷. On the Gardner scale, high-grade ICMs are prominent and easily discernible, with many cells compacted and tightly packed together. High-grade TEs are characterized by the formation of a cohesive epithelium comprising many cells.

Several studies have noted the limitations of the subjective nature of morphological scores evaluated by embryologists ^8,9. For example, a high degree of inter- and intra-observer variability among embryologists in scoring embryos has been reported. Furthermore, the evaluation criteria may not include all the features representative of embryo quality. Recently, artificial intelligence (AI) techniques have gained attention for embryo evaluation. Embryo images captured from standard microscopes or time-lapse incubators have been used, along with relevant clinical data, to train AI models. Such a model provides a predictive value for clinical pregnancy to help embryologists select embryos for transfer. Although further clinical validation is required, recent studies have reported performance levels higher than or comparable to those of human experts in predicting clinical pregnancies ¹⁰.

However, the variable quality and focus of images are common pitfalls of the AI models that have been presented thus far. Most pregnancy prediction models use blastocyst images, and previous studies have established a correlation between the morphology of the ICM and TE and clinical pregnancy ^11–14. In a previous study, reasonable and stable interpretations were achieved by paying adequate attention to the ICM and TE regions ¹⁵. Additionally, in other areas of medical image analysis, the extraction of meaningful information from relevant areas was reported to be critical ^16,17. Based on this, we hypothesized that the performance of clinical pregnancy models may be enhanced by segmentation guidance on the ICM/TE regions.

In this study, we propose a novel algorithm that reduces the incidence of erroneous predictions by generating enhanced images from segmented ICM and TE images, which are then used to train an AI model. This is the first report to prove that the performance of the AI model can be enhanced by focusing on the ICM and TE regions. Our proposed method was validated using gradient-weighted class activation mapping (Grad-CAM), which emphasizes that the ICM and TE regions are critical for predicting pregnancy.

The t-test results in Table 1 indicated that the average age of the negative group (36.2 years) was significantly higher than that of the positive group (34.0 years). This age difference was also observed in each of the five laboratory groups (A–E), with the negative group consistently having a higher average age than the positive group; the difference was statistically significant in all five groups.

Table 1

Baseline distribution of embryos, and female patient ages in each IVF laboratory
Laboratory	Negative pregnancy: N	Negative pregnancy: age (years), mean ± SD	Positive pregnancy: N	Positive pregnancy: age (years), mean ± SD	P-value
Total	1743	36.2 ± 3.9	812	34.0 ± 3.1	< 0.001
A	931	36.3 ± 4.0	339	33.4 ± 2.9	< 0.001
B	309	35.8 ± 4.3	216	34.0 ± 3.2	< 0.001
C	325	36.5 ± 3.1	77	34.7 ± 3.0	< 0.001
D	95	36.6 ± 3.6	136	35.6 ± 2.9	0.027
E	83	35.3 ± 4.3	44	33.5 ± 3.8	0.023
IVF, in vitro fertilization; N, sample size; SD, standard deviation.
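
As a rough check of the Total row in Table 1, the age comparison can be recomputed from the summary statistics alone. The sketch below assumes Welch's unequal-variance form of the t-test; the source does not state which variant was used, and the function name is illustrative.

```python
import math

def welch_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom.

    Assumption: unequal-variance t-test; the paper only says "t-test".
    """
    v1 = sd1 ** 2 / n1  # squared standard error of group 1
    v2 = sd2 ** 2 / n2  # squared standard error of group 2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Total row of Table 1: negative group vs. positive group.
t_total, df_total = welch_t_from_summary(36.2, 3.9, 1743, 34.0, 3.1, 812)
print(f"t = {t_total:.2f}, df = {df_total:.0f}")
```

A t statistic this large on roughly two thousand degrees of freedom is consistent with the reported P-value of < 0.001.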

The areas under the receiver operating characteristic curves (AUROCs) of the AI models were compared using original embryo images and enhanced ICM and TE images. We utilized three representative pre-trained models, DenseNet121 ¹⁸, VGG16 ¹⁹, and ResNet50 ²⁰, which had been trained on the large-scale ImageNet dataset ²¹, as our convolutional neural network (CNN) architectures and fine-tuned them on our embryo images. Our results revealed that the ResNet50 architecture achieved the highest performance and that all three models performed better when trained using enhanced ICM and TE images with age information.

The best performance was obtained with the enhanced ICM and TE images in the ResNet50 architecture, with a mean (standard deviation) AUROC of 0.741 (0.014) (Table 2). The original images also performed best in the ResNet50 architecture, with a mean (standard deviation) AUROC of 0.716 (0.019). The boxplots in Fig. 1 show that the enhanced ICM and TE images had better AUROC values than the original images across all three models. The lower quartile (Q1) of the enhanced ICM and TE images was higher than the upper quartile (Q3) of the original images, indicating better performance based on the AUROC metric.

Using Grad-CAM, we analyzed the regions that the CNN model relied on when predicting pregnancy from the proposed reconstructed image and from the original image. Our findings indicated that the ICM and TE regions were learned more intensively when using the reconstructed images instead of the original images. The Grad-CAM results in Figs. 2A and B illustrate a case where the original image led to an incorrectly predicted outcome, while the respective reconstructed image led to a correct prediction. In the original image, the features from the ICM and TE regions could not be extracted, leading to an incorrect prediction. Conversely, the reconstructed image model was trained to focus on the ICM and TE regions, which produced an accurate prediction. In the case of Figs. 2C and D, however, the features in the ICM and TE regions within the embryo could be adequately identified in both the original and reconstructed images, yet the predicted outcome was still incorrect.

Table 2

Performance of deep learning models for G-SAC prediction
Input image	CNN model	AUROC	Accuracy	Sensitivity	Specificity
Original image	VGG16	0.663 (0.014)	0.652 (0.008)	0.690 (0.048)	0.636 (0.025)
Original image	ResNet50	0.716 (0.019)	0.663 (0.012)	0.705 (0.067)	0.645 (0.020)
Original image	DenseNet121	0.653 (0.004)	0.663 (0.013)	0.684 (0.049)	0.654 (0.033)
Reconstructed image	VGG16	0.684 (0.013)	0.654 (0.005)	0.760 (0.042)	0.609 (0.020)
Reconstructed image	ResNet50	0.741 (0.014)	0.682 (0.015)	0.714 (0.042)	0.669 (0.028)
Reconstructed image	DenseNet121	0.669 (0.004)	0.669 (0.024)	0.694 (0.033)	0.648 (0.032)
Values are mean (standard deviation). G-SAC, gestational sac; CNN, convolutional neural network; AUROC, area under the receiver operating characteristic curve.

We reviewed the Grad-CAM results for all 512 test-set cases, grouped into four categories: both the original and enhanced ICM and TE images correctly predicted (n = 314), both incorrectly predicted (n = 106), only the original image correctly predicted (n = 35), and only the enhanced ICM and TE image correctly predicted (n = 57). The proportion of images for which the model focused on the ICM and TE regions increased from 85.5% (438/512) to 98.8% (506/512) when the enhanced ICM and TE images were used (Table 3).

Table 3

Grad-CAM results
Prediction outcome	Original image, focus on ICM/TE: N (%)	Original image, focus on other regions: N (%)	Enhanced ICM and TE image, focus on ICM/TE: N (%)	Enhanced ICM and TE image, focus on other regions: N (%)
Both images correct N = 314	270 (86.0%)	44 (14.0%)	312 (99.4%)	2 (0.6%)
Both images incorrect N = 106	93 (87.7%)	13 (12.3%)	102 (96.2%)	4 (3.8%)
Only the original image correct N = 35	29 (82.9%)	6 (17.1%)	35 (100.0%)	0 (0.0%)
Only enhanced ICM and TE images correct N = 57	46 (80.7%)	11 (19.3%)	57 (100.0%)	0 (0.0%)
Total N = 512	438 (85.5%)	74 (14.5%)	506 (98.8%)	6 (1.2%)
Grad-CAM, gradient-weighted class activation mapping; ICM, inner cell mass; TE, trophectoderm.
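
The totals in Table 3 can be re-derived from the per-category counts. The short tally below simply transcribes those counts; the dictionary layout and the function name are illustrative choices, not part of the study's code.

```python
# Counts transcribed from Table 3: (ICM/TE focus, other focus) per category.
original = {
    "both correct": (270, 44),
    "both incorrect": (93, 13),
    "only original correct": (29, 6),
    "only enhanced correct": (46, 11),
}
enhanced = {
    "both correct": (312, 2),
    "both incorrect": (102, 4),
    "only original correct": (35, 0),
    "only enhanced correct": (57, 0),
}

def focus_rate(table):
    """Return (ICM/TE-focused count, total count, percentage) for one model."""
    focused = sum(f for f, _ in table.values())
    total = sum(f + o for f, o in table.values())
    return focused, total, round(100.0 * focused / total, 1)

print(focus_rate(original))  # (438, 512, 85.5)
print(focus_rate(enhanced))  # (506, 512, 98.8)
```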

This study aimed to determine the effect of segmentation guidance on AI performance in predicting clinical pregnancy using embryo images. AI predictive performance improved when the model was guided by ICM and TE segmentation. Although the tendency of AI to overlook the appropriate areas in embryo images has been previously pointed out as an issue, there have been no reported attempts to solve it. To the best of our knowledge, this is the first study to verify improvements in AI predictive performance using ICM and TE segmentation.

Segmentation technology is commonly employed in various medical imaging domains. The utilization of segmentation technology in the computer-aided diagnosis of medical images is increasingly recognized as a valuable and critical aspect of the field of medicine. This approach allows for the effective utilization of large volumes of medical data while mitigating the risk of misdiagnosis resulting from subjective visual observations ^22–24. Research in other medical imaging areas has indicated that AI trained on datasets simplified through segmentation has higher diagnostic performance by focusing on the regions of interest ^17,25,26. Although the operating mechanism of deep learning algorithms remains elusive, it is widely accepted that the less complex the data, the higher the learning efficiency that AI can achieve under the same conditions ²⁷.

The ICM and TE morphologies have long been used worldwide to evaluate and screen embryos. Since these structures play important roles in determining the fate of cells during embryonic differentiation, many studies have investigated the correlation between their morphology and pregnancy rates ^11,13. Several studies have shown that the ICM and TE are also strongly related to the live birth rate, and the ICM is known as a major factor that can predict euploid embryos ^14,28. It seems that, despite expecting the AI model to learn and accurately represent the morphology of ICM and TE by training it with segmented images, the results were not as good as anticipated. In particular, when explaining the inferences of the deep learning model using techniques such as Grad-CAM, we noticed that the model often fails to focus on the ICM or TE. However, using the enhanced ICM and TE images proposed in this study enhances the emphasis on the morphology of such embryonic structures for the deep learning model. This approach provides a strong basis for creating an AI system that can predict clinical pregnancy by incorporating relevant clinical domain knowledge.

One of the major strengths of this study is its high-quality dataset. It is known that the key to developing an AI model is to train it with a sufficient volume of good data, and this study used nationally curated data that was well-refined and labeled from embryo images collected from multiple institutions, which was endorsed by a third-party inspector. Since the dataset was created as part of a government-funded project, quality control was adequately performed, and the original dataset is scheduled to be made public in the future, ensuring external reliability.

The additional task of investigating the incorrectly predicted cases was undertaken in this study, particularly in the four categories shown in Table 3. Two laboratory directors closely examined the original and enhanced ICM and TE images. For clear images, both the original model and the enhanced ICM and TE models correctly predicted the actual outcome. For less-clear images, the segmentation model outperformed the original model. For inherently messy images, both models failed to predict pregnancies. Of the 75 messy images, 25 showed irregular contours of the zona pellucida, embedded sperm in the zona, or cytoplasmic darkness, and the AI may have misinterpreted them as fragmentation. The use of well-focused and clear images ensures a fair performance of the AI model. Furthermore, image reconstruction using ICM and TE segmentation was validated and may be helpful for less-clear images to a certain extent. However, for cases where the original model outperformed the segmentation model, no specific pattern was found in Grad-CAM, and potential explanations include variables such as clinical or genetic information that may help overcome the limitations of morphological assessment.

The Hosmer–Lemeshow test was conducted to assess calibration; the uncalibrated model showed a significant lack of fit (p-value < 0.001), with a Brier score of 0.241. To address this issue, we applied isotonic regression and achieved a well-calibrated model with a p-value of 0.265 and a Brier score of 0.178. A calibration plot is shown in Fig. 3. However, even after applying isotonic regression, the true positive rate decreased in the range of high predicted probabilities (0.7–0.9). This was found to be due to incorrectly predicted cases in which the AI model, after being successfully trained on ICM and TE images, predicted good-quality embryos, but pregnancy did not occur because of various complex factors. It is important to note that calibration can be more difficult for small sample sizes because there may be insufficient information to accurately estimate the mapping between the predicted scores and probabilities. In such cases, it may be necessary to collect additional data or employ alternative methods to enhance calibration. Therefore, in future research, we aim to collect a global dataset that includes both domestic and foreign data and to develop a model that is more accurate and robust.
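
To make the recalibration step concrete, the following is a minimal sketch of isotonic regression by the pool-adjacent-violators procedure together with the Brier score. It is not the authors' code: the function names and the toy scores are hypothetical, and the mapping is fitted and evaluated on the same toy data purely for illustration, whereas in practice it would be fitted on held-out data.

```python
from collections import defaultdict

def brier_score(probs, labels):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(labels)

def fit_isotonic(scores, labels):
    """Pool-adjacent-violators fit; returns (upper score, calibrated value) steps."""
    grouped = defaultdict(lambda: [0.0, 0])  # pool tied scores first
    for s, y in zip(scores, labels):
        grouped[s][0] += y
        grouped[s][1] += 1
    blocks = []  # each block: [sum of labels, count, largest score in block]
    for s in sorted(grouped):
        total, count = grouped[s]
        blocks.append([total, count, s])
        # Merge backwards while the previous block mean exceeds the current one.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count, top = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = top
    return [(top, total / count) for total, count, top in blocks]

def predict_isotonic(steps, score):
    """Map a raw score to the value of the first step that covers it."""
    for top, value in steps:
        if score <= top:
            return value
    return steps[-1][1]

# Toy data (hypothetical): raw model scores and observed outcomes.
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 1, 0, 1, 0, 1, 1]
steps = fit_isotonic(scores, labels)
calibrated = [predict_isotonic(steps, s) for s in scores]
brier_before = brier_score(scores, labels)
brier_after = brier_score(calibrated, labels)
print(brier_before, brier_after)
```

The fitted mapping is a non-decreasing step function, so the ranking of embryos by score is preserved up to ties and only the probability scale changes.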

To generalize our results, an international study is required, as our study was conducted in a population with the same racial background. In addition, because our dataset consisted of images from multiple institutions, the color and sharpness of the images varied slightly. Although this allows for better performance in terms of validation compared to training in a single institution, it is still difficult to guarantee the same performance when unfamiliar, heterogeneous images are processed. In addition, to automate our method, we must develop an algorithm that automatically segments the ICM and TE areas. In future research, we plan to develop an ICM and TE segmentation model that automatically segments these areas from the original embryo image, creates an enhanced ICM and TE image, and feeds it as input to the AI model.

In conclusion, this study demonstrates that the predictive performance of AI improves when using enhanced ICM and TE images. AI can provide objective information to empower the assessment of embryologists; however, it often analyzes irrelevant parts of images, leading to incorrect results, particularly when the image is out of focus. Our research findings may help generalize the AI model for application to embryo images with various focuses. Further research can serve as the first step towards transparent AI models for embryo assessment.

Study Design and Data Preparation

In this retrospective study, conducted between June 2011 and May 2022, a total of 8,646 blastocyst images were collected on day 5 from seven IVF clinics. Images were captured using an inverted microscope or stereomicroscope before embryo transfer. Blastocysts from fresh and frozen transfers were matched to clinical pregnancy outcomes, as determined by the presence of a gestational sac at 4–6 weeks. Blastocysts from multiple embryo transfers with positive pregnancy outcomes were excluded to ensure that each image could be matched to its pregnancy outcome. Images from clinics without pregnancy information were excluded, and those from the remaining clinics were divided into four groups (A, B, C, and D) with at least 200 images each; the remaining images from the other three clinics were combined into one group (E) because of insufficient data for statistical analysis. A total of 2,555 images were selected for this study and used to develop the algorithms through image preprocessing and model learning. Figure 4 illustrates the overall algorithm development process across the five groups (A–E).

The study protocol was approved by the Institutional Review Boards (IRBs) of the following institutions: Miraewaheemang Hospital (IRB No. 2022-RESEARCH-01), Good Moonhwa Hospital (IRB No. GMH-2022-01), HI Fertility Center (IRB No. HIRB 2022-01), Seoul Rachel Fertility Center (IRB No. RTR-2022-01), Ajou University Hospital (IRB No. AJIRB-MED-MDB-21-716), Pusan National University Hospital (IRB No. 2204-003-113), and Seoul National University Bundang Hospital (IRB No. B-2208-772-104). Informed consent was waived by the IRBs of the same institutions because this study was retrospective and the personal information in the data was blinded. The present study was designed and conducted in accordance with the relevant guidelines and regulations of the ethical principles for medical research involving human subjects, as stated by the WMA Declaration of Helsinki.

Generation of Enhanced ICM and TE Images

The day-5 blastocyst images collected from the IVF clinics were manually annotated by trained personnel to identify the ICM and TE regions, and the accuracy of the annotation was manually examined first by embryologists and then by lab directors with over 20 years of experience. The coordinates of the ICM and TE regions were stored as JSON files, and the Python library OpenCV (version 3.3.1) was then employed to generate segmented images of the ICM and TE regions from the original images. Two experiments were conducted to evaluate the performance of the proposed method. In the first experiment, pregnancy was predicted using only the original embryo microscopy images. In the second experiment, we predicted pregnancy after generating enhanced images using the ICM and TE segmentation images. Enhanced ICM and TE images were created by combining three grayscale images: the original embryo image, the ICM image, and the TE image. Instead of using the red, green, and blue channels of a regular color image, each image was converted to grayscale and treated as a single channel. These three grayscale images were then merged as the three channels of the final enhanced ICM and TE image. Because conventional convolutional neural networks are designed to process images with three channels, merging the grayscale images into three channels keeps the input aligned with the original image format (Fig. 5).
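
The channel-merging step described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' pipeline: it uses NumPy instead of OpenCV, assumes the ICM and TE images are the original grayscale image with pixels outside the annotated region set to zero, and picks an arbitrary channel order; `make_enhanced_image` and the toy masks are hypothetical.

```python
import numpy as np

def make_enhanced_image(original_gray, icm_mask, te_mask):
    """Stack original, ICM-only, and TE-only grayscale images as three channels.

    Assumptions: masks are boolean arrays with the same shape as the image,
    and pixels outside each annotated region are zeroed.
    """
    icm_gray = np.where(icm_mask, original_gray, 0).astype(original_gray.dtype)
    te_gray = np.where(te_mask, original_gray, 0).astype(original_gray.dtype)
    # Channel order (original, ICM, TE) is an assumption of this sketch.
    return np.stack([original_gray, icm_gray, te_gray], axis=-1)

# Toy 4x4 "embryo" with a central ICM region and a TE region along the top row.
original = np.arange(16, dtype=np.uint8).reshape(4, 4) * 10
icm_mask = np.zeros((4, 4), dtype=bool)
icm_mask[1:3, 1:3] = True
te_mask = np.zeros((4, 4), dtype=bool)
te_mask[0, :] = True

enhanced = make_enhanced_image(original, icm_mask, te_mask)
print(enhanced.shape)
```

The result has the same height and width as the input and three channels, so it can be fed to a network that expects a color image without changing the first layer.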

Training Data Split and Image Preprocessing

A total of 2,555 images were divided into training data (2,043 images, 80%) and model performance test data (512 images, 20%). We then used 3-fold cross-validation, dividing the 2,043 training images into three folds. Each fold was used in turn for validation while the other two were used for training, and performance was evaluated on the fixed model performance test data. Table 4 illustrates the overall data split and distribution. Before applying the enhanced ICM and TE images to the deep-learning model, image pre-processing was performed to make them suitable for training. During the image pre-processing step, all pixel values of the image were normalized, and the image was resized to 224 × 224 pixels. Since the sample size of the image dataset was not sufficiently large, we attempted to improve the efficiency of image classification by applying image augmentation ^29,30. Augmenting the images with transformations made the model more robust by allowing it to learn a greater variety of images during training. To perform image augmentation, TensorFlow 2.10.0, a deep-learning neural network API in Python ³¹, was used with the following options:

Random brightness
Random saturation
Random contrast
Flip images vertically
Flip images horizontally

In addition, since a pre-trained model was used, the image was resized to fit the learned size of each pre-trained model.
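
The pre-processing and augmentation steps can be illustrated with a small stand-in written in NumPy. The study used TensorFlow's image operations; the version below only mimics their effect, the magnitude ranges are assumptions because the source does not report them, and the saturation step is approximated by blending each pixel with its channel mean.

```python
import numpy as np

def preprocess(image, size=224):
    """Scale pixels to [0, 1] and resize with nearest-neighbour sampling."""
    img = image.astype(np.float32) / 255.0
    rows = np.arange(size) * img.shape[0] // size
    cols = np.arange(size) * img.shape[1] // size
    return img[rows][:, cols]

def augment(img, rng):
    """Apply the five listed random transformations to an HxWx3 image in [0, 1]."""
    # Random brightness: add a constant offset (range is an assumption).
    img = img + rng.uniform(-0.1, 0.1)
    # Random saturation (approximation): blend each pixel with its channel mean.
    gray = img.mean(axis=-1, keepdims=True)
    img = gray + rng.uniform(0.8, 1.2) * (img - gray)
    # Random contrast: scale deviations from the per-channel image mean.
    mean = img.mean(axis=(0, 1), keepdims=True)
    img = mean + rng.uniform(0.8, 1.2) * (img - mean)
    # Random vertical and horizontal flips, each with probability 0.5.
    if rng.random() < 0.5:
        img = img[::-1, :, :]
    if rng.random() < 0.5:
        img = img[:, ::-1, :]
    return np.clip(img, 0.0, 1.0)

rng = np.random.default_rng(0)
raw = rng.integers(0, 256, size=(300, 400, 3), dtype=np.uint8)  # synthetic image
augmented = augment(preprocess(raw), rng)
print(augmented.shape)
```

Augmentation of this kind is applied only to training images; validation and test images would pass through `preprocess` alone.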

Table 4

Overall dataset split ratio and 3-fold cross-validation set composition
Fold 1 (n = 2043)	Fold 2 (n = 2043)	Fold 3 (n = 2043)	Test (n = 512)
Validation (n = 681)	Train (n = 681)	Train (n = 681)	Test
Train (n = 681)	Validation (n = 681)	Train (n = 681)	Test
Train (n = 681)	Train (n = 681)	Validation (n = 681)	Test
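
The split in Table 4 can be reproduced schematically as follows. This is a sketch with hypothetical names; it shuffles indices at random, whereas the source does not say whether the split was random, stratified by outcome, or grouped by clinic.

```python
import random

def split_dataset(n_total=2555, n_test=512, n_folds=3, seed=0):
    """Hold out a fixed test set, then build k-fold train/validation splits."""
    indices = list(range(n_total))
    random.Random(seed).shuffle(indices)
    test = indices[:n_test]
    train = indices[n_test:]
    fold_size = len(train) // n_folds
    folds = [train[i * fold_size:(i + 1) * fold_size] for i in range(n_folds)]
    splits = []
    for k in range(n_folds):
        validation = folds[k]
        training = [i for j, fold in enumerate(folds) if j != k for i in fold]
        splits.append((training, validation))
    return splits, test

splits, test = split_dataset()
for training, validation in splits:
    print(len(training), len(validation), len(test))
```

With 2,043 training images, each fold holds 681 images, so every round trains on 1,362 images and validates on 681 while the 512 test images stay untouched.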

Statistical analysis

To determine whether there was a significant difference in age between the negative and positive pregnancy groups, a t-test was performed on the entire dataset. Furthermore, binary classifiers were used to estimate the probability of an instance belonging to the positive class using prediction scores; as these scores are often poorly calibrated, they may not accurately reflect the true probabilities ³². The Hosmer–Lemeshow test was used to compare the expected and observed frequencies of the positive class to assess calibration ³³.
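
For readers unfamiliar with the Hosmer–Lemeshow test, a minimal sketch is given below. It assumes the common choice of ten equal-sized groups ordered by predicted probability and a chi-square reference distribution with groups − 2 degrees of freedom; the source does not report the number of groups, and all names and the simulated data are hypothetical.

```python
import math
import random

def chi2_sf_even(x, df):
    """Chi-square survival function; closed form valid for even df only."""
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

def hosmer_lemeshow(probs, labels, groups=10):
    """Return the Hosmer-Lemeshow statistic and p-value (groups - 2 d.o.f.)."""
    order = sorted(range(len(probs)), key=lambda i: probs[i])
    stat = 0.0
    for g in range(groups):
        idx = order[g * len(order) // groups:(g + 1) * len(order) // groups]
        n = len(idx)
        expected = sum(probs[i] for i in idx)   # expected positives in group
        observed = sum(labels[i] for i in idx)  # observed positives in group
        mean_p = expected / n
        stat += (observed - expected) ** 2 / (n * mean_p * (1.0 - mean_p))
    return stat, chi2_sf_even(stat, groups - 2)

# Simulated, well-calibrated predictions (hypothetical data).
rng = random.Random(0)
probs = [(i + 0.5) / 500 for i in range(500)]
labels = [1 if rng.random() < p else 0 for p in probs]
stat, p_value = hosmer_lemeshow(probs, labels)
print(stat, p_value)
```

A small p-value indicates that observed and expected frequencies disagree, that is, poor calibration; a large p-value gives no evidence of miscalibration.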

Model Development and Evaluation

For the convolutional neural network model, we utilized the original image and the enhanced ICM and TE images and compared their respective performances. Owing to the limited sample size, we fine-tuned pre-trained models rather than training from scratch and used 224 × 224 images, the same input size as that used for ImageNet. In addition, the age of the patient was concatenated to the last fully connected layer of the architecture, and the network was configured to return the predicted pregnancy value through a sigmoid layer. The complete process from image input to predicted value return is illustrated in Fig. 6. Model training was conducted on a machine equipped with an Intel Xeon CPU @ 2.10 GHz and an RTX3090 (24 GB) GPU, utilizing the Python programming language (version 3.8.0).
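
The way age enters the prediction can be pictured with a toy forward pass. The sketch below is not the trained network: it assumes a 2048-dimensional image feature vector (the size of ResNet50's pooled output), a single sigmoid unit after concatenation, and an arbitrary age scaling; weights are random and all names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict_pregnancy(image_features, age_years, weights, bias):
    """Concatenate scaled age to the image features, then apply a sigmoid unit."""
    # Dividing age by 100 is an assumed scaling, not taken from the source.
    x = np.concatenate([image_features, [age_years / 100.0]])
    return float(sigmoid(x @ weights + bias))

rng = np.random.default_rng(0)
features = rng.normal(size=2048)            # stand-in for CNN image features
weights = rng.normal(scale=0.01, size=2049)  # one extra weight for age
prob = predict_pregnancy(features, 35.0, weights, bias=0.0)
print(prob)
```

The only structural point illustrated is that the final layer has one more input than the image feature vector, so the learned weight on that input carries the contribution of age.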

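To evaluate the performance of our algorithm, we used four key metrics: sensitivity, specificity, accuracy, and the AUROC. The first three are defined from the confusion-matrix counts as follows; the AUROC is the area under the curve of sensitivity plotted against 1 − specificity across all decision thresholds.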

TP = number of true positive samples

TN = number of true negative samples

FP = number of false positive samples

FN = number of false negative samples

Sensitivity =\(\frac{TP}{TP+FN}\)

Specificity =\(\frac{TN}{TN+FP}\)

Accuracy =\(\frac{TP+TN}{TP+FP+TN+FN}\)
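
The three formulas above, together with the AUROC, can be checked on a small example. The sketch computes the AUROC through its rank interpretation, the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case; the labels, scores, and the 0.5 threshold are made-up illustration values.

```python
def confusion_counts(labels, preds):
    """Return (TP, TN, FP, FN) for binary labels and binary predictions."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    return tp, tn, fp, fn

def auroc(labels, scores):
    """Fraction of positive-negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

labels = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.6]
preds = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

tp, tn, fp, fn = confusion_counts(labels, preds)
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
accuracy = (tp + tn) / (tp + fp + tn + fn)
print(sensitivity, specificity, accuracy, auroc(labels, scores))
# prints: 0.75 0.75 0.75 0.875
```

Unlike the other three metrics, the AUROC does not depend on the chosen threshold, which is why it is used as the headline comparison in Table 2.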

Grad-CAM is a technique used in the field of computer vision to understand and visualize important image regions contributing to deep neural network prediction. It aims to generate a heat map that highlights the input image regions that are the most important for a particular class prediction. By generating these heat maps, Grad-CAM provides insights into the internal workings of deep neural networks, allowing researchers and practitioners to interpret and understand the features learned by the network. In this study, two embryologists with more than 20 years of expertise as lab directors visually inspected Grad-CAM and confirmed whether the AI model exhibited a higher focus on the ICM and TE regions when utilizing enhanced ICM and TE images.
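
The core Grad-CAM computation is compact enough to sketch. Given the activations of the last convolutional layer and the gradients of the predicted score with respect to them, each channel is weighted by its spatially averaged gradient, the weighted channels are summed, and negative values are discarded. The arrays below are random stand-ins with the 7 × 7 × 2048 shape that ResNet50 produces for a 224 × 224 input; in practice both would come from the trained model through automatic differentiation.

```python
import numpy as np

def grad_cam(activations, gradients):
    """Grad-CAM heat map from (H, W, K) activations and matching gradients."""
    weights = gradients.mean(axis=(0, 1))               # one weight per channel
    cam = (activations * weights).sum(axis=-1)          # weighted channel sum
    cam = np.maximum(cam, 0.0)                          # ReLU keeps positive evidence
    if cam.max() > 0:
        cam = cam / cam.max()                           # scale to [0, 1] for display
    return cam

rng = np.random.default_rng(0)
activations = rng.normal(size=(7, 7, 2048))  # stand-in for last conv feature maps
gradients = rng.normal(size=(7, 7, 2048))    # stand-in for d(score)/d(activations)
heatmap = grad_cam(activations, gradients)
print(heatmap.shape)
```

The coarse map is then upsampled to the input resolution and overlaid on the embryo image, which is the form in which it would be inspected visually.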

ACKNOWLEDGMENTS

This research used datasets from The Open AI Dataset Project (AI Hub, South Korea). All data information can be accessed through ‘AI-Hub’ (www.aihub.or.kr). The authors thank Yoon Ha Kim M.D., Ph.D., Seul Ki Kim M.D., Jongkil Joo M.D., Ph.D., and all those in charge of the institutions involved who assisted with data collection and inspection.

Author contributions

Conceptualization: H.M.K., and H.J.L. Methodology: T.K., H.K., S.C., J.H.P., and M.K.C. Software: H.M.K. Validation: J.H.P., M.K.C., and H.J.L. Formal Analysis: H.M.K. Acquisition of data: J.H.P., M.K.C., M.K., and N.Y.K. Writing—Original Draft Preparation: H.M.K. Writing—Review and Editing: All authors. Visualization: H.M.K. Supervision: H.J.L. All authors have read and agreed to the published version of the manuscript.

Funding Statement

This study received no specific grant funding from any funding agency in the public, commercial or not-for-profit sectors.

Competing Interests

H.M.K., H.K. and S.C. have nothing to disclose. T.K. reports stock options and consulting fees from Kai Health. J.H.P and M.K.C report consulting fees from Kai Health. M.K. and N.Y.K have nothing to disclose. H.J.L reports stock options from Kai Health.

Data Availability Statement

All data information can be accessed through ‘AI-Hub’ (www.aihub.or.kr).

References

1. Wade, J. J., MacLachlan, V. & Kovacs, G. The success rate of IVF has significantly improved over the last decade. Aust. New Zeal. J. Obstet. Gynaecol. 55, 473–476 (2015).
2. De Mouzon, J. et al. Assisted reproductive technology in Europe, 2006: results generated from European registers by ESHRE. Hum. Reprod. 25, 1851–1862 (2010).
3. Meldrum, D. R., Silverberg, K. M., Bustillo, M. & Stokes, L. Success Rate with Repeated Cycles of In Vitro Fertilization–Embryo Transfer. Fertil. Steril. 69, 1005–1009 (1998).
4. Gardner, D. K., Lane, M. & Schoolcraft, W. B. Culture and transfer of viable blastocysts: a feasible proposition for human IVF. Hum. Reprod. 15 Suppl 6, 9–23 (2000).
5. Balaban, B. et al. The Istanbul consensus workshop on embryo assessment: Proceedings of an expert meeting. Hum. Reprod. 26, 1270–1283 (2011).
6. Gardner, D. K. & Schoolcraft, W. B. Culture and transfer of human blastocysts. Curr. Opin. Obstet. Gynecol. 11, 307–311 (1999).
7. Piliszek, A., Grabarek, J. B., Frankenberg, S. R. & Plusa, B. Cell fate in animal and human blastocysts and the determination of viability. Mol. Hum. Reprod. 22, 681–690 (2016).
8. Storr, A., Venetis, C. A., Cooke, S., Kilani, S. & Ledger, W. Inter-observer and intra-observer agreement between embryologists during selection of a single Day 5 embryo for transfer: A multicenter study. Hum. Reprod. 32, 307–314 (2017).
9. Bormann, C. L. et al. Consistency and objectivity of automated embryo assessments using deep neural networks. Fertil. Steril. 113, 781-787.e1 (2020).
10. Ver Milyea, M. et al. Development of an artificial intelligence-based assessment model for prediction of embryo viability using static images captured by optical light microscopy during IVF. Hum. Reprod. 35, 770–784 (2021).
11. Ai, J. et al. The Morphology of Inner Cell Mass Is the Strongest Predictor of Live Birth After a Frozen-Thawed Single Embryo Transfer. Front. Endocrinol. (Lausanne). 12, 1–10 (2021).
12. Bakkensen, J. B. et al. Association between blastocyst morphology and pregnancy and perinatal outcomes following fresh and cryopreserved embryo transfer. J. Assist. Reprod. Genet. 36, 2315–2324 (2019).
13. Chen, X. et al. Trophectoderm morphology predicts outcomes of pregnancy in vitrified-warmed single-blastocyst transfer cycle in a Chinese population. J. Assist. Reprod. Genet. 31, 1475–1481 (2014).
14. Irani, M. et al. Morphologic grading of euploid blastocysts influences implantation and ongoing pregnancy rates. Fertil. Steril. 107, 664–670 (2017).
15. Loewke, K. et al. Characterization of an artificial intelligence model for ranking static images of blastocyst stage embryos. Fertil. Steril. 117, 528–535 (2022).
16. Alakwaa, W., Nassef, M. & Badr, A. Lung cancer detection and classification with 3D convolutional neural network (3D-CNN). Int. J. Biol. Biomed. Eng. 11, 66–73 (2017).
17. Min Kim, H., Ko, T., Young Choi, I. & Myong, J. P. Asbestosis diagnosis algorithm combining the lung segmentation method and deep learning model in computed tomography image. Int. J. Med. Inform. 158, 104667 (2022).
18. Huang, G., Liu, Z., Van Der Maaten, L. & Weinberger, K. Q. Densely connected convolutional networks. Proc. - 30th IEEE Conf. Comput. Vis. Pattern Recognition, CVPR 2017 2017-Janua, 2261–2269 (2017).
19. Simonyan, K. & Zisserman, A. Very deep convolutional networks for large-scale image recognition. 3rd Int. Conf. Learn. Represent. ICLR 2015 - Conf. Track Proc. 1–14 (2015).
20. He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit. 2016-Decem, 770–778 (2016).
21. Smirnov, E. A., Timoshenko, D. M. & Andrianov, S. N. Comparison of Regularization Methods for ImageNet Classification with Deep Convolutional Neural Networks. AASRI Procedia 6, 89–94 (2014).
22. Zhang, Q. et al. A comparative study of attention mechanism based deep learning methods for bladder tumor segmentation. Int. J. Med. Inform. 171, 104984 (2023).
23. Hosseinzadeh Kassani, S., Hosseinzadeh Kassani, P., Wesolowski, M. J., Schneider, K. A. & Deters, R. Deep transfer learning based model for colorectal cancer histopathology segmentation: A comparative study of deep pre-trained models. Int. J. Med. Inform. 159, 104669 (2022).
24. Bayramoglu, N., Nieminen, M. T. & Saarakkala, S. Machine learning based texture analysis of patella from X-rays for detecting patellofemoral osteoarthritis. Int. J. Med. Inform. 157, 104627 (2022).
25. Albert, B. A. Deep Learning from Limited Training Data: Novel Segmentation and Ensemble Algorithms Applied to Automatic Melanoma Diagnosis. IEEE Access 8, 31254–31269 (2020).
26. Rojas Domínguez, A. & Nandi, A. K. Toward breast cancer diagnosis based on automated segmentation of masses in mammograms. Pattern Recognit. 42, 1138–1148 (2009).
27. Hu, X., Chu, L., Pei, J., Liu, W. & Bian, J. Model complexity of deep learning: a survey. Knowl. Inf. Syst. 63, 2585–2619 (2021).
28. Liu, H. et al. Development and evaluation of a live birth prediction model for evaluating human blastocysts from a retrospective study. Elife 12, (2023).
29. Xu, M., Yoon, S., Fuentes, A. & Park, D. S. A Comprehensive Survey of Image Augmentation Techniques for Deep Learning. Pattern Recognit. 137, 109347 (2023).
30. Guan, Q. et al. Medical image augmentation for lesion detection using a texture-constrained multichannel progressive GAN. Comput. Biol. Med. 145, 105444 (2022).
31. Singh, P. & Manure, A. Introduction to TensorFlow 2.0. Learn TensorFlow 2.0 1–24 (2020) doi:10.1007/978-1-4842-5558-2_1.
32. Huang, Y., Li, W., Macheret, F., Gabriel, R. A. & Ohno-Machado, L. A tutorial on calibration measurements and calibration models for clinical prediction models. J. Am. Med. Informatics Assoc. 27, 621–633 (2021).
33. Jiang, X., Osl, M., Kim, J. & Ohno-Machado, L. Calibrating predictive model estimates to support personalized medicine. J. Am. Med. Informatics Assoc. 19, 263–274 (2012).

Editorial decision: Major revision
19 Sep, 2023
Reviews received at journal
15 Sep, 2023
Reviewers agreed at journal
31 Aug, 2023
Reviewers invited by journal
16 Aug, 2023
Editor assigned by journal
11 Aug, 2023
Editor invited by journal
03 Aug, 2023
Submission checks completed at journal
03 Aug, 2023
First submitted to journal
26 Jul, 2023
