Convolutional Neural Networks (CNNs) are deep learning models widely used for image classification and other computer vision tasks. Their hierarchical structure of interconnected layers enables automatic learning and extraction of relevant features from input images. Convolutional layers apply learnable filters (kernels) to the input, extracting local features such as edges, textures, and shapes. Pooling layers downsample the input volume, reducing the computational cost of subsequent layers through operations such as max pooling, which keeps the maximum value within a defined window. Activation functions such as ReLU (Rectified Linear Unit) introduce non-linearity, allowing the network to learn complex patterns. At the end of the network, fully connected layers, in which every neuron is connected to every neuron in the previous layer, classify the learned features.
A variant called B-R-CNN combines R-CNN with AdaBoost, a boosting method that improves the accuracy of weak classifiers. B-R-CNN trains multiple weak R-CNN classifiers on different subsets of the training data and then uses AdaBoost to combine them into a stronger classifier, which predicts the class of objects in the image at test time. B-R-CNN has been shown to outperform the original R-CNN in both speed and accuracy, although training several weak classifiers demands more computational resources during the training phase.
Semantic segmentation is a critical computer vision task that assigns each pixel in an image to one of several predefined classes. Fully Convolutional Networks (FCNs) have emerged as a powerful architecture for semantic segmentation because they preserve spatial information while efficiently capturing features. A typical FCN comprises an encoder that captures hierarchical features and a decoder that recovers spatial resolution. Inference proceeds as follows:
1. Initialize the FCN with pre-trained or random weights.
2. Forward pass the input image I through the FCN to obtain output logits. The encoder processes the image through convolutional and pooling layers, producing high-level feature maps at several levels of abstraction.
3. Apply a softmax activation to the logits to obtain per-pixel class probabilities. The softmax normalizes the logits into values interpretable as the probability of each pixel belonging to each class.
4. Assign each pixel to the class with the highest probability to obtain the segmentation mask, which highlights the regions corresponding to distinct objects or classes in the input image.
When training an FCN for semantic segmentation, the choice of loss function is crucial. Typically, the cross-entropy loss is adopted; it quantifies the disparity between the predicted class probabilities and the ground-truth label of each pixel. Through backpropagation and gradient descent, the FCN's parameters (weights) are iteratively adjusted to reduce this loss, fine-tuning the network toward more accurate and effective segmentation.
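Steps 3 and 4 of the inference pipeline above can be sketched in a few lines. This is a minimal NumPy illustration, with toy logits standing in for the output of a real FCN forward pass:

```python
import numpy as np

def softmax(logits, axis=0):
    # Numerically stable softmax over the class axis.
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def segmentation_mask(logits):
    """Turn FCN output logits of shape (C, H, W) into an (H, W) mask:
    per-pixel class probabilities, then the argmax class per pixel."""
    probs = softmax(logits, axis=0)
    return probs.argmax(axis=0), probs

# Toy logits for 3 classes on a 2x2 image (stands in for an FCN forward pass).
logits = np.array([[[2.0, 0.1], [0.2, 0.3]],
                   [[0.5, 3.0], [0.1, 0.2]],
                   [[0.1, 0.2], [4.0, 0.1]]])
mask, probs = segmentation_mask(logits)   # mask assigns each pixel its top class
```

Each column of `probs` sums to 1 across classes, and `mask` holds the class index with the highest probability at every pixel.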
3.1 Data Augmentation
Several complementary techniques improve FCN-based segmentation:
- Data augmentation: transforming the training data with flips, rotations, and scaling improves model generalization and robustness.
- Post-processing: techniques such as conditional random fields can refine segmentation results by considering spatial relationships between pixels.
- Transfer learning: pre-training on large datasets such as ImageNet, followed by fine-tuning on segmentation-specific data, accelerates convergence and improves performance.
Fully Convolutional Networks have revolutionized semantic segmentation in computer vision. By capitalizing on their ability to capture spatial information and hierarchical features, FCNs have become a cornerstone of numerous applications, including autonomous driving, medical imaging, and remote sensing. The combination of convolutional operations, pooling, upsampling, and softmax activation, optimized through backpropagation, enables FCNs to achieve accurate and robust semantic segmentation. As computer vision advances, FCNs remain a fundamental tool for segmenting objects and regions within images, enhancing our understanding of visual data and supporting new research and applications.
Convolution operation: Z_i = Sum_k(W_k * X_i) + b_i ----------(1)
Pooling operation: Y_i = Max(X_i) ----------(2)
Upsampling operation: Y_i_upsampled = Upsample(Y_i) ----------(3)
Softmax activation function: SoftMax(Z_i) = exp(Z_i) / Sum_j(exp(Z_j)) ----------(4)
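Eqs. (1) and (2) can be illustrated with a minimal NumPy sketch. As in most deep learning frameworks, the "convolution" is implemented as cross-correlation, and the kernel and input values below are toy examples, not from the paper:

```python
import numpy as np

def conv2d(x, w, b=0.0):
    """Valid 2D convolution (Eq. 1): slide kernel w over x, sum the
    element-wise products, and add bias b. Shapes: x (H, W), w (kh, kw)."""
    kh, kw = w.shape
    H, W = x.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(w * x[i:i + kh, j:j + kw]) + b
    return out

def max_pool(x, k=2):
    """Max pooling (Eq. 2): keep the maximum within each k x k window."""
    H, W = x.shape
    return x[:H - H % k, :W - W % k].reshape(H // k, k, W // k, k).max(axis=(1, 3))

x = np.arange(16.0).reshape(4, 4)
edge = np.array([[1.0, -1.0]])   # toy horizontal-difference kernel
feat = conv2d(x, edge)           # (4, 3) feature map of local differences
pooled = max_pool(x)             # (2, 2) downsampled map
```

Stacking such operations, with a non-linearity between them, is exactly the encoder structure described above.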
Data augmentation is a powerful technique in machine learning and computer vision for increasing the diversity and quality of training data. By applying various transformations to the original images, it improves the robustness and generalization of models. A key aspect is handling noise, which is often present in real-world data; random noise can be injected into images during augmentation, as in Eq. (5), where I(x, y) is the pixel value at coordinates (x, y) in the original image and N(0, sigma^2) denotes Gaussian noise with mean 0 and variance sigma^2.
Dropout is a regularization technique that prevents over-reliance on specific neurons during training. It randomly deactivates neurons with a certain probability, encouraging the network to learn more robust and diverse features, as in Eq. (6). Here, Input(x) is the input to the dropout layer, Output(x) is the output after applying dropout, and Mask(x) is a binary mask whose elements are drawn from a Bernoulli distribution with the probability of retaining a neuron.
Weight decay, also known as L2 regularization, controls the magnitude of the weights in a neural network. It adds a penalty term to the loss function that discourages excessively large weight values, as in Eq. (7). In this formula, L_regularized(W) is the regularized loss, L(W) is the original loss (e.g., cross-entropy), lambda is the regularization parameter, and ||W_i|| is the L2 norm of weight W_i.
By combining data augmentation with noise and regularization techniques such as dropout and weight decay (Eqs. (5)-(7)), CNN models achieve greater resilience to noise in the training data. This robustness improves performance and generalization even on noisy inputs, equipping the models to handle real-world challenges and deliver reliable results across various applications.
I_augmented(x, y) = I(x, y) + N(0, sigma^2) ----------(5)
Output(x) = Input(x) * Mask(x) ----------(6)
L_regularized(W) = L(W) + (lambda / 2) * Sum_i(||W_i||^2) ----------(7)
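Eqs. (5)-(7) can be sketched directly in NumPy. The clipping to [0, 1] and the 1/keep_prob scaling (inverted dropout) are common implementation details assumed here rather than stated in the text:

```python
import numpy as np

rng = np.random.default_rng(0)

def add_gaussian_noise(image, sigma):
    """Eq. (5): I_augmented = I + N(0, sigma^2), clipped to the valid range."""
    noise = rng.normal(0.0, sigma, size=image.shape)
    return np.clip(image + noise, 0.0, 1.0)

def dropout(x, keep_prob):
    """Eq. (6): multiply by a Bernoulli mask; the 1/keep_prob scaling
    (inverted dropout) keeps the expected activation unchanged."""
    mask = rng.random(x.shape) < keep_prob
    return x * mask / keep_prob

def l2_penalty(weights, lam):
    """Eq. (7) penalty term: (lambda / 2) * sum of squared L2 norms."""
    return 0.5 * lam * sum(np.sum(w ** 2) for w in weights)

img = rng.random((8, 8))
noisy = add_gaussian_noise(img, sigma=0.1)   # augmented training image
```

At training time the augmented image replaces the original, the dropout mask is resampled per batch, and the penalty is added to the task loss.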
Table 1
Different Features of Train Data
|           | id                       | class       | segmentation                  |
|-----------|--------------------------|-------------|-------------------------------|
| count     | 115488                   | 115488      | 33913                         |
| unique    | 38496                    | 3           | 33899                         |
| top       | case_123day20_slice_0001 | Large bowel | 12629, 10 12894, 12 13158, 15 |
| frequency | 3                        | 38496       | 2                             |
3.2 Optimization in Blur Techniques
Gaussian blur is a fundamental image processing technique used to reduce image noise and detail, producing a smoother appearance. It is achieved by convolving the original image with a Gaussian kernel, a two-dimensional distribution centered on the origin, given in Eq. (8). The kernel emphasizes nearby pixels while gradually diminishing the influence of distant ones. The kernel size, determined by the parameter k, controls the extent of blurring, while the standard deviation sigma regulates the spread of the Gaussian distribution. Gaussian blur is particularly useful for preparing images for tasks such as edge detection or object recognition.
Motion blur occurs when capturing objects in motion, or when the camera moves during exposure. This effect can be simulated or corrected using motion blur techniques. Convolving the original image with the motion blur kernel of Eq. (9) replicates the effect of objects moving across the field of view; the kernel's size and orientation, controlled by the parameters k and theta, determine the strength and direction of the blur. This technique is essential for applications such as simulating realistic motion effects and restoring images degraded by motion blur.
Deconvolution is a sophisticated technique for recovering the original image from a blurred or distorted version. In image processing it is particularly useful for restoring images affected by various types of blurring, such as motion blur or Gaussian blur. Deconvolution algorithms estimate the original image by accounting for the effects of the blur kernel and noise: applying Wiener deconvolution yields the deblurred image in the frequency domain, as in Eq. (10), and an inverse Fourier transform then produces the deblurred image in the spatial domain. The regularization parameter lambda controls the trade-off between preserving image detail and amplifying noise during deconvolution. Effective deconvolution is crucial for enhancing image quality in tasks such as astronomical imaging, medical imaging, and forensic analysis.
The techniques above (Gaussian blur, motion blur, and deconvolution) can significantly affect image processing outcomes. An essential aspect is optimizing their parameters and regularization values: experimenting with different kernel sizes, standard deviations, motion angles, and regularization parameters to find the best balance between removing unwanted artifacts and preserving image features. For CNN models, integrating these techniques improves performance on blurred images. By training CNNs on datasets containing both blurred and sharp images, the models learn to distinguish and enhance relevant features while suppressing noise and blur. The models can then be fine-tuned, for example via transfer learning, to specific tasks such as image restoration or object detection with better accuracy and robustness.
GK(x, y) = (1 / (2 * pi * sigma^2)) * exp(-(x^2 + y^2) / (2 * sigma^2)) ----------(8)
MBK = (1 / k) * [cos(theta) cos(theta) ... cos(theta);
                 sin(theta) sin(theta) ... sin(theta);
                 cos(theta) cos(theta) ... cos(theta)] ----------(9)
Deblurred image (frequency domain): (1 / (FFT(K) + lambda)) * FFT(I_blurred) ----------(10)
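Eqs. (8) and (10) can be sketched as follows. Note that Eq. (10) as stated is a simplified regularized inverse filter; a full Wiener filter would instead use conj(FFT(K)) / (|FFT(K)|^2 + lambda). The kernel size and sigma below are illustrative choices:

```python
import numpy as np

def gaussian_kernel(k, sigma):
    """Eq. (8): a (2k+1) x (2k+1) Gaussian kernel, normalized to sum to 1."""
    ax = np.arange(-k, k + 1)
    xx, yy = np.meshgrid(ax, ax)
    g = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    return g / g.sum()

def deconvolve(blurred, kernel, lam):
    """Eq. (10): regularized inverse filtering in the frequency domain,
    followed by an inverse FFT back to the spatial domain."""
    K = np.fft.fft2(kernel, s=blurred.shape)   # zero-padded kernel spectrum
    F = np.fft.fft2(blurred)
    return np.real(np.fft.ifft2(F / (K + lam)))

g = gaussian_kernel(1, 1.0)   # 3x3 kernel: sums to 1, peaks at the center
```

Larger lambda suppresses noise amplification where FFT(K) is small, at the cost of some residual blur, which is exactly the trade-off discussed above.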
Table 2
Parameters of Keras Layers with Optimizer
| Keras Layers   | Input neurons | Activation    |
|----------------|---------------|---------------|
| Conv2D         | 32, (3, 3)    | ReLU          |
| Max Pooling    | (2, 2)        | ReLU          |
| Conv2D         | 64, (3, 3)    | ReLU          |
| Dense          | 10            | Softmax       |
| Optimizer      | 64, (3, 3)    | Adam          |
| Loss function  | 32, (3, 3)    | Cross entropy |
| Trained images | 70%           | ReLU, Softmax |
| Epochs         | 500           | -             |
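The convolution and pooling layers of Table 2 shrink the spatial dimensions in a predictable way. The sketch below traces output shapes through the Conv2D/MaxPooling/Conv2D stack, assuming 'valid' padding, stride 1, and a hypothetical 64x64 input (the table does not state the input resolution):

```python
def conv2d_out(h, w, kernel, stride=1):
    """Output spatial size of a 'valid' Conv2D layer."""
    return (h - kernel) // stride + 1, (w - kernel) // stride + 1

def pool_out(h, w, pool=2):
    """Output spatial size of max pooling with non-overlapping windows."""
    return h // pool, w // pool

# Trace the Table 2 stack on a hypothetical 64x64 grayscale input.
h, w = 64, 64
h, w = conv2d_out(h, w, 3)   # Conv2D, 32 filters of (3, 3) -> 62x62x32
h, w = pool_out(h, w, 2)     # Max Pooling (2, 2)           -> 31x31x32
h, w = conv2d_out(h, w, 3)   # Conv2D, 64 filters of (3, 3) -> 29x29x64
# Flattening and Dense(10) with softmax then yields the 10 class scores.
```

The same arithmetic applies at any input size, which is useful for checking that the Dense layer's flattened input matches expectations before training.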
3.3 R-CNN Model
The boosted R-CNN model works as follows. Set up a collection of T weak R-CNN models with random initialization, each of which accepts an image as input and produces a list of object proposals together with associated class probabilities. Initialize each weak R-CNN model's weight to 1/N, where N is the total number of weak classifiers. For each iteration t in the set T, normalize the weights w_i of the weak R-CNN models to sum to 1, then train each weak R-CNN model on a random subset of the training data using the current weights w_i and the base R-CNN architecture.
For each weak R-CNN model:
- Calculate the loss function L_i to assess its performance on the training dataset.
- Compute the error rate E_i as the sum of the L_i weighted by w_i.
- Pick the weak R-CNN model with the lowest error rate E_t as the best weak classifier.
- Compute its classifier weight alpha_t = (1/2) * ln((1 - E_t) / E_t).
- Update the weights w_i of all weak R-CNN models as shown in Eq. (11),
where x_i is an image, y_i is its label, h_t(x_i) is the prediction made by the best weak classifier h_t on image x_i, and y_i * h_t(x_i) is +1 if the prediction is correct and -1 otherwise. The weights w_i are then normalized to sum to 1. To create the strong R-CNN classifier H, all of the weak R-CNN models are combined with their respective weights, as shown in Eq. (12). The average precision (AP) on the test dataset is used to assess how well the strong classifier H performs. The approach improves the weak R-CNN models iteratively by training them on different subsets of the training data and adjusting their weights according to individual performance; the AdaBoost technique increases the weight of the better-performing models while decreasing that of the weaker ones, and their weighted combination yields a strong R-CNN classifier that handles challenging object detection tasks better.
w_i = w_i * exp(-alpha_t * y_i * h_t(x_i)) ----------(11)
H(x) = sign(Sum_t(alpha_t * h_t(x))) ----------(12)
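The boosting loop can be sketched in NumPy. The text describes weights over the weak models, but the form exp(-alpha_t * y_i * h_t(x_i)) in Eq. (11) matches the standard AdaBoost reading in which w_i are per-sample weights; the sketch below follows that standard formulation, with toy +/-1 predictions standing in for R-CNN outputs:

```python
import numpy as np

def adaboost_round(preds, labels, w):
    """One boosting round over candidate weak classifiers.
    preds: (M, N) +/-1 predictions of M weak models on N samples;
    labels: (N,) +/-1 ground truth; w: (N,) sample weights summing to 1."""
    errors = np.array([np.sum(w * (p != labels)) for p in preds])
    best = int(errors.argmin())                # lowest weighted error E_t
    eps = errors[best]
    alpha = 0.5 * np.log((1 - eps) / eps)      # classifier weight alpha_t
    w = w * np.exp(-alpha * labels * preds[best])   # Eq. (11): re-weighting
    return best, alpha, w / w.sum()            # normalize weights to sum to 1

def strong_classify(votes, alphas):
    """Eq. (12): sign of the alpha-weighted vote of the selected weak models."""
    return np.sign(np.sum(alphas * votes))

# Example: two candidate weak models voting on four samples.
preds = np.array([[1, 1, -1, 1],
                  [1, -1, -1, -1]])
labels = np.array([1, 1, -1, -1])
best, alpha, w = adaboost_round(preds, labels, np.full(4, 0.25))
```

Misclassified samples gain weight after each round, so later weak models focus on the hard cases, which is the mechanism the text credits for B-R-CNN's improved accuracy.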
Table 3
Image Segmentation Values on GI Tract Images
| Column       | Non-null (%) |
|--------------|--------------|
| segmentation | 29.364956    |
| size         | 33.333333    |
| id           | 100.000000   |
| class        | 100.000000   |
| case         | 100.000000   |
| day          | 100.000000   |
| slice        | 100.000000   |
Dtype: float64
3.4 VGG Model
The VGG architecture was presented by Karen Simonyan and Andrew Zisserman of Oxford's Visual Geometry Group (VGG) and was developed for the 2014 ImageNet Challenge, bringing about a significant transformation in the field of computer vision. It differed from previous successful models such as AlexNet in several ways: while AlexNet used a large 11x11 receptive field with a 4-pixel stride, VGG employed the smallest practical 3x3 receptive fields with a 1-pixel stride, achieving a large effective receptive field by stacking these small filters. The VGG network is known for its simplicity, using small convolutional filters throughout the architecture. VGG16, for instance, consists of 13 convolutional layers and 3 fully connected layers, making it a 16-layer deep neural network. With 138 million parameters in total, it is relatively large by contemporary standards; despite its size, its key advantage lies in its simplicity, encompassing the fundamental characteristics of convolutional neural networks. It has become a foundational model in computer vision and has been widely used and adapted across applications. VGG19 shares the same basic design but has 19 weight layers (convolutional and fully connected), i.e., three more convolutional layers than VGG16.
VGGNet accepts images at 224x224 resolution. To keep the input size consistent during the ImageNet competition, the model's developers cropped the central 224x224 patch from each submitted image. The VGG convolutional layers play a key role in the network's overall architecture and success.
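As a concrete check on the figures above (13 convolutional layers, roughly 138 million parameters), the standard VGG16 configuration can be tallied in plain Python; the channel list follows the published architecture:

```python
# VGG16 configuration: numbers are conv output channels, 'M' is 2x2 max pooling.
VGG16_CONV = [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'M',
              512, 512, 512, 'M', 512, 512, 512, 'M']

def vgg16_param_count(num_classes=1000):
    """Count trainable parameters: the 3x3 convolutions plus three
    fully connected layers (4096, 4096, num_classes)."""
    params, in_ch = 0, 3
    for v in VGG16_CONV:
        if v == 'M':
            continue                      # pooling layers have no parameters
        params += in_ch * v * 3 * 3 + v   # 3x3 kernel weights + biases
        in_ch = v
    # After five 2x2 poolings, a 224x224 input shrinks to 7x7x512 = 25088.
    for fan_in, fan_out in [(512 * 7 * 7, 4096), (4096, 4096), (4096, num_classes)]:
        params += fan_in * fan_out + fan_out
    return params

n_conv = sum(1 for v in VGG16_CONV if v != 'M')   # 13 convolutional layers
total = vgg16_param_count()                        # 138,357,544 parameters
```

Most of the parameters sit in the first fully connected layer, which is why later architectures replaced it with global pooling.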
3.5 Conditional Invertible Neural Networks
A conditional invertible neural network (cINN) is an advanced neural network architecture that builds on the concept of invertible neural networks. An invertible neural network is designed so that both the forward and backward transformations are invertible: data can be mapped from its original form to a learned representation and back again without any loss of information. A cINN incorporates an additional conditioning input into the architecture, which serves as extra information guiding the transformation. In image generation, for example, the conditioning input could specify attributes of the image to be generated, such as the pose of an object or the style of a painting; the network can then generate images with those attributes while maintaining invertibility, so the generated images can be accurately mapped back to their original attributes. Applications of cINNs range from generative modeling and data augmentation to style transfer and controlled transformation tasks. In the gastrointestinal imaging context, intestinal gas (flatulence) refers to the presence of gas in the digestive system, produced by factors including swallowed air, the breakdown of certain foods by gut bacteria, and fermentation in the intestines; while some gas is a normal part of digestion, excessive gas can lead to discomfort and bloating.
Algorithm for computing conditioned reconstruction statistics using invertible neural networks: given a noisy measurement y_delta, an invertible neural network F, and a conditioning network C, the algorithm calculates the mean and variance of the conditioned reconstructions from random samples. Start by computing the filtered back-projection T_FBP(y_delta) and denote it c0. Apply the conditioning network C with parameters Theta to c0, yielding the conditioned outputs c. For each iteration k from 1 to K: (a) draw a random sample z[k] from the normal distribution N(0, I); (b) use the inverse of the invertible network F to compute the reconstruction xˆ[k] from z[k] and the conditioned output c. After the loop, compute the mean reconstruction xˆ by averaging the xˆ[k], and the variance by averaging the squared differences between each xˆ[k] and the mean xˆ, as represented in Eq. (13).
xˆ = (1 / K) * Sum_k(xˆ[k])
σˆ^2 = (1 / K) * Sum_k((xˆ[k] - xˆ)^2) ----------(13)
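The sampling loop and Eq. (13) can be sketched as follows, with a toy invertible map standing in for the trained inverse network and its conditioning:

```python
import numpy as np

rng = np.random.default_rng(1)

def conditioned_statistics(f_inverse, c, dim, K=1000):
    """Eq. (13): draw K latent samples z ~ N(0, I), push each through the
    inverse network conditioned on c, and return the element-wise mean and
    variance of the resulting reconstructions."""
    recs = np.stack([f_inverse(rng.standard_normal(dim), c) for _ in range(K)])
    mean = recs.mean(axis=0)
    var = ((recs - mean) ** 2).mean(axis=0)
    return mean, var

# Toy invertible map standing in for the trained network's inverse.
f_inv = lambda z, c: 0.5 * z + c
mean, var = conditioned_statistics(f_inv, c=np.array([2.0, -1.0]), dim=2, K=20000)
```

The variance map is the useful by-product here: it gives a per-pixel uncertainty estimate alongside the mean reconstruction.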
Table 4
Segmentation Analysis according to Class Label
|   | id         | class       | segmentation |
|---|------------|-------------|--------------|
| 0 | Slice_0001 | large_bowel | NaN          |
| 1 | Slice_0001 | small_bowel | 29.36        |
| 2 | Slice_0001 | stomach     | 33.33        |
| 3 | Slice_0002 | large_bowel | 32.4         |
| 4 | Slice_0002 | small_bowel | 29.36        |
| 5 | Slice_0143 | small_bowel | NaN          |
| 6 | Slice_0143 | stomach     | 32.65        |
| 7 | Slice_0144 | large_bowel | 31.35        |
| 8 | Slice_0144 | small_bowel | 25.67        |
| 9 | Slice_0144 | stomach     | 33.45        |