2.1 Ethical Statement
The study protocol was approved by the institutional ethics committee of the First Affiliated Hospital of Third Military Medical University (also called Army Medical University) on February 20, 2021 (approval No. KY2021060), and written informed consent was obtained from each patient. The clinical trial was registered with the Chinese Clinical Trial Registry (No. ChiCTR2100044138) on March 11, 2021. The principal investigator was Prof. Bin Yi.
2.2 Patient Recruitment and Image Collection
The patient recruitment and image collection phase was conducted at the First Affiliated Hospital of the Third Military Medical University in Chongqing, China, from March 18 to April 26, 2021. The inclusion criteria were: willingness to participate in the research and ability to adhere to the study protocol; need for arterial blood gas (ABG) analysis as part of routine clinical care; and a perioperative hemoglobin (Hb) variation exceeding 1.5 g/dL. The exclusion criteria were: refusal to participate; inability to cooperate due to mental health conditions; eye disease or prior radiotherapy to the eye or face; carbon monoxide or nitrite poisoning, jaundice, or any other condition affecting conjunctival color; or any other factor judged by the researchers to make a participant unsuitable for the study.
To facilitate patient enrollment, image capture, data collection, and image analysis, a standardized research methodology was established. The research team comprised eight members, each assigned specific roles: one for patient recruitment, two for capturing images, two for data collection and management, one for conjunctiva analysis, and two for quality assurance. Prior to the commencement of patient recruitment, all team members underwent training to familiarize themselves with the study's procedures, including the inclusion and exclusion criteria, techniques for conjunctiva exposure and image capture, and conjunctiva analysis standards.
On the day preceding surgery, eligible patients who consented to participate signed a written informed consent form. On the day of surgery, following ABG analysis, the designated team members proceeded to the operating room or the post-anesthetic care unit (PACU) to photograph the patients' right and left facial profiles, ensuring standard conjunctiva exposure under the typical lighting conditions of the operating room and PACU. The interval between the ABG analysis and the image capture did not exceed 10 minutes. All photographs were taken with the patients in a supine position, using the rear camera (20 megapixels, f/1.8 aperture) of the same smartphone under identical settings. Simultaneously, two other team members recorded patient identifiers, gender, Hb levels, age, and other pertinent information.
At the end of each day, the data collection team reviewed the images to identify patients with Hb variations greater than 1.5 g/dL, discarding all unselected images permanently. The quality control team oversaw the entire process, ensuring the integrity of patient recruitment, image quality, and data accuracy throughout the study.
2.3 Workflow and Experimental Methodology
In this study, we developed a smartphone-based system that estimates hemoglobin levels using deep learning. The system uses a smartphone application to capture eyelid images, which are then analyzed by a deep neural network trained on a dataset of Hb measurements obtained from invasive blood tests. The network uses features extracted from the eyelid images to predict Hb concentration. The system's workflow and the experimental setup are described below.
Fig.1 provides a schematic overview of the system's workflow and the study's experimental framework. The system comprises one algorithm dedicated to eyelid segmentation and another for predicting Hb concentration from the segmented eyelid regions (refer to Fig.1). Leveraging deep learning, we achieved rapid and reliable detection of Hb levels in patients undergoing surgery.
To compile the training datasets, we captured eyelid images from patients using several smartphone models, and applied data augmentation to improve the deep learning model's accuracy and robustness. The trained model was assessed on previously unseen data, with its accuracy verified on a set of 265 test samples. To further validate the model, we conducted a comparative analysis between two groups evaluating the same 265 test images: human experts, and the prediction model described in this paper. The medical professionals estimated the Hb concentration range from the patients' eye images, and the accuracy of their estimates was assessed. In parallel, the smartphone application segmented the eye images, predicted the Hb values, and reported the prediction error within the same specified range.
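The within-range comparison between experts and the model can be illustrated with a short sketch. This is hypothetical code, not the study's implementation; the tolerance and sample values are illustrative assumptions:

```python
def within_range_accuracy(predicted, reference, tolerance=1.5):
    """Fraction of Hb predictions whose absolute error against the
    ABG reference is within the given tolerance (g/dL)."""
    assert len(predicted) == len(reference)
    hits = sum(abs(p - r) <= tolerance for p, r in zip(predicted, reference))
    return hits / len(reference)

# Hypothetical example values (g/dL), not from the study:
preds = [11.2, 9.8, 13.5, 7.9]
refs = [10.5, 10.2, 12.0, 9.6]
accuracy = within_range_accuracy(preds, refs, tolerance=1.5)
```

The same function can score both the experts' range estimates (using their midpoints) and the model's point predictions, making the two groups directly comparable.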
This experimental design not only highlights the potential of mobile technology in medical diagnostics but also showcases the accuracy and efficiency of deep learning algorithms in predicting critical health markers such as hemoglobin levels.
2.4 Image Data Augmentation
To effectively train deep learning models, a substantial volume of training data is essential. Nonetheless, enlarging the training dataset poses a significant challenge. A practical approach to augment the volume of training data involves the reproduction of existing data. This process generates multiple images from a single source by randomly applying a combination of techniques illustrated in Fig.2, which includes: (A) color temperature adjustment, (B) contrast enhancement, (C) brightness alteration, (D) Gaussian blur application, (E) horizontal flipping, and (F) stochastic cropping and resizing. These augmentation techniques are selected to mirror the variety of conditions encountered when photos are captured using smartphones in real-world scenarios.
The rationale behind each technique is as follows:
- Techniques A, B, and C address the variability in color representation across different smartphone models, ensuring the model is not biased towards the color metrics of a specific device.
- Technique C also accounts for the diverse lighting conditions under which photos might be taken, ranging from dimly lit environments to brightly illuminated settings.
- Technique D introduces an element of blur to simulate photos taken out of focus, a common occurrence in hastily captured images.
- Techniques E and F are designed to mimic minor inaccuracies in framing and alignment that can occur during the photo capture process, ensuring the model can accurately process images despite slight imperfections.
By employing these data augmentation techniques, we not only increase the diversity of our training dataset but also enhance the robustness and generalizability of our deep learning model to accurately interpret images under a wide array of conditions typical of smartphone photography.
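As a minimal sketch, several of these augmentations (B, C, E, and F; color-temperature adjustment and Gaussian blur are omitted for brevity) can be expressed with plain NumPy. This is an illustrative reimplementation, not the study's actual pipeline, and the parameter ranges are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(img):
    """Apply a random combination of simple augmentations to an RGB
    image (H x W x 3, uint8), mirroring techniques B, C, E, and F."""
    out = img.astype(np.float32)
    # (C) brightness: random additive intensity shift
    out += rng.uniform(-30, 30)
    # (B) contrast: scale intensities around the mean
    mean = out.mean()
    out = (out - mean) * rng.uniform(0.8, 1.2) + mean
    # (E) horizontal flip with probability 0.5
    if rng.random() < 0.5:
        out = out[:, ::-1, :]
    # (F) random crop to 90% of each side (resizing back omitted)
    h, w = out.shape[:2]
    ch, cw = int(h * 0.9), int(w * 0.9)
    y = rng.integers(0, h - ch + 1)
    x = rng.integers(0, w - cw + 1)
    out = out[y:y + ch, x:x + cw]
    return np.clip(out, 0, 255).astype(np.uint8)
```

Applying `augment` several times to one source image yields multiple distinct training samples, which is the replication strategy described above.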
2.5 Model Optimization for Precise Eyelid Detection and Hemoglobin Concentration Prediction
The effectiveness of many AI-based diagnostic methods diminishes significantly under variations in lighting conditions, camera angles, and other external influences. To counteract these challenges and enhance algorithmic performance, this study employed two distinct algorithms: 1) an eyelid semantic segmentation algorithm, and 2) a prediction algorithm based on color intensity analysis. Model performance was assessed within a deep learning framework implemented directly in a smartphone application, as depicted in Fig.1. First, the Efficient Group Enhanced UNet (EGE-UNet) [18] was deployed to accurately identify the target eyelid regions; the success of the prediction algorithm proved closely tied to precise localization of these regions of interest. The deep learning network then performed the hemoglobin concentration predictions, among which the DHANet model emerged as the superior predictor owing to its accuracy.
In the concluding phase of model optimization, the DHANet model was selected as the definitive choice for hemoglobin concentration prediction, based on its demonstrated accuracy. This underscores the importance of both accurate eyelid segmentation and effective color intensity analysis in improving the performance of AI diagnostic tools, especially in the variable and unpredictable environment of smartphone applications.
2.6 Deep Learning Model Architecture
This study introduces a deep learning model structured in two pivotal stages: a Region of Interest (ROI) cropping stage and a decision-making stage. This division stems from the observation that in diagnostic imaging, as illustrated in Fig.1, the most informative content is localized within a small area, here the eyelid region. Making decisions directly from the original, full-sized image is inefficient because of the small ratio of relevant information to overall image size. The model draws inspiration from human diagnostic practice, where attention is typically narrowed to the informative area. By mimicking this approach, separating the precise cropping of the eyelid area (ROI cropping stage) from the diagnostic analysis of the cropped region (decision stage), we aim to enhance learning efficiency.
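The ROI cropping stage can be sketched as follows: given the binary mask produced by the segmentation network, crop the bounding box of the masked region before passing it to the decision stage. This is a minimal NumPy illustration under that assumption, not the authors' implementation:

```python
import numpy as np

def crop_roi(image, mask, margin=4):
    """Crop the region of interest from an image given a binary
    segmentation mask (1 = region pixel), with a small safety margin."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return image  # no region found: fall back to the full image
    h, w = mask.shape
    y0, y1 = max(ys.min() - margin, 0), min(ys.max() + margin + 1, h)
    x0, x1 = max(xs.min() - margin, 0), min(xs.max() + margin + 1, w)
    return image[y0:y1, x0:x1]
```

The cropped patch, rather than the full photograph, is what the decision-stage network sees, which keeps the ratio of informative to background pixels high.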
The EGE-UNet [18] model, an advanced iteration of the traditional U-Net [19] designed to address challenges in medical image segmentation, is deployed during the ROI cropping phase. It incorporates two novel modules: the Group multi-axis Hadamard Product Attention (GHPA) module and the Group Aggregation Bridge (GAB) module. The GHPA module extracts lesion information from multiple perspectives by grouping input features and applying Hadamard product attention operations along different axes, an idea inspired by the multi-head self-attention mechanism. The GAB module fuses semantic and detail features across scales, together with the masks generated by the decoder, through group aggregation, enabling efficient extraction of multi-scale information. EGE-UNet stands out for its segmentation accuracy, low parameter count, and computational simplicity, making it well suited to practical applications.
Fig.3 delineates the EGE-UNet design: a U-shaped layout with symmetrical encoder and decoder. The encoder comprises six stages with varying channel counts; the early stages use standard convolutions, while the later stages use GHPA for multi-perspective representation extraction. Each encoder-decoder junction incorporates a GAB module, improving on the simple skip connections of the original U-Net, and deep supervision provides mask predictions at multiple scales that feed the GAB inputs. These enhancements enable EGE-UNet to surpass previous methods in segmentation efficacy while maintaining a reduced parameter and computational footprint. For an in-depth discussion of the GHPA and GAB modules, refer to reference [18].
The decision stage of this paper introduces a hemoglobin concentration prediction model that adopts a regression-based approach, drawing inspiration from the lightweight Delta Age AdaIN (DAA) network for facial age estimation [20]. That method encodes age in binary form and feeds it into a transfer learning framework to capture continuous age-related feature information. The binary code mapping yields two groups of values corresponding to the means and standard deviations of the comparison ages, respectively; the age decoder computes the delta ages, and the mean over all comparison and delta ages is used for the age prediction. We adapt this methodology to the eyelid prediction stage, as depicted in Fig.4.
The architecture of the eyelid prediction system, as depicted in the diagram, leverages deep learning alongside binary encoding mapping technology, comprising four main components:
1. EyelidEncoder Module: This pivotal module transforms the eyelid image into a comprehensive feature vector, encapsulating essential characteristics of the eyelid. For this purpose, the C3AE network[21] is employed due to its efficiency and compactness, making it particularly suitable for deployment on mobile platforms.
2. Delta Hemoglobin AdaIN (DHA): The DHA component is instrumental in estimating hemoglobin concentrations by juxtaposing the current image against a repository of images representing a spectrum of hemoglobin levels. It facilitates hemoglobin concentration prediction by evaluating the feature discrepancies across images.
3. Binary Encoding Mapping Module: Given that hemoglobin concentration variation is a continuous and gradual phenomenon, an 8-bit binary code is utilized to encapsulate the range of hemoglobin concentrations. This method employs binary encoding to transform the continuous spectrum of hemoglobin levels into a discrete, yet seamless, binary representation, enhancing the model's efficiency and interpretability.
4. EyelidDecoder Module: Acting as the final step in the prediction pipeline, the EyelidDecoder module interprets the outputs from both the EyelidEncoder and the binary encoding mapping modules and, from this consolidated information, predicts the patient's hemoglobin concentration.
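The binary encoding idea in components 2 and 3 can be illustrated with a small sketch. The clinical range [4, 20] g/dL and the uniform 256-level quantization here are assumptions for illustration only; the study does not specify the exact mapping:

```python
def hb_to_binary(hb, lo=4.0, hi=20.0):
    """Quantize a hemoglobin value (g/dL) into 256 levels over an
    assumed clinical range [lo, hi] and return its 8-bit binary code."""
    hb = min(max(hb, lo), hi)
    level = round((hb - lo) / (hi - lo) * 255)
    return [int(b) for b in format(level, '08b')]

def binary_to_hb(bits, lo=4.0, hi=20.0):
    """Inverse mapping: decode an 8-bit code back to a hemoglobin value."""
    level = int(''.join(str(b) for b in bits), 2)
    return lo + level / 255 * (hi - lo)
```

With 8 bits, adjacent codes differ by about 0.06 g/dL under these assumed bounds, so the discretization is far finer than clinically meaningful Hb differences while keeping the representation compact.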
In refining the DHA prediction model, the performance of the EyelidEncoder module was optimized by evaluating the ResNet18 network against the original C3AE model. Additionally, a range of widely used mobile image processing architectures were analyzed for their applicability: MobileNet [22], MobileNetV2 [23], MobileNetV3 [24], ShuffleNetV2 [25], SqueezeNet [26], WideResNet [27], ResNet18_CBAM [28], PFLD [29], and BCNN. Notably, the PFLD model is recognized for its compact structure, suitable for age prediction, while BCNN is a simple five-layer convolutional network developed in-house. The efficacy of these models was assessed using Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), and the coefficient of determination (R²), to ensure a comprehensive evaluation of model performance.
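These are standard regression metrics; a brief sketch of how they are computed (an illustrative helper, not code from the study):

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return MAE, MAPE (in percent), and R-squared for a set of
    hemoglobin predictions against reference values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    mae = np.abs(err).mean()
    mape = (np.abs(err) / np.abs(y_true)).mean() * 100
    ss_res = (err ** 2).sum()
    ss_tot = ((y_true - y_true.mean()) ** 2).sum()
    r2 = 1 - ss_res / ss_tot
    return mae, mape, r2
```

MAE and MAPE measure absolute and relative error respectively (lower is better), while R² measures the fraction of variance in the reference Hb values explained by the predictions (closer to 1 is better).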
2.7 Experiments on the Server
The experiments were implemented in Python using the open-source PyTorch framework. The hardware was a Dawning workstation at the Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences, equipped with dual NVIDIA 3090 graphics cards (11 GB of memory each) and running a 64-bit Ubuntu 16.04 operating system.
2.8 Model Porting and Mobilization
A smartphone application for Android systems was developed to facilitate hemoglobin concentration estimation directly from eyelid images. As depicted in Fig.5, the mobile application is divided into two main sections: sampling detection and case management.
Sampling Detection Section:
- Photo-taking Functionality: Users can capture images using both the front and rear cameras of their device. The application features an interface with a target detection box to guide users in framing the eyelid within the photograph. Alternatively, users can select existing images from their photo album for analysis.
- Eye Area Image Display: Captured images of the eye area are displayed through the application's interface for review and further processing.
- Hemoglobin Concentration Recognition: The application employs the developed model to analyze the selected eye area images, determining hemoglobin concentration levels and highlighting specific regions associated with these levels through mask areas.
- Result Display Function: The detected eyelid area, mask area, and the calculated hemoglobin concentration values are presented to the user, enabling easy visualization and understanding of the results.
Case Management Section:
- Users have the capability to store detection outcomes and enter patient details to create a new case file. Future sampling for the same patient can be added and linked to the existing case, allowing for monitoring of hemoglobin level changes over time.
Porting the models to the mobile platform involves several technical steps. First, the segmentation and prediction models are converted to the ONNX format for broader compatibility. Model inference is then invoked via OpenCV, with the inference code written in C++. This inference logic is packaged through NDK cross-compilation into an SDK exposing a standard C interface. Finally, the application interface and related logic are developed in Android Studio, with the SDK performing predictions through that C interface. This approach integrates the deep learning models into a user-friendly mobile application, enhancing accessibility and utility for end users.