The proposed method involves designing and building a real-time catheter tracking system compatible with the commercial Meta Quest 3 XR headset. This system will be suitable for use with a variety of commercially available catheters (Fig. 1-a). The hardware component utilizes a 3D-printed frame to secure the catheter within a region of interest (the boundary of the heart), simultaneously recorded by two orthogonally positioned cameras (Fig. 1-b). A custom-designed computer vision (CV) algorithm (Fig. 1-c) infers the catheter's shape and orientation from the biplane views (Fig. 1-d, e). These data are used to reconstruct the catheter's 3D shape (Fig. 1-f), which is transmitted in real time to the XR headset. The headset leverages the Unity game engine to provide a stand-alone rendering environment (Fig. 1-i). This system will facilitate a specific set of tasks related to precise catheter positioning. The reconstructed catheter data is co-registered and visualized within a patient-specific anatomical heart model (Fig. 1-h) generated from a cardiac computed tomography (CT) scan acquired at end-diastole in DICOM format (Fig. 1-g). This integration occurs within the Meta Quest 3 (Fig. 1-j). Users can physically manipulate a real commercial catheter (identical to those used in a cath lab) and maneuver it in real time within the 3D patient heart model, observing the process via the Meta Quest 3 XR headset. The following sections provide a detailed technical description of each development stage.
2.1. Design and Fabrication of the 3D Printed Setup
To facilitate tracking of the catheter and establish a physical platform for controlled catheter manipulation, we designed a cubic 3D model using Dassault Systèmes SolidWorks 2022 and fabricated it using a 3D printer (Fig. 2). As illustrated in Fig. 2, several components were integrated into the 3D setup to enable accurate computer vision tracking of the catheter. A removable inlet with a 4 mm diameter, centered at (0,0,0) in our 3D Cartesian coordinate system, is included for catheter insertion and articulation into the central open space within the cube. This inlet can be easily exchanged to accommodate catheters of varying diameters. The setup contains mounts designed to securely hold two cameras at specific locations along two sides of the cube, forming an orthogonal biplane imaging system. From these fixed positions, the two cameras capture video of the catheter movement through the central tracking region of interest (ROI).
To specify the ROI, eight fiducial markers (mTi and mFi for i = 1, 2, 3, and 4, where T and F denote the top and front planes, respectively) were 3D printed at specific locations within the model on two cross-shaped pillars, ensuring that the ROI can accommodate the size of a human heart. The fiducial markers lie in two orthogonal planes, allowing the transformation of each camera's perspective to a global coordinate system. This coordinate transformation is essential for reconstructing the 3D catheter trajectory within the 3D model workspace, ensuring that the heart is centered in the ROI and that the inlets of the cube and the heart are aligned. The cube faces are printed in white to provide a high-contrast background against the catheter when viewed by the cameras. This design choice enables more accurate isolation and segmentation of the catheter from the background by the computer vision tracking algorithm. The setup, including all sections and backgrounds, is 3D printed from rigid materials (VeroClear, with VeroUltraWhite for backgrounds) on a Stratasys J826 PolyJet printer.
2.2. Computer Vision-Based Catheter Segmentation and 3D Trajectory Extraction
The main component enabling real-time tracking of the catheter position is a computer vision segmentation algorithm that first identifies and segments the catheter and then reconstructs the full 3D trajectory from synchronized orthogonal camera footage captured from the workspace. The processing pipeline operates on a per-frame basis, taking in video streams from two cameras positioned orthogonally around the 3D cube model. The cameras are intrinsically calibrated and have known, fixed poses relative to the prespecified coordinate space. The proposed vision algorithm has nine steps, as illustrated in Fig. 3.
Each frame of the real-time video is denoted by I(r, c, t), where r, c, and t represent the pixel row, pixel column, and time dimensions, respectively. To start, every frame undergoes a perspective transformation (Eq. 1) that maps the front and top camera views into the same global coordinate space, based on the known fiducial points (denoted by mT and mF). Vectors mT and mF, each consisting of eight coordinate values (mTi and mFi for i = 1, 2, 3, and 4 as depicted in Fig. 2), specify the four corners of the ROI for the top and front planes, respectively. These points are marked on the two printed crosses within the setup (Fig. 2). The perspective transformation enables consistent image processing in the unified coordinate space and cancels out perspective distortion. As expressed in Eq. 1, the transformed coordinates (x', y') are obtained from the local camera-based coordinates (x, y) up to the homogeneous scale factor w = c1x + c2y + 1. The transformation matrix combines rotation and scaling (a1 to a4), translation (b1 and b2), and projection (c1 and c2); the projection terms make this a full projective (perspective) transformation rather than a purely affine one. The parameters of the matrix are initially unknown; to obtain them, a system of eight equations must be solved. To do so, the four fiducial points are first located in the input image and then mapped to predetermined locations based on the known dimensions and position of the ROI. Because each point correspondence contributes two equations, the four correspondences yield a system of eight equations in eight unknowns, from which the perspective transformation matrix is computed. After acquisition of the transformation matrix, all input frames undergo the perspective transformation, yielding coordinate-normalized images. Because the cameras are fixed, the matrix parameters are estimated from the initial frame of each video and applied unchanged to all subsequent frames in real time.
$$w\left[\begin{array}{c}x'\\ y'\\ 1\end{array}\right]=\left[\begin{array}{ccc}{a}_{1}&{a}_{2}&{b}_{1}\\ {a}_{3}&{a}_{4}&{b}_{2}\\ {c}_{1}&{c}_{2}&1\end{array}\right]\left[\begin{array}{c}x\\ y\\ 1\end{array}\right] \quad (1)$$
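For illustration, the following minimal Python/OpenCV sketch shows how the matrix in Eq. 1 can be estimated from the four fiducial correspondences and applied to incoming frames. The fiducial pixel coordinates and the 512 x 512 workspace size are placeholder values, not the calibrated values of the actual setup.

```python
import cv2
import numpy as np

# Pixel coordinates of the four fiducial markers detected in the first
# frame of one camera (illustrative values; in practice these come from
# detecting the markers on the printed crosses).
src_pts = np.float32([[112, 95], [538, 90], [545, 512], [105, 518]])

# Target corners of the ROI in the unified coordinate space, derived from
# the known physical dimensions of the setup (here a 512 x 512 workspace).
dst_pts = np.float32([[0, 0], [512, 0], [512, 512], [0, 512]])

# Solves the 8-equation / 8-unknown system of Eq. 1 for the 3x3
# perspective transformation matrix.
M = cv2.getPerspectiveTransform(src_pts, dst_pts)

def normalize_frame(frame):
    """Warp an incoming camera frame into the unified coordinate space."""
    return cv2.warpPerspective(frame, M, (512, 512))
```

Because the cameras are rigidly mounted, `M` is computed once from the initial frame and reused for every subsequent frame.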
With the frames aligned, preprocessing steps are applied, including Gaussian smoothing to reduce background noise and enable robust detection; a larger Gaussian kernel increases smoothness but can also degrade localization precision. Next, contrast and brightness adjustments enhance the visibility of the catheter against the white background. To isolate the catheter, an adaptive thresholding operation converts the grayscale frame into a binary image based on dynamic local thresholding. The algorithm computes an individualized threshold value, denoted T(x, y), for each pixel (x, y) by averaging the pixel values within a localized neighborhood window centered on that pixel. In other words, the adaptive threshold is computed as:
$$T\left(x,y\right)=\text{mean}\left(\text{neighborhood}\right)-C \quad (2)$$
where mean(neighborhood) is the mean of the pixel values within the neighborhood window. The constant C is then subtracted from this mean to bias the threshold. C is a customizable constant that can be positive, negative, or zero, but is typically positive: a positive C raises the threshold, producing a darker binary image; a negative C lowers it, producing a brighter one; and zero uses the local mean directly. The optimal value of C depends on the specific image and desired outcome, and is generally chosen through experimentation to achieve the best segmentation or object detection results. Each pixel's final value, represented by g(x, y), is then obtained by masking the input frame with the computed threshold as in:
$$g\left(x,y\right)=\left\{\begin{array}{ll}1,& PV(x,y)>T(x,y)\\ 0,& PV(x,y)\le T(x,y)\end{array}\right. \quad (3)$$
where PV(x, y) denotes the pixel value in the input image. Each pixel is thus classified against a locally determined threshold rather than a single global value for the entire image. The method is configured through two adjustable parameters: the neighborhood block size and the constant C. Larger block sizes mitigate the impact of noise artifacts but, as with the Gaussian kernel, reduce localization precision.
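A minimal sketch of this preprocessing and adaptive thresholding chain using OpenCV is shown below; the kernel size, contrast gain, block size, and C values are illustrative starting points rather than the tuned parameters of the actual system.

```python
import cv2

def segment_catheter(frame, block_size=31, c=10):
    """Binary segmentation of the catheter against the white background.

    block_size and c correspond to the neighborhood window and the
    constant C of Eq. 2; the values here are illustrative defaults.
    """
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Gaussian smoothing suppresses background noise; a larger kernel
    # smooths more but degrades localization precision.
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)
    # Contrast/brightness adjustment to enhance the dark catheter
    # against the white background (alpha/beta are illustrative).
    adjusted = cv2.convertScaleAbs(blurred, alpha=1.5, beta=-20)
    # ADAPTIVE_THRESH_MEAN_C implements Eq. 2 (local mean minus C);
    # THRESH_BINARY_INV inverts Eq. 3 so the dark catheter, rather
    # than the bright background, maps to foreground.
    binary = cv2.adaptiveThreshold(
        adjusted, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
        cv2.THRESH_BINARY_INV, block_size, c)
    return binary
```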
In the subsequent stage of the proposed vision algorithm, a morphological skeletonization operation thins the binary representation of the catheter into a central, one-pixel-wide curve that characterizes its medial-axis trajectory. This transformation yields a concise representation that streamlines the subsequent tracking process. Several skeletonization methods are available, such as Zhang's method 21 and Lee's method 22. Here, Zhang's method is implemented, which performs a series of sequential passes across the image, systematically removing pixels on the periphery of the object. This iterative process continues until no further pixels can be removed.
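Zhang's thinning is available off the shelf; the sketch below uses scikit-image's implementation (which applies Zhang's algorithm for 2D inputs) as one possible choice, not necessarily the library used in the original pipeline.

```python
import numpy as np
from skimage.morphology import skeletonize

def skeletonize_catheter(binary):
    """Reduce the segmented catheter to a one-pixel-wide medial axis.

    For 2D inputs, scikit-image's skeletonize implements Zhang's
    iterative boundary-removal thinning.
    """
    skeleton = skeletonize(binary > 0)
    return skeleton.astype(np.uint8)
```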
Initially, the tip location is determined by identifying the point with the maximum number of neighboring true values (white pixels) 23. The coordinates of the inferred tip are recorded as the first entry in an array, and the corresponding pixels are marked as visited in the skeletonized binary catheter image. This procedure is repeated iteratively until every skeleton pixel has been visited, documenting the coordinates of the entire catheter from its tip to its entry point. The result is an ordered two-dimensional array that succinctly encapsulates the spatial trajectory of the catheter for each frame, from tip to entry point. These pixel coordinates are then mapped onto a real-world 3D coordinate system in millimeters, using the known phantom dimensions and camera intrinsic parameters. The coordinates are further downsampled to a subset of K points, where the first point denotes the catheter's tip, the last point designates the entry point, and the intervening K-2 points are evenly distributed to delineate the catheter's curvature. The computer vision algorithm is implemented in Python using the OpenCV library and operates in real time on both planes. For each of the K points, the X and Y coordinates are inferred from the top view and the Z coordinate from the front view. Combining these yields a real-time 3D K-point tracking system, providing a holistic representation of the catheter's spatial configuration and curvature.
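The downsampling and biplane fusion steps can be summarized in the following sketch, which assumes the ordered tip-to-entry trajectories from both views have already been extracted; the axis assignments depend on the actual camera mounting and are illustrative.

```python
import numpy as np

def downsample_trajectory(traj_px, k, mm_per_px):
    """Reduce an ordered tip-to-entry pixel trajectory to K points.

    traj_px: (N, 2) array of skeleton coordinates ordered from tip to
    entry point. mm_per_px converts to millimeters using the known
    phantom dimensions. The first and last of the K points are the tip
    and entry point; the rest are evenly spaced along the curve.
    """
    idx = np.linspace(0, len(traj_px) - 1, k).round().astype(int)
    return traj_px[idx] * mm_per_px

def fuse_biplane(top_pts, front_pts):
    """Combine the two orthogonal views into (K, 3) 3D points in mm.

    The top view supplies (x, y) and the front view supplies z; the
    front view's other axis is redundant with the top view. Which
    image axis maps to z is an assumption about the camera mounting.
    """
    x, y = top_pts[:, 0], top_pts[:, 1]
    z = front_pts[:, 1]
    return np.stack([x, y, z], axis=1)
```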
2.3. Patient-Specific 3D Heart Model Generation
To generate a patient-specific 3D model of the heart, we utilized Materialise Mimics Research software version 21.0 for 3D image processing. The initial step involved importing a cardiac computed tomography (CT) scan acquired at end-diastole in the DICOM format, as demonstrated in Fig. 4-a. This data then underwent image segmentation within Mimics to delineate the heart and spine, forming a unified 3D mask while preserving their respective spatial relationships. The resulting 3D segmentation was saved as an STL file. To further refine the model and eliminate extraneous components, such as vessels and ribs, we employed Geomagic Wrap software (3D Systems Geomagic Corporation). This refinement removed artifacts and smoothed the mesh, as illustrated in Fig. 4-b and 4-c. To prepare the generated 3D heart for the proposed XR scene, the model was exported from Geomagic as an STL file and imported into the CAD software SolidWorks. Within SolidWorks, six posts were carefully positioned inside the right atrium, ensuring optimal accessibility from the inferior vena cava with a catheter (Fig. 4-d). Each post was constructed by extruding a 2.9 mm diameter circle into a 4 mm long cylinder, and the xyz coordinates of the center of each post were precisely measured (Fig. 4-e). Upon finalizing the model, four spherical markers, each 3 mm in diameter, were added parallel to the spine base (Fig. 4-f). These markers serve as reference points for orientation and integration within the Unity scene system. Finally, the entire model was exported as a single glTF file, with a bin file associated with each individual element (spine, heart, post 1, post 2, etc.), facilitating its seamless integration into the XR scene.
2.4. Mixed Reality Rendering
Employing the Unity game engine, we integrated the inferred catheter tracking data and the patient-specific cardiac model into a mixed reality application tailored for the Meta Quest 3 headset. Following the steps described in section 2.3, the 3D model of the patient's heart, reconstructed from a CT scan 9,10,24, is rendered using the Unity game engine. Furthermore, the catheter spline, obtained from the K points inferred in real time by the proposed computer vision algorithm, is also rendered in the scene. For this mapping to be accurate, alignment of the fiducial points on the 3D-printed model with corresponding anatomical landmarks is essential.
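One way to realize this alignment is a least-squares affine fit between corresponding fiducial coordinates in the tracking space and the heart-model space. The sketch below assumes the fiducial positions in both frames are already known; the function names are illustrative, and this is not necessarily the exact registration procedure used in the system.

```python
import numpy as np

def fit_affine_3d(src, dst):
    """Least-squares 3D affine transform mapping tracking-space
    fiducials (src) onto heart-model fiducials (dst).

    src, dst: (N, 3) arrays of corresponding fiducial coordinates
    (N >= 4; points should not be coplanar for a unique solution).
    Returns a 3x4 matrix A such that dst ~ A @ [x, y, z, 1]^T.
    """
    n = src.shape[0]
    src_h = np.hstack([src, np.ones((n, 1))])        # homogeneous (N, 4)
    X, *_ = np.linalg.lstsq(src_h, dst, rcond=None)  # solves src_h @ X = dst
    return X.T                                       # (3, 4)

def apply_affine(A, pts):
    """Map catheter points (K, 3) into the heart-model frame."""
    pts_h = np.hstack([pts, np.ones((pts.shape[0], 1))])
    return pts_h @ A.T
```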
Communication between the output of the Python-based computer vision algorithm and the 3D rendering in Unity is implemented using the web framework Flask, a common choice for efficient real-time data transfer. More specifically, the inferred tracking data, comprising the position of each of the K points on the catheter as well as the rotation of the tip, is serialized as JSON and transmitted to Unity through a Flask-based WebSocket connection. The Flask server can run on any IP address and port the user specifies, defaulting to localhost at port 5000. On the Unity side, the user is prompted to enter this IP address and port at launch, and the application connects to the server using the SocketIO Unity package. After receiving the transmitted tracking data, Unity renders dynamic updates to both the catheter location and curvature in the 3D scene. Using the fiducial markers, an affine transformation maps the catheter placement and movement into the coordinate frame of the 3D heart model. The catheter rendering is superimposed within the patient-specific heart rendering and the surrounding scene in the Quest 3 headset, creating a cohesive XR environment. This integrated platform offers an interactive experience that goes beyond visualization, allowing users to observe and analyze the catheter's movements in real time with quantitative feedback based on tracking relative to predefined target positions. The visual representation is contextualized within the reconstructed cardiac anatomy, providing a valuable tool for training simulations. Moreover, this approach holds considerable potential for real-life applications in clinical settings, where the interactive XR platform could contribute to improved catheterization procedures through enhanced training and procedural guidance.
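As an illustration of the Python side of this link, the following sketch uses the Flask-SocketIO package; the event name, JSON layout, and helper names are assumptions rather than the exact protocol of the described system.

```python
# Minimal sketch of the Python-side transmitter, assuming Flask-SocketIO.
import json
from flask import Flask
from flask_socketio import SocketIO

app = Flask(__name__)
socketio = SocketIO(app, cors_allowed_origins='*')

def broadcast_tracking(points_mm, tip_rotation):
    """Push the latest K catheter points and tip rotation to Unity.

    points_mm: (K, 3) array of catheter positions in millimeters.
    tip_rotation: tip orientation, e.g. a quaternion [x, y, z, w]
    (the exact rotation encoding is an assumption).
    """
    payload = {
        'points': points_mm.tolist(),
        'tip_rotation': tip_rotation,
    }
    # 'tracking' is an illustrative event name the Unity SocketIO
    # client would subscribe to.
    socketio.emit('tracking', json.dumps(payload))

if __name__ == '__main__':
    # Default endpoint: localhost:5000, matching the address and port
    # the Unity application prompts for at launch.
    socketio.run(app, host='127.0.0.1', port=5000)
```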