Since the emergence of pre-trained language models such as the GPT series, most recently GPT-4, their derivative products have demonstrated remarkable capabilities across many fields. In medicine, applications of the GPT series have seen widespread adoption. Ophthalmology, a highly specialized and complex discipline, stands to benefit from artificial intelligence, which can help healthcare professionals formulate more comprehensive, accurate, and personalized diagnoses and treatment plans. The combination of artificial intelligence and ophthalmology therefore holds substantial promise for improving patients' ocular health. However, it is crucial to ensure the reliability and safety of artificial intelligence in medical practice and to address the associated privacy and ethical concerns.
Background: Since the advent of generative pre-trained models, the application of intelligent chatbots such as ChatGPT in medicine has become a research hotspot.
Implementation: We investigated the potential of GPT-4 to recognize optical coherence tomography (OCT) images of retinal diseases. We selected OCT images from 80 patients who sought treatment at the Department of Ophthalmology of the Second Affiliated Hospital of Harbin Medical University, each exhibiting typical disease characteristics. Before the experiment, we established the disease diagnoses based on the patients' previous case records. The images were processed and uploaded to GPT-4 for diagnostic analysis using several approaches, and the resulting accuracy rates were recorded and compared with the diagnoses made by retinal disease specialists.
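The study does not publish its upload pipeline, so the sketch below only illustrates how a single OCT image could, in principle, be submitted to a vision-capable GPT-4 model through the OpenAI Chat Completions API. The model name, prompt wording, and file path are placeholders, not details taken from the experiment.

```python
# Illustrative only: one way to submit an OCT image to a vision-capable
# GPT-4 model for a diagnostic impression. Model name, prompt, and path
# are placeholders; this is not the protocol used in the study.
import base64
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def ask_gpt4_about_oct(image_path: str) -> str:
    # Encode the OCT scan as base64 so it can be sent inline with the prompt.
    with open(image_path, "rb") as f:
        b64_image = base64.b64encode(f.read()).decode("utf-8")

    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder for any vision-capable GPT-4 model
        messages=[
            {
                "role": "user",
                "content": [
                    {"type": "text",
                     "text": "This is a macular OCT B-scan. "
                             "Which retinal disease is most likely?"},
                    {"type": "image_url",
                     "image_url": {"url": f"data:image/jpeg;base64,{b64_image}"}},
                ],
            }
        ],
    )
    return response.choices[0].message.content


print(ask_gpt4_about_oct("oct_case_001.jpg"))
```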
Results: Across four distinct experiments, the diagnostic accuracy of GPT-4 was consistently around 26%, a significant disparity compared with the 93.75% accuracy of retinal disease specialists. Leveraging GPT-4's interactivity, users could supply relevant information before posing questions, guiding GPT-4 toward more contextually appropriate responses. This approach appeared feasible mainly for objective queries involving textual information and simple images; for complex or highly specialized images, its performance still requires improvement. At present, GPT-4 is not a substitute for professional physicians.
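To illustrate the size of the reported gap, the following sketch compares the two accuracy rates with a chi-squared test on a 2x2 contingency table. The per-group counts (approximately 21/80 correct for GPT-4 and 75/80 for the specialists) are back-calculated from the reported percentages and are illustrative, not the study's raw data.

```python
# Illustrative comparison of the reported accuracies (~26% vs 93.75%)
# on 80 cases each; counts are back-calculated, not the study's raw data.
from scipy.stats import chi2_contingency

gpt4_correct, gpt4_total = 21, 80        # ~26% accuracy
expert_correct, expert_total = 75, 80    # 93.75% accuracy

table = [
    [gpt4_correct, gpt4_total - gpt4_correct],
    [expert_correct, expert_total - expert_correct],
]

chi2, p_value, dof, _ = chi2_contingency(table)
print(f"GPT-4 accuracy:      {gpt4_correct / gpt4_total:.1%}")
print(f"Specialist accuracy: {expert_correct / expert_total:.1%}")
print(f"chi-squared = {chi2:.1f}, p = {p_value:.2e}")
```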
Conclusions: GPT-4's performance in processing information may deteriorate when it is given an excessive amount of input data. Optimizing how information is presented could enable more efficient use of GPT-4 for problem-solving, but doing so requires substantial domain expertise. To leverage AI models for disease diagnosis, it is therefore crucial to distill concise and representative disease features. At the same time, specialized AI models dedicated to healthcare should be developed and introduced as early as possible, as they would significantly enhance the diagnostic efficiency of healthcare professionals.
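The recommendation to distill concise, representative disease features before querying the model can be made concrete with a small prompt-construction helper, sketched below. The feature names are hypothetical examples chosen for illustration, not features validated in the study.

```python
# A minimal sketch of "distilling" an OCT case into a few concise,
# representative features before asking for a diagnosis. The feature
# vocabulary here is hypothetical and for illustration only.
def build_diagnostic_prompt(features: list[str]) -> str:
    feature_text = "; ".join(features)
    return (
        "You are assisting with retinal OCT interpretation.\n"
        f"Key findings distilled by the clinician: {feature_text}.\n"
        "Based only on these findings, list the most likely diagnosis "
        "and one or two differentials, with brief reasoning."
    )


prompt = build_diagnostic_prompt(
    ["subretinal fluid",
     "dome-shaped neurosensory detachment",
     "intact retinal pigment epithelium"]
)
print(prompt)
```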