The development of ChatGPT has been a major milestone in the field of natural language processing (NLP) and AI. [2] Its continuous improvement and evolution are likely to have a major impact on the future of conversational AI. The consequences are likely to affect all sectors of society, from business development to medicine, from education to research, and from coding to entertainment and art. [13]
ChatGPT, a cutting-edge large language model, learns from vast amounts of text data to generate human-like language. [2] In the education sector, ChatGPT can provide personalized learning materials and answer questions related to medical student exams. [10] By using its natural language processing capabilities, ChatGPT can make the learning experience more efficient and engaging for students. Its ability to process and comprehend natural language inputs, combined with its broad knowledge base, makes it a potentially valuable tool for medical educators seeking to support the learning process and student success.
ChatGPT can be used in medical education for various purposes to enhance the quality of education. It can be used to evaluate students' essays and papers by analyzing sentence structure, vocabulary, grammar, and clarity. [14] It can also generate exercises, quizzes, and scenarios for classroom practice and assessment. Additionally, ChatGPT is capable of writing basic medical reports, which can help students identify areas for improvement and deepen their understanding of complex medical concepts. [15] Its ability to generate translations, explanations, and summaries can also help students understand complex learning material more easily. [16] ChatGPT can further be used to provide information on demand on medical topics ranging from diseases and their treatments to medical procedures. [17] For these applications to be effective, ChatGPT must perform similarly to human experts in tests of medical knowledge and reasoning, so that users can have confidence in its responses.
In both 2021 and 2022, ChatGPT's scores in the Chinese National Medical Licensing Examination (NMLE) did not meet the passing requirements. According to statistics, the national pass rate of the exam was 50% in 2021 and 55% in 2022. Compared with medical students who have completed a traditional 5-year medical education in a medical school, ChatGPT's performance is currently not sufficient. This may be due to several reasons. Firstly, all questions in the Chinese NMLE are multiple-choice questions that require selection of the single best answer, whereas for some questions ChatGPT provided several answers, some of which would be considered suboptimal rather than incorrect in clinical practice. Secondly, there are differences in medical policies and laws between China and the United States, such as those related to abortion, which is restricted by law in some US states, while it is allowed in China under certain medical conditions. Thirdly, some epidemiological data unique to China are beyond the knowledge scope of ChatGPT, and some data are only available in Chinese. This phenomenon is similar to the situation mentioned in a paper published by Korean colleagues. In addition, some question types in the Chinese NMLE present a patient-centered clinical scenario followed by 2 to 3 questions, each of which relates to the initial scenario but tests a different point and is independent of the others. ChatGPT performed poorly on these questions.
However, we also observed several interesting phenomena. ChatGPT performed relatively well in Unit 4, which covers subjects such as pediatrics, gynecology, and surgery and is less affected by national conditions. ChatGPT's performance in the 2021 exam was higher than that in the 2020 exam, which may be because more people sought relevant information online to prepare for the exam, making more related material available for ChatGPT to learn from. We also retried 10 questions that ChatGPT had answered incorrectly; after being told the correct answers, ChatGPT was able to provide a correct answer in response (data not shown).
The above results should not be extrapolated to other subjects or medical schools, as chatbots are likely to continue to evolve rapidly through user feedback, and future trials with the same items may yield different results. The present results reflect the abilities of ChatGPT on February 1, 2023. The input of the question items to ChatGPT was not exactly the same as that presented to medical students. The chatbot is not given information about differences in medical policies between the US and other regions, so this information would need to be learned by the software. Additionally, the interpretation of explanations and correct answers may vary depending on the perspectives of clinical experts, although the author has been working in the field of medicine in China for 15 years (2009–2023). Patient care best practices may also vary depending on the region and medical environment.
At present, ChatGPT's level of understanding and ability to interpret information are not adequate for use by medical students, particularly in medical school exams and high-stakes exams such as health licensing exams. However, it is anticipated that with deep learning, ChatGPT's knowledge and interpretation abilities will improve at a rapid pace, much as AlphaGo's performance did. Thus, medical and health professors and students should be mindful of how this AI platform may be incorporated into medical and health education in the near future. In addition, the integration of AI into the medical school curriculum is already underway in some institutions.
In conclusion, ChatGPT's proficiency and interpretation abilities for questions pertaining to the Chinese NMLE are not yet on par with those of Chinese medical students. Nevertheless, it is probable that these abilities will improve through deep learning. Medical education authorities and students should stay aware of developments in this AI chatbot and consider its potential use in learning and education. This research indicates that ChatGPT has the potential to serve as a virtual medical mentor, but further study is required to fully evaluate its effectiveness and applicability in this context.