Advances in Large Language Models (LLMs) and Artificial Intelligence (AI)

Recent advances in large language models (LLMs) and artificial intelligence (AI), exemplified by ChatGPT and Bard, have garnered significant interest across many fields. These LLMs are trained on vast text datasets, including medical and academic content, which allows them to generate human-like, informative responses. Within healthcare, medicine, and education, many peer-reviewed studies on their applications have already been published. Some explore their use in digesting and writing academic content, while others examine their utility as study aids that help break down complex concepts. Although generally accurate, LLMs remain prone to hallucinations, that is, fluent and plausible-sounding output that is factually wrong.

Applications in Medical Question-Answering and Education

LLMs have already been shown to adequately address medical question-answering tasks in patient education.1 Many researchers are now exploring their broader applications in medical education. ChatGPT has also shown promise in tasks that extend beyond specific medical domains, including knowledge retrieval, clinical decision support, and patient triage.2,3

Recent research using multiple-choice questions from the United States Medical Licensing Exam (USMLE) suggests that ChatGPT can approximate the performance of a third-year medical student. Although these were multiple-choice exams, the USMLE remains a gold standard in the United States and many other countries globally. The newer and more powerful GPT-4 has also performed strongly on exam questions. Other models, such as Bard, have shown comparable performance in surgical membership exams.4

While their performance has been promising in both undergraduate and postgraduate exams, the role, implications, and impact of LLMs in medical education and clinical practice also need to be examined.

Benefits for Medical Students and Clinicians

These applications of LLMs can be of great assistance to future clinicians. Their ability to digest complex medical information can be invaluable to students struggling to understand certain topics. Similarly, their ability to suggest differential diagnoses from presenting signs and symptoms can offer students much-needed interactive learning while they build their knowledge foundation. Although LLMs' training datasets are not necessarily up to date or comprehensive of all medical knowledge, they appear to cover enough material to be useful for the current curriculum, including the physiology and diseases students are required to know.

Interactive Learning and Real-Time Reasoning

LLMs' dialogue-based approach allows for follow-up questions, which can simulate and train the real-time reasoning expected of medics in real scenarios. This can be achieved through interactive practice cases, in which the LLM role-plays a simulated patient or examiner. This method aligns with how clinical examinations are taught, conducted, and tested, except in a digital format. It should facilitate active learning and engagement with clinical scenarios.2 The ability to change case information and parameters in real time allows even more flexibility than was previously possible in simulated patient cases.

Integration into Medical Education and Exam Settings

Moreover, the integration of LLMs into medical education could have broader implications in exam settings. They are capable of generating exam questions and providing supplementary explanations.5 This could greatly reduce the burden on medical educators by making exam design more efficient and streamlined, though some form of human oversight is still required to ensure that generated content is free of misinformation.

However, LLMs often fall short in areas requiring critical thinking. The questions they generate may therefore tend to be fact-based, rather than typical exam questions that require a high-level synthesis of different concepts.

Limitations and Challenges

It is essential to acknowledge the limitations of current LLMs in the context of medical examinations. While LLMs demonstrate promising performance, they are not specifically designed to answer medical questions. This limitation may be more apparent in situations requiring long-form answers, where depth is necessary to demonstrate justified clinical decision-making. Additionally, medical curricula and examination formats continue to evolve, making it difficult for LLMs to keep pace with changing requirements. Finally, because their output depends on their training data, the inclusion of poor-quality training material would likely degrade their performance.

Conclusion

Despite these limitations, LLMs are promising tools for medical education. Their ability to pass medical and surgical exams is a significant milestone in the integration of AI technologies into medical education and assessment. Their performance could likely be improved further by additional training on medical research articles and textbooks.

References

  1. Sallam M. ChatGPT Utility in Healthcare Education, Research, and Practice: Systematic Review on the Promising Perspectives and Valid Concerns. Healthcare [Internet]. 2023 Mar 1 [cited 2024 Apr 20];11(6). Available from: https://doi.org/10.3390/healthcare11060887
  2. Safranek CW, Sidamon-Eristoff AE, Gilson A, Chartash D. The Role of Large Language Models in Medical Education: Applications and Implications. JMIR Med Educ [Internet]. 2023 [cited 2024 Apr 20];9. Available from: https://doi.org/10.2196/50945
  3. Ravi A, Neinstein A, Murray SG. Large Language Models and Medical Education: Preparing for a Rapid Transformation in How Trainees Will Learn to Be Doctors. ATS Sch [Internet]. 2023 [cited 2024 Apr 20];4(3):282-92. Available from: https://doi.org/10.34197/ats-scholar.2023-0036PS
  4. Chan J, Dong T, Angelini GD, Chan J. The performance of large language models in intercollegiate Membership of the Royal College of Surgeons examination.
  5. Mbakwe AB, Lourentzou I, Celi LA, Mechanic OJ, Dagan A. ChatGPT passing USMLE shines a spotlight on the flaws of medical education. PLOS Digit Health [Internet]. 2023 Feb 9 [cited 2024 Apr 20];2(2):e0000205. Available from: https://doi.org/10.1371/journal.pdig.0000205