
Comparative performance of ChatGPT-4o, ChatGPT-5, and Gemini 2.5 Flash on Persian internal medicine subspecialty board exams

By Sharecaster, March 2, 2026, in Technology


  • Abd-Alrazaq, A. et al. Large language models in medical education: opportunities, challenges, and future directions. JMIR Med. Educ. 9, e48291 (2023).

  • Elzayyat, M., Mohammad, J. N. & Zaqout, S. Assessing LLM-generated vs. expert-created clinical anatomy MCQs: a student perception-based comparative study in medical education. Med. Educ. Online. 30 (1), 2554678 (2025).

  • Ling, Q. et al. Assessing the possibility of using large language models in ocular surface diseases. Int. J. Ophthalmol. 18 (1), 1–8 (2025).

  • Gotta, J. et al. Large language models (LLMs) in radiology exams for medical students: performance and consequences. Rofo 197 (9), 1057–1067 (2025).

  • Bahir, D. et al. Gemini AI vs. ChatGPT: A comprehensive examination alongside ophthalmology residents in medical knowledge. Graefes Arch. Clin. Exp. Ophthalmol. 263 (2), 527–536 (2025).

  • Al-Thani, S. N. et al. Comparative performance of ChatGPT, Gemini, and final-year emergency medicine clerkship students in answering multiple-choice questions: implications for the use of AI in medical education. Int. J. Emerg. Med. 18 (1), 146 (2025).

  • Fattah, F. H. et al. Comparative analysis of ChatGPT and Gemini (Bard) in medical inquiry: a scoping review. Front. Digit. Health. 7, 1482712 (2025).

  • Wu, X. et al. A multi-dimensional performance evaluation of large language models in dental implantology: comparison of ChatGPT, DeepSeek, Grok, Gemini and Qwen across diverse clinical scenarios. BMC Oral Health. 25 (1), 1272 (2025).

  • Singal, A. & Goyal, S. Comparative evaluation of AI platforms Google Gemini 2.5 Flash, Google Gemini 2.0 Flash, DeepSeek V3 and ChatGPT 4o in solving multiple-choice questions from different subtopics of anatomy. Surg. Radiol. Anat. 47 (1), 193 (2025).

  • Masalkhi, M., Ong, J., Waisberg, E. & Lee, A. G. Google DeepMind’s Gemini AI versus ChatGPT: a comparative analysis in ophthalmology. Eye (Lond). 38 (8), 1412–1417 (2024).

  • Marey, A. et al. Evaluating the accuracy and reliability of AI chatbots in patient education on cardiovascular imaging: a comparative study of ChatGPT, Gemini, and Copilot. Egypt. J. Radiol. Nuclear Med. 56 (1), 37 (2025).

  • Bayala, Y. L. T. et al. Performance of the large language models in African rheumatology: a diagnostic test accuracy study of ChatGPT-4, Gemini, Copilot, and Claude artificial intelligence. BMC Rheumatol. 9 (1), 54 (2025).

  • Madrid-García, A. et al. Harnessing ChatGPT and GPT-4 for evaluating the rheumatology questions of the Spanish access exam to specialized medical training. Sci. Rep. 13 (1), 22129 (2023).

    Article 
    PubMed 
    PubMed Central 
    ADS 

    Google Scholar
     

  • Meral, G., Ateş, S., Günay, S., Öztürk, A. & Kuşdoğan, M. Comparative analysis of ChatGPT, Gemini and emergency medicine specialist in ESI triage assessment. Am. J. Emerg. Med. 81, 146–150 (2024).

  • Lee, J. T. et al. Evaluation of performance of generative large language models for stroke care. Npj Digit. Med. 8 (1), 481 (2025).

  • Bahir, D. et al. Gemini AI vs. ChatGPT: A comprehensive examination alongside ophthalmology residents in medical knowledge. Graefe’s Archive Clin. Experimental Ophthalmol. 263 (2), 527–536 (2025).

  • Sabaner, M. C. & Yozgat, Z. Performance of ChatGPT-4 omni and Gemini 1.5 Pro on ophthalmology-related questions in the Turkish medical specialty exam. Turk. J. Ophthalmol. 55 (4), 177–185 (2025).

  • Hernández-Flores, L. A. et al. Assessment of challenging oncologic cases: A comparative analysis between ChatGPT, Gemini, and a multidisciplinary tumor board. J. Surg. Oncol. 131 (8), 1562–1570 (2025).

  • Oh, N., Choi, G. S. & Lee, W. Y. ChatGPT goes to the operating room: evaluating GPT-4 performance and its potential in surgical education and training in the era of large language models. Ann. Surg. Treat. Res. 104 (5), 269–273 (2023).

  • Wang, T. et al. Evaluating the performance of state-of-the-art artificial intelligence chatbots based on the WHO global guidelines for the prevention of surgical site infection: cross-sectional study. J. Med. Internet Res. 27, e75567 (2025).

  • Li, H. et al. ChatGPT-4o outperforms Gemini Advanced in assisting multidisciplinary decision-making for advanced gastric cancer. Eur. J. Surg. Oncol. 51 (8), 110096 (2025).

  • Lin, C. R. et al. Multiple large language models versus clinical guidelines for postmenopausal osteoporosis: a comparative study of ChatGPT-3.5, ChatGPT-4.0, ChatGPT-4o, Google Gemini, Google Gemini Advanced, and Microsoft Copilot. Arch. Osteoporos. 20 (1), 120 (2025).

  • Liu, R., Liu, J., Yang, J., Sun, Z. & Yan, H. Comparative analysis of ChatGPT-4o mini, ChatGPT-4o and Gemini Advanced in the treatment of postmenopausal osteoporosis. BMC Musculoskelet. Disord. 26 (1), 369 (2025).

  • Muluk, E. A comparative analysis of artificial intelligence platforms: ChatGPT-4o and Google Gemini in answering questions about birth control methods. Cureus 17 (1), e76745 (2025).

  • McNulty, A. M. et al. Performance evaluation of ChatGPT-4.0 and Gemini on image-based neurosurgery board practice questions: A comparative analysis. J. Clin. Neurosci. 134, 111097 (2025).

  • Sau, S. et al. Accuracy and quality of ChatGPT-4o and Google Gemini performance on image-based neurosurgery board questions. Neurosurg. Rev. 48 (1), 320 (2025).

  • Ali, R. et al. Performance of ChatGPT and GPT-4 on neurosurgery written board examinations. Neurosurgery 93 (6), 1353–1365 (2023).

  • Huang, K. A., Choudhary, H. K., Hardin, W. M. & Prakash, N. Comparative analysis of ChatGPT-4o and Gemini Advanced performance on diagnostic radiology in-training exams. Cureus 17 (3), e80874 (2025).

  • Clark, K. R. Comparative analysis of LLMs’ performance on a practice radiography certification exam. Radiol. Technol. 96 (5), 334–342 (2025).

  • Suh, P. S. et al. Comparing diagnostic accuracy of radiologists versus GPT-4V and Gemini Pro Vision using image inputs from Diagnosis Please cases. Radiology 312 (1), e240273 (2024).

  • Jain, S., Chakraborty, B., Agarwal, A. & Sharma, R. Performance of large language models (ChatGPT and Gemini Advanced) in gastrointestinal pathology and clinical review of applications in gastroenterology. Cureus 17 (4), e81618 (2025).

  • Khan, A. A. et al. Artificial intelligence for anesthesiology board-style examination questions: role of large language models. J. Cardiothorac. Vasc. Anesth. 38 (5), 1251–1259 (2024).

  • Passby, L., Jenko, N. & Wernham, A. Performance of ChatGPT on specialty certificate examination in dermatology multiple-choice questions. Clin. Exp. Dermatol. 49 (7), 722–727 (2024).

  • Dhanvijay, A. K. D. et al. Performance of large language models (ChatGPT, Bing Search, and Google Bard) in solving case vignettes in physiology. Cureus 15 (8), e42972 (2023).

  • Kumari, A. et al. Large language models in hematology case solving: A comparative study of ChatGPT-3.5, Google Bard, and Microsoft Bing. Cureus 15 (8), e43861 (2023).

  • Cheong, R. C. T. et al. Performance of artificial intelligence chatbots in sleep medicine certification board exams: ChatGPT versus Google Bard. Eur. Arch. Otorhinolaryngol. 281 (4), 2137–2143 (2024).

  • Makrygiannakis, M. A., Giannakopoulos, K. & Kaklamanos, E. G. Evidence-based potential of generative artificial intelligence large language models in orthodontics: a comparative study of ChatGPT, Google Bard, and Microsoft Bing. Eur. J. Orthod. (2024).

  • Yamaguchi, S. et al. Evaluating the efficacy of leading large language models in the Japanese National dental hygienist examination: A comparative analysis of ChatGPT, Bard, and Bing Chat. J. Dent. Sci. 19 (4), 2262–2267 (2024).

  • Fukuda, H. et al. Evaluating the image recognition capabilities of GPT-4V and Gemini Pro in the Japanese National dental examination. J. Dent. Sci. 20 (1), 368–372 (2025).

  • Khan, M. P. & O’Sullivan, E. D. A comparison of the diagnostic ability of large language models in challenging clinical cases. Front. Artif. Intell. 7, 1379297 (2024).

  • Khan, A. A. et al. Assessing the performance of ChatGPT in medical ethical decision-making: a comparative study with USMLE-based scenarios. J. Med. Ethics (2025).

  • Lee, Y. et al. Harnessing artificial intelligence in bariatric surgery: comparative analysis of ChatGPT-4, Bing, and Bard in generating clinician-level bariatric surgery recommendations. Surg. Obes. Relat. Dis. 20 (7), 603–608 (2024).

  • Lee, Y. et al. Performance of artificial intelligence in bariatric surgery: comparative analysis of ChatGPT-4, Bing, and Bard in the American Society for Metabolic and Bariatric Surgery textbook of bariatric surgery questions. Surg. Obes. Relat. Dis. 20 (7), 609–613 (2024).

  • Anvari, S., Lee, Y., Jin, D. S., Malone, S. & Collins, M. Artificial intelligence in hepatology: a comparative analysis of ChatGPT-4, Bing, and Bard at answering clinical questions. J. Can. Assoc. Gastroenterol. 8 (2), 58–62 (2025).

  • Zare, S. et al. Comparing the performance of ChatGPT-3.5-Turbo, ChatGPT-4, and Google Bard with Iranian students in pre-internship comprehensive exams. Sci. Rep. 14 (1), 28456 (2024).

  • Roos, J., Martin, R. & Kaczmarczyk, R. Evaluating Bard Gemini Pro and GPT-4 Vision against student performance in medical visual question answering: comparative case study. JMIR Form. Res. 8, e57592 (2024).

  • Meo, S. A., Abukhalaf, F. A., ElToukhy, R. A. & Sattar, K. Exploring the role of DeepSeek-R1, ChatGPT-4, and Google Gemini in medical education: how valid and reliable are they? Pak J. Med. Sci. 41 (7), 1887–1892 (2025).

  • Omar, M. et al. Generating credible referenced medical research: A comparative study of OpenAI’s GPT-4 and Google’s Gemini. Comput. Biol. Med. 185, 109545 (2025).

  • World Health Organization. Ethics and Governance of Artificial Intelligence for Health: WHO Guidance (World Health Organization, 2021).

  • Asgari, E. et al. A framework to assess clinical safety and hallucination rates of LLMs for medical text summarisation. Npj Digit. Med. 8 (1), 274 (2025).


