Jul 22, 2024 3:48 pm

Which AI does better at USMLE

Scott Gottlieb, Partner at New Enterprise Associates, shared on X:

“We fed questions from the USMLE Step 3 medical licensing exam to the top 5 LLMs — Google Gemini, ChatGPT, Claude, Llama, and Grok. We wanted to see which LLM has the best medical aptitude. This is how they did.”

Here’s how they scored:

ChatGPT-4o (OpenAI) — 49/50 questions correct (98%)
Claude 3.5 (Anthropic) — 45/50 (90%)
Gemini Advanced (Google) — 43/50 (86%)
Grok (xAI) — 42/50 (84%)
HuggingChat (Llama) — 33/50 (66%)

Source: Scott Gottlieb/X
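
As a quick check on the arithmetic, the percentages follow directly from the raw counts out of 50. The snippet below is only an illustrative sketch: the names `scores`, `TOTAL_QUESTIONS`, and `percent_correct` are made up for this example and are not part of the original test.

```python
# Scores as reported in the post: model -> questions answered correctly.
# All names here are illustrative, not taken from the original test.
TOTAL_QUESTIONS = 50

scores = {
    "ChatGPT-4o (OpenAI)": 49,
    "Claude 3.5 (Anthropic)": 45,
    "Gemini Advanced (Google)": 43,
    "Grok (xAI)": 42,
    "HuggingChat (Llama)": 33,
}


def percent_correct(correct, total=TOTAL_QUESTIONS):
    """Return the share of correct answers as a whole-number percentage."""
    return round(100 * correct / total)


# Rank models from most to fewest correct answers.
ranking = sorted(scores.items(), key=lambda item: item[1], reverse=True)

for model, correct in ranking:
    print(f"{model}: {correct}/{TOTAL_QUESTIONS} ({percent_correct(correct)}%)")
```

With only 50 questions, each question is worth two percentage points, so the gap between adjacent models in this list can come down to one or two answers.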
