Jul 22, 2024, 15:48
Which AI does better on the USMLE?
Scott Gottlieb, Partner at New Enterprise Associates, shared on X:
“We fed questions from the USMLE Step 3 medical licensing exam to the top 5 LLMs: Google Gemini, ChatGPT, Claude, Llama, and Grok. We wanted to see which LLM has the best medical aptitude. This is how they did.”
Here’s how they scored:
- ChatGPT-4o (OpenAI) — 49/50 questions correct (98%)
- Claude 3.5 (Anthropic) — 45/50 (90%)
- Gemini Advanced (Google) — 43/50 (86%)
- Grok (xAI) — 42/50 (84%)
- HuggingChat (Llama) — 33/50 (66%)
Source: Scott Gottlieb/X