Olivier Elemento, Director of the Englander Institute for Precision Medicine at Weill Cornell Medicine, shared a post on LinkedIn:
“Why AI models don’t make it to clinical practice.
Clinical AI’s challenge isn’t just getting deployed – it’s being used. The Epic Sepsis Model runs in hundreds of hospitals, but external validation found that it missed 67% of sepsis cases while generating alerts on 18% of all hospitalized patients. This pattern isn’t unique to Epic. Across clinical AI, the gap between deployment and meaningful use comes down to (at least) five questions:
“Show me the prospective data”
Retrospective validation is table stakes. A strong AUC on held-out data only proves that the model recognizes patterns in historical records. What clinicians need to know is whether it improves patient outcomes in their actual workflow. The Epic Sepsis Model illustrates this gap: internal validation showed strong performance, but external validation found the AUC dropping to 0.63, with 67% of sepsis cases missed. Despite subsequent updates, a 2024 study found just 15% sensitivity in real-world use.
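To make that distinction concrete, here is a minimal sketch using entirely synthetic data (not the Epic model or its numbers) of how a single retrospective AUC can look respectable while the operating-point metrics clinicians actually experience, such as sensitivity, positive predictive value, and alert burden at the deployed threshold, tell a different story. The prevalence, score distribution, and threshold below are all assumptions for illustration.

```python
# Synthetic illustration only (not the Epic Sepsis Model or its data).
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

rng = np.random.default_rng(0)
n = 10_000
y_true = rng.binomial(1, 0.07, size=n)                        # ~7% prevalence (assumed)
scores = np.clip(0.30 * y_true + rng.normal(0.35, 0.18, n), 0.0, 1.0)

# Ranking quality across all thresholds: what a retrospective report shows.
print("Retrospective AUC:", round(roc_auc_score(y_true, scores), 2))

# What the ward experiences: performance at the one threshold that pages people.
threshold = 0.70                                              # assumed operating point
alerts = (scores >= threshold).astype(int)
tn, fp, fn, tp = confusion_matrix(y_true, alerts).ravel()

print("Sensitivity at threshold:", round(tp / (tp + fn), 2))  # share of true cases caught
print("PPV of an alert:", round(tp / (tp + fp), 2))           # share of alerts that are real
print("Patients alerted on:", round(alerts.mean(), 2))        # alert burden
```

The AUC summarizes ranking across every possible threshold; clinicians only ever see the single threshold that fires alerts, which is why prospective, workflow-level evaluation matters.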
“How does it fail?”
No model is perfect. Clinicians need to know how it fails. Does it work on post-surgical patients with drains and lines? On complex chronic cases with multiple comorbidities? What they need is transparency about limitations so they can use the tool appropriately.
“Where does the liability sit?”
When AI suggests a diagnosis, who’s responsible if it’s wrong? If a clinician follows the AI and the patient has a bad outcome, are they liable for blindly trusting it? If they override the AI and miss something, are they liable for ignoring it? Many won’t touch AI systems without clear institutional policies. Legal ambiguity alone prevents adoption.
“Will this make my job harder?”
Every new system adds cognitive load. If AI saves 30 seconds per patient but requires 2 minutes of documentation and three extra clicks, it makes the day worse, not better. Systems that succeed integrate seamlessly: minimal training and reduced work, not new tasks.
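The arithmetic behind that claim is worth spelling out. A rough back-of-the-envelope sketch, where the patient volume is an assumption and the per-patient time figures come from the example above:

```python
# Net time cost per shift under the workflow described above.
patients_per_shift = 20            # assumed clinic volume, not from the post
seconds_saved_per_patient = 30     # time the AI saves per patient
seconds_added_per_patient = 120    # extra documentation it requires per patient

net_seconds = (seconds_added_per_patient - seconds_saved_per_patient) * patients_per_shift
print(f"Net extra work per shift: {net_seconds / 60:.0f} minutes")  # 30 minutes lost
```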
“Does it match how clinicians actually work?”
The best clinical AI systems mirror clinical reasoning: they integrate multiple data sources and weight information by clinical relevance, not just statistical correlation. EchoPrime, a cardiac imaging model I wrote about recently, learned to weight echo views the same way cardiologists do, without being explicitly taught to. When the AI’s logic aligns with expert clinical thinking, trust builds naturally. When it feels like a black box producing numbers, adoption stalls.
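As an illustration of what learned view weighting can look like, here is a generic attention-pooling sketch; it is not EchoPrime’s published architecture, and the class, dimensions, and view count are assumptions. The point is that the per-view weights such a model learns can be inspected directly, which is part of what makes its logic feel less like a black box.

```python
# Generic attention pooling over per-view embeddings (illustrative only).
import torch
import torch.nn as nn

class ViewAttentionPooling(nn.Module):
    def __init__(self, embed_dim: int = 512):
        super().__init__()
        self.scorer = nn.Linear(embed_dim, 1)  # learned relevance score per view
        self.head = nn.Linear(embed_dim, 1)    # downstream prediction head

    def forward(self, view_embeddings: torch.Tensor):
        # view_embeddings: (num_views, embed_dim), one embedding per echo view
        weights = torch.softmax(self.scorer(view_embeddings), dim=0)  # (num_views, 1)
        study_embedding = (weights * view_embeddings).sum(dim=0)      # weighted combination
        return self.head(study_embedding), weights.squeeze(-1)

# Hypothetical usage: 8 echo views, 512-dim embeddings from an upstream encoder.
model = ViewAttentionPooling(embed_dim=512)
prediction, view_weights = model(torch.randn(8, 512))
print(view_weights)  # per-view weights are inspectable and comparable to expert practice
```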
The path forward
Some models already work, such as automated coronary calcium scoring and stroke routing protocols, because they answer these questions.
What broader clinical AI requires:
→ Prospective validation showing improved outcomes
→ Clear liability frameworks
→ Seamless workflow integration
→ Transparent failure documentation
→ Designs mirroring clinical reasoning
Understanding that clinicians resist poorly integrated technology, not AI itself, will determine which models make it to practice.”