Roupen Odabashian, Oncologist at Abbotsford Regional Hospital and Cancer Centre, Founder at MeDucation AI, Podcast Host at OncoDaily, shared a post on LinkedIn:
“The benchmark reported 85% diagnostic accuracy. A review of more than 40 peer-reviewed studies found real-world performance to be 52%.
That gap tells an important story about clinical artificial intelligence, and it is easy to overlook if you have never worked in a clinical setting.
Benchmarks evaluate AI using clean, complete, well-labeled cases – the medical equivalent of textbook questions.
Real-world patients are different. They often present with multiple concurrent conditions, complex medication lists, evolving histories, and incomplete medical records. A model that performs exceptionally well on benchmark datasets may perform substantially worse when faced with the complexity of everyday clinical practice.
This is not an argument against clinical AI. It is an argument for evaluating AI where it is actually used, rather than where it performs best.
Every reported performance metric should be accompanied by an important question: accurate on what – the benchmark or the realities of clinical practice?
Until vendors consistently report real-world clinical performance alongside benchmark results, benchmark accuracy should be viewed as an upper limit rather than the level of performance clinicians can expect in routine practice.”
