Spencer Dorn, Vice Chair & Professor of Medicine, University of North Carolina at Chapel Hill, shared a post on LinkedIn about an article authored by Paul J. Lukac and colleagues:
“A recently published RCT showed AI scribes saved UCLA physicians just 20 seconds per note – not exactly revolutionary.
And yet, we all know physicians who love their AI scribes. Many say they’re no longer writing notes after dinner. Some say the tools helped keep them from leaving medicine.
So what explains the disconnect?
Well-designed RCTs like this one use intention-to-treat analyses. That means outcomes are averaged across all randomized clinicians, including those who barely used their scribes or didn’t use them at all (and therefore couldn’t benefit).
Because I kept wondering about this, I contacted the study author, who pointed me to the supplementary appendix. As you can see in the figure below, the signal there is clearer: heavier users of Nabla and DAX experienced more time savings, less burnout, lower task load, and less work exhaustion. (Correlation coefficients were in the -0.3 to -0.4 range.)
In other words: those who used the tools benefited. Why else would they keep using them?
This is true of almost any tool.
As a gastroenterologist:
- Give me a dermatoscope → it’ll sit on my desk.
- Give me a fancy stethoscope → I’ll use it occasionally.
- Give me an endoscope that helps me better identify abnormalities → I’ll probably both use it and benefit a lot.
Does this mean we should push AI scribe use? Maybe just enough to let clinicians figure out if, when, and how it helps them.
The bigger point: we shouldn’t expect any tool – AI or otherwise – to benefit everyone equally. And RCT averages, while essential, don’t always tell the whole story.”
Title: Ambient AI Scribes in Clinical Practice: A Randomized Trial
Authors: Paul J. Lukac, William Turner, Sitaram Vangala, Aaron T. Chin, Joshua Khalili, Ya-Chen Tina Shih, Catherine Sarkisian, Eric M. Cheng, John N. Mafi
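The dilution effect Dorn describes is easy to see with a toy calculation: in an intention-to-treat analysis, clinicians who never use the scribe contribute zeros to the arm-level average, pulling it well below the benefit seen among actual users. The sketch below is illustrative only; the numbers (N, USAGE_RATE, SAVINGS_IF_USED) are made up for demonstration and are not taken from the Lukac et al. trial.

```python
import random

random.seed(0)

# Illustrative toy numbers, not data from the trial.
N = 1000                 # clinicians randomized to the AI-scribe arm
USAGE_RATE = 0.4         # assumed fraction who actually use the scribe regularly
SAVINGS_IF_USED = 60.0   # assumed seconds saved per note for regular users

# Simulate per-clinician time savings: non-users save nothing.
uses_scribe = [random.random() < USAGE_RATE for _ in range(N)]
savings = [SAVINGS_IF_USED if used else 0.0 for used in uses_scribe]

# Intention-to-treat style average: everyone randomized counts,
# so non-users dilute the arm-level mean.
itt_mean = sum(savings) / N

# Average among actual users (an "as-treated" style slice).
user_savings = [s for s, used in zip(savings, uses_scribe) if used]
user_mean = sum(user_savings) / len(user_savings)

print(f"ITT-style mean savings:   {itt_mean:.1f} s/note")
print(f"Mean savings among users: {user_mean:.1f} s/note")
```

With these assumed inputs, the ITT-style mean lands around 24 seconds per note even though every actual user saves a full minute, which mirrors the gap between the trial's headline average and the experience of engaged users.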
