Gustavo Monnerat: 63 Ways to Measure Clinical AI Fairness, but Only One Focuses on Clinical Utility
Gustavo Monnerat/LinkedIn

Gustavo Monnerat, Deputy Editor at The Lancet, shared a post on LinkedIn:

“63 ways to measure clinical AI fairness. Only 1 focused on clinical utility.

Predictive AI is moving into clinics fast, but the field still can’t agree on what a ‘fair’ model even means, and that definition might shape whether these tools narrow health gaps or widen them.

A new Lancet Digital Health scoping review screened 820 records, included 42 studies, and identified 63 distinct fairness metrics for clinical prediction models.

  • Only 19 of the 63 metrics were built for healthcare, and just one (subgroup net benefit) measures whether decisions actually help or harm patients.
  • 48 of 63 metrics depend on model performance and 33 of 63 hinge on an often-arbitrary decision threshold.

Equalizing metrics across groups does not necessarily improve fairness. The authors push for a different goal: a minimum acceptable performance for every group, tied to patient outcomes. Worth noting this is a qualitative review, English-only articles.

What evidence would you require before calling a clinical AI model ‘fair enough’ for practice?”

Title: Critical appraisal of fairness metrics for artificial intelligence-based clinical prediction models: a scoping review

Authors: João Matos, Ben Van Calster, Leo Anthony Celi, Paula Dhiman, Judy Wawira Gichoya, Richard D. Riley, Chris Russell, Sara Khalid, Gary S. Collins
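
For readers unfamiliar with subgroup net benefit, the one clinical-utility metric mentioned in the post, the sketch below may help. It assumes the standard decision-curve definition of net benefit at a decision threshold p_t, namely TP/N − (FP/N) × p_t/(1 − p_t), computed separately within each patient group. The function names, the toy data, and the 0.5 threshold are illustrative assumptions and are not taken from the review.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of acting on every patient with predicted risk >= threshold.

    Assumes the usual decision-curve formula:
    TP/N - (FP/N) * threshold / (1 - threshold).
    """
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    odds = threshold / (1 - threshold)  # weight given to a false positive
    return tp / n - (fp / n) * odds


def subgroup_net_benefit(y_true, y_prob, groups, threshold):
    """Compute net benefit separately within each group label (hypothetical helper)."""
    result = {}
    for g in sorted(set(groups)):
        idx = [i for i, gi in enumerate(groups) if gi == g]
        result[g] = net_benefit(
            [y_true[i] for i in idx],
            [y_prob[i] for i in idx],
            threshold,
        )
    return result


# Toy data, invented purely for illustration.
y_true = [1, 0, 1, 0, 1, 0, 0, 1]
y_prob = [0.9, 0.3, 0.6, 0.1, 0.4, 0.7, 0.2, 0.8]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

result = subgroup_net_benefit(y_true, y_prob, groups, threshold=0.5)
worst_group = min(result, key=result.get)
print(result, "worst-off group:", worst_group)
```

In this toy example the model yields a positive net benefit for group A and none for group B at the chosen threshold. Checking the worst-off group against a minimum acceptable value is one way to read the authors' suggested goal, rather than only comparing whether the groups are equal. Note that the result still depends on the threshold, which the post flags as often arbitrary.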

