Olivier Elemento: A Blueprint for AI Benchmarks in Cancer Research

Aug 17, 2025, 10:22

Olivier Elemento, Director, Englander Institute for Precision Medicine at Weill Cornell Medicine, shared on LinkedIn:

“A Blueprint for AI Benchmarks in Cancer Research

The National Cancer Institute (NCI) recently put out a Request for Information (RFI) on the need for benchmarks to evaluate Artificial Intelligence in cancer research and care. This is a topic of immense importance, and I was pleased to contribute my perspective (see attached).

The core of my response is built on the idea that to truly harness the power of AI in oncology, we need robust, high-quality benchmarks. Beyond simply assessing algorithms, these benchmarks are fundamental to ensuring that AI tools are accurate, reliable, and equitable when applied in real-world scenarios.

I think there are several areas where benchmarks are urgently needed:

Fundamental Discovery: To help generate and validate novel hypotheses about cancer mechanisms and drivers.

Early Detection & Diagnosis: For the automated detection and characterization of lesions in medical imaging and the interpretation of complex molecular data.

Personalized Treatment: To predict patient response to therapies and automate the matching of patients to eligible clinical trials.

Operational Efficiency: For streamlining clinical workflows, such as generating clinical notes and extracting structured information from EHRs.

In my submission, I also outlined a blueprint for what these essential resources should look like:

Quality and Representativeness
The foundation is the data itself. Datasets must be expertly annotated and reflect the full diversity of the patient population to minimize bias. For clinical research, it is important to include comprehensive, multi-modal data—like imaging, genomics, and clinical notes—from the same patients over time. For basic cancer research, data should be derived from more physiologically relevant samples, such as primary tumors or tumor organoids, to better reflect in vivo biology.

Utility and Impact
Benchmarks require clear problem definitions and objective scoring metrics to be useful. They should be designed to tackle high-impact questions that can lead to significantly better patient outcomes or accelerate discovery. Furthermore, they must help us evaluate the transparency and interpretability of AI models so clinicians can understand the “why” behind a conclusion.

Availability and Accessibility
Data ecosystems should be built on federated and secure principles, allowing models to be tested on sensitive data without centralizing it. All benchmark data should adhere to FAIR principles (Findable, Accessible, Interoperable, and Reusable) and be continuously maintained. To ensure unbiased validation, these benchmark datasets must be kept completely separate from the data used for initial model training.

This framework is ambitious but necessary to build a future where AI can be safely and effectively integrated into oncology.”
