Douglas Flora, Executive Medical Director of Yung Family Cancer Center at St. Elizabeth Healthcare, President-Elect of the Association of Cancer Care Centers, and Editor in Chief of AI in Precision Oncology, shared a post on LinkedIn:
“The Centaur Compact: What Kasparov’s Defeat Taught Us About the Doctor We Need Next
Where is the wisdom we have lost in knowledge? Where is the knowledge we have lost in information?
– T. S. Eliot, ‘The Rock’ (1934)
In 1909, a Louisville schoolmaster named Abraham Flexner set out to inspect the medical schools of the United States and Canada. He carried a notebook, a Carnegie Foundation expense account, and a degree in classics. He had no medical training of his own. That fact has been used against him ever since, and it was probably his greatest asset. Over the eighteen months that followed, he visited 155 institutions, and he found, in many of them, what he later described with dry horror: schools without laboratories, without libraries, without cadavers; ‘diploma mills’ whose proprietors collected tuition and conferred degrees with roughly the ceremony of a hardware sale. He smelled formaldehyde in some places. In others, only kerosene and damp wool. He took notes the way an entomologist takes notes, exactly, dispassionately, and without sentimentality about what he was destroying.
The report he published in 1910 went far beyond recommending reform. It eliminated roughly two-thirds of the country’s medical schools over the next two decades, standardized what remained around basic science, and embedded at the center of American physician training an assumption so deep that we have largely forgotten it is an assumption at all: that to heal, one must first know. To hold the body’s machinery inside the mind, retrievable and precise and complete, before laying hands on a patient. To know anatomy the way a cartographer knows a coastline, pathology the way a detective knows motive. To carry, in a cognitive ledger, the named diseases of the world.
That assumption is now being quietly dismantled by machines.
The evidence is no longer subtle. Large language models recite the causes of acute pancreatitis faster than any second-year resident, reproduce the diagnostic criteria for systemic lupus with perfect fidelity, and synthesize a morning’s worth of laboratory data in the time it takes to pour coffee. Systems that can reason across tasks, call external tools, and act on their conclusions without human prompting at each step are moving out of research demonstrations and into clinical workflows faster than institutions can absorb them. The vocabulary of artificial general intelligence, until recently the property of alignment researchers, has entered the clinical literature. Whether that arrives in five years or in fifteen is genuinely contested, sometimes fiercely; the direction is not. Information itself is becoming nearly free. That condition is the ground we now stand on, and it asks something uncomfortable of medical education. The live question is no longer only what we teach medical students. It is who we admit, and why.
The precedent comes from chess. In 1997, the world champion Garry Kasparov sat down across a board from a machine and lost. The defeat looked, at the time, like an ending. What grew from it was a stranger kind of contest, human and machine playing as one, a pairing that came to be called the centaur.
Two winters ago, I cared for a woman I will call Mrs. K. She was sixty-three when I met her, a retired librarian from a small town in Northern Kentucky, the kind of patient whose appointments arrive with typed questions in a folder. She had gone to her internist with two months of vague upper-abdominal pain and fifteen pounds of weight loss she had attributed to grief; her husband had died that autumn. A CT scan changed the explanation. Within a week, she was sitting across from me, the scan on the screen showing a four-centimeter mass in the head of the pancreas and three lesions in the liver. Adenocarcinoma. Metastatic at presentation. The five-year survival, even now, sits in the single digits.
I remember the details of that appointment vividly. She had brought her daughter. She had brought the folder. She had also brought an expression I have come to recognize and have never quite learned how to meet, the expression of a person who has already done the math and is waiting only to have it confirmed.
I had, at that visit, the full machinery of a modern oncology practice. The electronic record had assembled her history before I opened the door. Her tumor would be sequenced within days, its molecular profile run against the catalog of open trials. And I had begun, by then, to consult a large language model the way one consults a brilliant and unpredictable colleague. In her case, it synthesized the literature and proposed a treatment sequence that I, after my own review, judged to be approximately right.
None of that helped me with the appointment. None of it told me when, in the conversation, to stop talking. None of it told me whether to use the word metastatic in the first ten minutes or the last five, whether to answer the daughter’s question about a granddaughter’s wedding in October before or after I answered Mrs. K.’s question about hospice, whether the most honest thing I could offer her in that first encounter was the truth or the time to absorb the truth. I still cannot resolve the last one.
The algorithm did not know Mrs. K. The algorithm could not, in the most precise and morally significant sense of the word, see her.
The MCAT has always been, at its core, a test of retention and pattern recognition under pressure, a proxy for the cognitive machinery medicine once required in abundance and in many domains still does. When a physician in 1975 needed to recall, unaided and in real time, the differential diagnosis for a hypercalcemic patient in extremis, rapid recall was more than a sign of intelligence. It was a clinical survival skill. The test measured something real. It still does. What it measures is becoming less essential.
The Greek diagnōsis is built from dia, meaning through or between, and gignōskein, to know. To diagnose is to know between: to distinguish this from that, the lupus flare from the medication effect, the pulmonary embolus from the panic attack. For most of the history of medicine, this has been the labor at which physicians were hardest tested and most reverently judged.
That labor is now being absorbed, case by case, by machines that take no pains and feel no fatigue. The electronic record prompts the exhausted intern on a dose. The radiomics algorithm flags the ground-glass opacity that a tired eye would likely miss at midnight. The language model carries the harried hospitalist through a rare-disease presentation he will meet once in a career, if at all. The diagnostic act is being unbundled, its components handed one by one to systems that perform them better than we do, with less drama and without billing.
What the machines cannot do, what no algorithm has yet approximated and some of us suspect no algorithm ever will, is sit in a room with Mrs. K. and know, without calculation, when to speak and when to be still.
As cognitive labor becomes automatable, the comparative advantage of a human clinician will lie in whatever resists automation. Moral reasoning under genuine uncertainty. Attunement to suffering that defies measurement. The capacity to build trust across the particular texture of one person’s life. Eric Cassell, in his book on the nature of suffering, argued that the relief of suffering is medicine’s first responsibility and its most demanding. He meant something exacting. Suffering, he wrote, is something that happens to persons. Bodies feel pain; persons suffer. It cannot be inferred from a laboratory value or pattern-matched from a database. It has to be witnessed.
Bernard Lown, the cardiologist who accepted the Nobel Peace Prize on behalf of International Physicians for the Prevention of Nuclear War, used to tell his students that the word doctor descends from the Latin docere, to teach. The physician’s first instrument, he insisted, was language. The second was silence.
In May of 1997, in a thirty-fifth-floor room of the Equitable Center in Manhattan, Garry Kasparov, perhaps the strongest player the game had produced, resigned the sixth and final game of his match against IBM’s Deep Blue. He sat at the board a long moment, by his own account, then stood, then left. What Kasparov did in the years after that defeat drew far less attention. It would change the game.
A new kind of competition emerged, called Advanced Chess and later Centaur Chess, in which human players were allowed to consult computers in real time during a match. Kasparov championed it himself. The results overturned the obvious assumption about where the advantage lay. Raw computing power mattered less than the human’s skill in using it. The strongest teams were the ones in which the player knew when to trust the machine’s evaluation, when to override it, when the position had grown complex enough that human intuition outran the calculation, and when the machine, however confident, had been seduced by some local feature it lacked the imagination to discount. A skilled amateur with a computer could beat a grandmaster who used one poorly.
The best centaurs, Kasparov later wrote, were the ones who had developed a process. They knew, at each move, which kind of mind, silicon or carbon, was better suited to the question in front of them. And they knew when they did not know.
Medicine is entering its centaur period.
The physician of 2030 will practice at this same interface, one hand on the patient and one eye on the output of systems that will, in many domains, hold more discrete facts than any human could. What will separate the excellent clinician from the adequate one is the quality of judgment: knowing when the algorithm is right and when it is wrong, when the patient in the room belongs to the three percent the training data never represented, and when the right answer sits outside the differential entirely, in a question nobody has thought to ask. That kind of judgment does not show up on the MCAT, and an MCAT score does not predict it. We have built our entire selection system on the premise that it does.
Picture the applicant a medical school should be looking for in the class of 2030. Its members will practice into the 2050s, in a clinical world whose contours are not yet visible from here. Biochemistry and physiology still matter; fluency in them is a floor the machine raises rather than removes. But the qualities that should weigh most heavily in an admission are the ones medicine has long treated as secondary, for the plain reason that they were difficult to measure. They were never secondary. In the medicine that is arriving, they may be the primary thing.
Start with curiosity, and not the performed curiosity of a personal statement built to impress a committee. The real thing is structural, the standing habit of a person who cannot encounter a phenomenon without needing to understand it. The college student who reads the footnote. Who follows the citation into PubMed at eleven at night. Who stays after the lecture to ask the question that surfaced halfway through and will not let go. Lewis Thomas, the physician-essayist who presided over Memorial Sloan Kettering in the 1970s and wrote some of the most luminous essays on biology in the language, described the scientist’s vocation as a permanent vulnerability to the word why, a cognitive itch that no single answer relieves. Curiosity of that kind is observable. It leaves tracks. Admissions committees have mostly chosen not to look for them.
Curiosity that survives contact with difficulty becomes something harder to name. Earnestness, maybe? The willingness to take seriously what deserves to be taken seriously, without irony, without the protective detachment that medical culture has long rewarded and, in some quarters, still demands. Earnestness is unfashionable now in a way it has not been in living memory. In a physician, it remains irreplaceable. The cynic makes a serviceable internist for the first decade and an empty one for the two that follow.
Earnestness describes a disposition. It says little about what happens when the room turns, when a result comes back wrong, or a family’s hope outruns the evidence. That is a separate capacity, and a harder one to name, though every clinician recognizes it at a glance. Confronted with another person’s fear, some people grow more focused, and others grow more defensive, and medical school does not address the difference. It reveals it. The resident standing at three in the morning at the foot of a bed where a patient is dying, with the family watching her face to learn what is happening, either has this steadiness or does not, and everyone in the room will know which within half a minute. William Osler named the quality more than a century ago. In his 1889 farewell address to the medical graduates of the University of Pennsylvania, he called it aequanimitas, the imperturbability that keeps a physician’s judgment clear while the situation around it comes apart, and he thought it the essential medical virtue. Whether our admissions process leaves room for aequanimitas to surface, or screens so heavily for performance and credentials that it is filtered out before anyone sees it, is a real question. I suspect we filter it out.
Steadiness keeps a clinician working in a hard room. It does not, by itself, tell her what the right thing to do there is. For that, she needs ethical seriousness. Not the four principles every applicant has memorized for the interview (autonomy, beneficence, non-maleficence, justice), but the slower and more demanding thing those principles are meant to produce and often do not: the practiced willingness to sit inside a genuinely hard question, resist the pull toward premature resolution, and reason to an answer with clear eyes about its cost. In a clinical environment where algorithmic recommendations arrive with the authority of institutional consensus and are, far more often than not, correct, that willingness will be tested daily. A physician who cannot push back on a confident system, carefully and with evidence, and with a real readiness to be wrong, cannot advocate for the patient that the system failed to anticipate. There is always such a patient. Mrs. K. was one.
None of these qualities is captured by a biochemistry subscore. Some can be partially assessed through structured situational judgment, through sustained observation in clinical and community settings, through careful reading of writing that was never designed to impress. But only if admissions committees decide the assessment is worth its uncertainty, and that decision has not yet been made at scale.
The objection writes itself: a process this consequential cannot afford subjectivity. But medical admissions already run on enormous subjectivity. We have displaced it onto proxies that feel objective, such as standardized test scores, grade-point averages, and research publications, and told ourselves the result is meritocratic. The MCAT identifies people who are good at the MCAT. The research publication identifies people with access to research mentorship, which tracks closely with institutional privilege and family income and only loosely with clinical promise. The published correlation between MCAT score and the downstream measures of clinical excellence we claim to value, patient outcomes, communication and professionalism, and longevity in the field, is modest by every honest measure. The process does not remove bias from admissions; it encodes the bias and then mistakes the encoding for rigor.
The immune system runs a process worth borrowing from. Early in life, the thymus performs what immunologists call negative selection: it exposes each developing T cell to fragments of the body’s own proteins, and the cells that bind too strongly to self, the autoreactive ones that would attack the host if they were ever released, are pruned. What survives is a repertoire calibrated for the real work of immunity. Admissions should work the same way. It should expose candidates to the conditions medicine will actually demand of them and keep the ones who can do that work. We have built an apparatus that selects, instead, for performance on the test of getting in. We are pruning the wrong cells.
Standards matter. The question is which ones predict what we actually care about: physicians who will be trusted, effective, and durable across a thirty-year career, in a field undergoing its deepest reorganization since Flexner. The centaur tournaments showed the failure from the other direction. Strong players lost, grandmasters lost, when they overdeferred to the machine, feeding it a position and playing whatever it returned without interrogating the recommendation, without understanding the logic well enough to notice when it broke. Their games were technically clean and strategically thin. A medicine built around its best memorizers risks the same outcome: clinicians who are technically credentialed and strategically hollow, unequipped for the encounters that matter most.
For those who run training programs in internal medicine, surgery, psychiatry, and primary care, and in every discipline in between, the implication is at once uncomfortable and clarifying. The work ahead for their students has changed shape. Recalling a disease’s diagnostic criteria under pressure is now something the machine does with perfect fidelity and zero fatigue. The harder work is integrating the machine’s certainty with a patient’s ambivalence, navigating the distance between what an algorithm recommends and what a person can bear, holding the clinical courage to depart from a protocol when the patient in the room is not the patient the training data anticipated, and recognizing the patient as a patient: a person whose suffering is not reducible to the disease that occasions it.
Plutarch, in the essay ‘On Listening,’ wrote that the mind is not a vessel to be filled but a fire to be kindled. The line has been quoted to death, including by me. It has, even so, taken on a new precision. The vessel is no longer ours to fill; the machines, with their oceanic memories and their tireless retrieval, will fill it more completely than we ever could. What remains for us is the fire. It may always have been the deeper work, obscured for a century by our pride in the vessel.
Mrs. K. lived another fourteen months. She made it to her granddaughter’s wedding. She did not make it to the next spring. At the last appointment I had with her, she asked me a question I had not been asked before, which was whether I thought she had done a good job of dying. I closed the laptop. I had been finishing a note. I am not sure, even now, that I gave her an adequate answer.”
