Gustavo Monnerat, Deputy Editor at The Lancet, shared a post on LinkedIn:
“Friendly AI may carry a hidden accuracy tradeoff.
A new Nature study tested whether training large language models to respond more warmly affects their reliability.
The researchers fine-tuned five LLMs to produce warmer responses, then evaluated them on factual and safety-relevant tasks:
- Warm models showed 10–30 percentage-point higher error rates across consequential tasks.
- On common falsehoods, warmth fine-tuning increased the rate of incorrect answers by 7.43 percentage points.
- When users stated an incorrect belief or expressed sadness, the warmth-tuned models became more likely to affirm the false belief.
This matters because AI is increasingly used for advice, education, companionship, and health-adjacent support. Warmth and factuality may not be independent design goals.
Caveat: These were controlled experiments on specific models and tasks, not evidence that every friendly chatbot is unreliable.
Should AI evaluation reports include a standard test for uncritical affirmation under user vulnerability before deployment in clinical or educational settings?
Ref: Ibrahim, Nature, 2026.”
Title: Friendlier LLMs tell users what they want to hear – even when it is wrong
Author: Desmond Ong
