SYNTHESIS NOTE
Training, RL, and Test-Time Scaling · Reasoning, Retrieval, and Evaluation · Psychology, Society, and Alignment

Can models express calibrated confidence in long-form text?

Can language models be trained to emit extended passages with confidence statements that actually help readers make accurate probabilistic predictions? This matters because confident hallucinations mislead users into bad decisions.

Synthesis note · 2026-06-03 · sourced from Reinforcement Learning

Confident hallucinations lead users to confidently bad decisions, and existing models can't emit long-form text with calibrated confidence. This work defines linguistic calibration through the lens of decision-making: an LM is linguistically calibrated if its generations enable users to make calibrated probabilistic predictions about the world. That definition yields a clean two-step training framework. An SFT step bootstraps the model to emit long-form text with confidence statements ("I estimate a 30% chance of..."; "I am certain that..."), and an RL step rewards generations that let a user provide calibrated answers to related questions. The resulting Llama-2-7B is significantly better calibrated than strong finetuned factuality baselines at comparable accuracy, and the improvement generalizes under domain shift (scientific, biomedical, and held-out biography generation).
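
To make the RL step concrete, here is a minimal sketch of a user-decision-grounded reward. It assumes the reward is a log scoring rule applied to a reader's forecast on a related question; the note does not say which scoring rule the work uses, and the function name, the `eps` floor, and the example forecasts are all hypothetical.

```python
import math
from typing import Mapping


def log_score_reward(forecast: Mapping[str, float], correct: str, eps: float = 1e-6) -> float:
    """Log probability the reader assigned to the correct answer.

    Assumption: a log scoring rule stands in for the reward; `eps` is a
    hypothetical floor that keeps the reward finite when the reader put
    zero mass on the truth.
    """
    total = sum(forecast.values())
    if total <= 0:
        raise ValueError("forecast must carry positive probability mass")
    p_correct = forecast.get(correct, 0.0) / total  # normalise, then look up the truth
    return math.log(max(p_correct, eps))


# Hypothetical reader forecasts formed after reading two different generations.
hedged = {"1912": 0.7, "1913": 0.3}          # text said "I estimate a 70% chance..."
overconfident = {"1912": 0.0, "1913": 1.0}   # text said "I am certain that..." and was wrong

reward_hedged = log_score_reward(hedged, "1912")
reward_overconfident = log_score_reward(overconfident, "1912")
```

Under a proper scoring rule like this one, a generation that hedges honestly earns more reward than one that states a wrong answer with certainty, which is the pressure the RL step relies on.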

The keeper is the decision-theoretic definition: calibration is not a property of token probabilities but of whether a reader ends up calibrated — which makes confidence statements first-class, trainable content rather than a post-hoc number.
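
Reader-side calibration can be checked with a standard binned expected calibration error over the reader's forecasts. The sketch below is one common way to measure it, not necessarily the metric the work reports; the function name, bin count, and sample data are assumptions for illustration.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float], outcomes: Sequence[int], n_bins: int = 10
) -> float:
    """Weighted mean gap between stated confidence and observed accuracy per bin."""
    if len(confidences) != len(outcomes):
        raise ValueError("confidences and outcomes must have the same length")
    if not confidences:
        raise ValueError("need at least one forecast")

    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, outcomes):
        idx = min(int(conf * n_bins), n_bins - 1)  # a confidence of 1.0 goes in the last bin
        bins[idx].append((conf, hit))

    n = len(confidences)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        avg_conf = sum(c for c, _ in bucket) / len(bucket)
        accuracy = sum(h for _, h in bucket) / len(bucket)
        ece += (len(bucket) / n) * abs(avg_conf - accuracy)
    return ece


# Hypothetical reader forecasts: each reader answers at 90% confidence.
calibrated_reader = expected_calibration_error([0.9] * 10, [1] * 9 + [0])
misled_reader = expected_calibration_error([0.9] * 10, [1] * 5 + [0] * 5)
```

A reader who is right nine times out of ten at 90% confidence scores near zero, while a reader who is right only half the time at the same confidence scores about 0.4.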

This operationalizes, for long-form generation, the metacognitive third path the vault already names. Following the note "Can models express uncertainty instead of just answering?", linguistic calibration is the training method that produces faithful uncertainty at paragraph scale, and it complements the note "Can model confidence work as a reward signal for reasoning?" (confidence-as-reward) with a user-decision-grounded reward.

Original note title

linguistic calibration trains long-form generation with verbal confidence statements that let users make calibrated predictions