SYNTHESIS NOTE

Can confidence patterns reveal overthinking versus underthinking?

This note explores whether real-time confidence signals can diagnose when a reasoning model is trapped in redundant deliberation versus committing prematurely, and whether steering based on these signals can balance both failure modes.

Synthesis note · 2026-04-01 · sourced from Reasoning by Reflection
When does thinking too much actually hurt reasoning?

Overthinking and underthinking are dual failures, and existing methods that suppress one often induce the other. Suppressing reflective keywords or truncating reasoning length reduces overthinking but causes underthinking, because the model no longer explores enough. Forcing longer chains reduces underthinking but generates redundancy. ReBalance addresses both by treating confidence as a continuous diagnostic signal rather than applying binary interventions.

The diagnostic: Confidence values correlate with reasoning behavior in interpretable ways:

The mechanism: From a small-scale dataset, identify reasoning steps indicating each mode. Aggregate their hidden states into reasoning mode prototypes. Compute a steering vector encoding the transition from overthinking to underthinking. A dynamic control function modulates the vector's strength and direction based on real-time confidence: pruning redundancy during overthinking, promoting exploration during underthinking.
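The steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the ReBalance implementation: the function names, the thresholds `low` and `high`, the linear ramp, and `max_strength` are all hypothetical, and the note does not say which confidence patterns map to which mode. Here it is simply assumed that low confidence signals overthinking and high confidence signals underthinking.

```python
import numpy as np

def mode_prototype(hidden_states: np.ndarray) -> np.ndarray:
    """Aggregate the hidden states of steps labeled with one mode (mean over steps)."""
    return hidden_states.mean(axis=0)

def steering_vector(over_states: np.ndarray, under_states: np.ndarray) -> np.ndarray:
    """Unit vector pointing from the overthinking prototype to the underthinking one."""
    direction = mode_prototype(under_states) - mode_prototype(over_states)
    return direction / np.linalg.norm(direction)

def control_strength(confidence: float, low: float = 0.4, high: float = 0.8,
                     max_strength: float = 4.0) -> float:
    """Hypothetical control function returning a signed steering strength.

    ASSUMPTION: confidence below `low` is read as overthinking and confidence
    above `high` as underthinking. The thresholds and the linear ramp are
    placeholders, not values from the source.
    """
    if confidence < low:
        # Assumed overthinking: push along the vector to prune redundancy.
        return max_strength * (low - confidence) / low
    if confidence > high:
        # Assumed underthinking: push against the vector to promote exploration.
        return -max_strength * (confidence - high) / (1.0 - high)
    return 0.0  # Balanced region: leave the hidden state untouched.

def steer(hidden: np.ndarray, vector: np.ndarray, confidence: float) -> np.ndarray:
    """Shift one hidden state along the steering vector by the controlled strength."""
    return hidden + control_strength(confidence) * vector

# Synthetic stand-ins for hidden states collected from a small labeled dataset.
rng = np.random.default_rng(0)
over_states = rng.normal(loc=-1.0, size=(16, 8))   # steps labeled as overthinking
under_states = rng.normal(loc=1.0, size=(16, 8))   # steps labeled as underthinking
vector = steering_vector(over_states, under_states)

hidden = np.zeros(8)
pruned = steer(hidden, vector, confidence=0.2)    # moves toward the underthinking side
explored = steer(hidden, vector, confidence=0.9)  # moves toward the overthinking side
```

In a real model the shift would be applied to a chosen layer's activations at each generation step. Which layer to steer and how confidence is computed are left open here because the note does not specify them.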

Why it's training-free: The steering vector captures the model's inherent reasoning dynamics — it's extracted from the model's own hidden states, not trained. Because it operates on intrinsic representations, it generalizes across unseen data and tasks (math, QA, coding). This makes it plug-and-play across models from 0.5B to 32B.

Building on "Can we steer reasoning toward brevity without retraining?", ReBalance extends the activation-steering approach from length compression to reasoning quality management. ASC steers between verbose and concise modes; ReBalance steers between overthinking and underthinking, a qualitative distinction rather than a purely quantitative one.

Relative to "Does more thinking time always improve reasoning accuracy?", ReBalance provides the dynamic mechanism that the threshold finding calls for: instead of a fixed cutoff, confidence-based steering continuously adjusts the reasoning trajectory.

Original note title

ReBalance uses confidence as continuous indicator to dynamically steer between overthinking and underthinking — training-free balanced reasoning via hidden state steering vectors