What interaction patterns preserve human learning when AI provides domain answers?
This reads the question as: when AI hands you a finished answer, which conversational moves keep the human thinking and learning instead of passively absorbing — and the corpus answers it sideways, mostly by mapping how interaction goes wrong.
The underlying question is what keeps a human cognitively active when an AI is doing the domain work for them. The collection doesn't have a paper titled "how to preserve learning," but it circles the territory from two directions: the failure modes that erode the learner, and the interaction designs that pull the human back into the loop. Read together, they suggest a single principle: learning survives when the exchange stays two-sided. The risk isn't the answer itself; it's the answer arriving so smoothly that the human stops doing any of the work. The note "Why do people trust AI outputs they shouldn't?" names the slide precisely: map-territory confusion, mistaking fluent output for reasoning, and confirmation bias don't just coexist. They compound, multiplying into "epistemic drift" where the user quietly outsources judgment.
The most striking framing comes from "Does AI generate genuine utterances or just text patterns?", which argues the human is already doing more than they realize. AI output is "event-residue": text carrying the markers of communication but missing the actual event of someone meaning it. The user supplies the missing orientation through interpretive labor, building a one-sided pseudo-exchange. The hopeful reading: that interpretive labor *is* learning. Interaction patterns preserve learning when they keep demanding it, and erode learning when they do the interpreting for you. "Do humans and LLMs differ fundamentally or just superficially?" sharpens this: inside a shared discourse, human and model draw on the same symbolic substrate, so the human's active participation in that discourse is what makes it real on their side.
The concrete answer the corpus offers is counterintuitive: the patterns that preserve learning are the ones where the AI asks rather than only tells. "Why do language models respond passively instead of asking clarifying questions?" shows why most systems fail at this: standard RLHF optimizes for immediate helpfulness, which trains models to dump an answer rather than ask a clarifying question, killing the multi-turn collaboration where a human actually reasons alongside the machine. "When should AI agents ask users instead of just searching?" borrows from conversation analysis to formalize *when* an agent should probe the user instead of silently chaining to a tool, and "Can models learn to ask genuinely useful clarifying questions?" shows that training on question quality decomposed into clarity, relevance, and specificity outperforms single-score training, most clearly in clinical reasoning. A plausible reason this matters for learning is that a good clarifying question forces the human to articulate, which is itself a learning act.
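The gap between optimizing for immediate helpfulness and optimizing for the whole conversation can be pictured with a toy comparison. This is a minimal sketch and not the training method of any cited paper: the `Candidate` class, both scoring functions, and every number are hypothetical, chosen only to show how a clarifying question can lose on a single-turn score and still win once simulated continuations are averaged in.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Candidate:
    """A possible next AI turn with made-up reward estimates (assumption)."""
    text: str
    immediate_reward: float       # how helpful the turn looks in isolation
    rollout_rewards: list[float]  # end-of-dialogue scores from simulated futures


def single_turn_score(c: Candidate) -> float:
    # Single-turn objective: judge the reply on its own.
    return c.immediate_reward


def multi_turn_score(c: Candidate) -> float:
    # Multi-turn-aware objective: average outcome over simulated continuations.
    return mean(c.rollout_rewards)


candidates = [
    Candidate("Here is the full answer.", 0.9, [0.5, 0.6, 0.4]),
    Candidate("Which constraint matters most to you?", 0.3, [0.8, 0.9, 0.7]),
]

best_now = max(candidates, key=single_turn_score)
best_later = max(candidates, key=multi_turn_score)

print("single-turn pick:", best_now.text)
print("multi-turn pick: ", best_later.text)
```

With these illustrative numbers the single-turn objective prefers dumping the answer, while the multi-turn objective prefers the clarifying question. Which one wins in practice depends entirely on how the long-term value is estimated.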
There's a genuine tension worth sitting with. "Could proactive dialogue make conversations dramatically more efficient?" celebrates AI that volunteers information without being asked, which in simulations cut dialogue turns by 60% in medium-complexity domains. Efficiency and learning pull in opposite directions here: fewer turns mean less of the back-and-forth that keeps the human engaged. The reconciliation is that *which* turns get removed matters: eliminating friction is good, but eliminating the moments where the human has to think quietly trades away learning for speed. "Why don't conversational AI systems mirror their users' word choices?" adds a subtler channel: human dialogue partners gradually adopt each other's vocabulary, building shared convention. A system that entrains *toward* the learner's language keeps them anchored in their own understanding rather than swapping it for the model's framing.
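The point that it matters which turns disappear can be made concrete with a small accounting sketch. Everything here is an assumption for illustration: the `Turn` class, the `user_reasons` label, and the 6-to-4 split of turns, which was picked only so that the arithmetic lands on a 60% reduction. No cited note supplies this breakdown.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    text: str
    user_reasons: bool  # hypothetical label: does the human have to think here?


def proactive_shortcut(dialogue: list[Turn]) -> list[Turn]:
    """Keep only the turns in which the human has to reason."""
    return [t for t in dialogue if t.user_reasons]


# Invented 10-turn dialogue: 6 pure-friction turns, 4 turns of user reasoning.
dialogue = (
    [Turn(f"logistics exchange {i}", False) for i in range(6)]
    + [Turn(f"user explains step {i}", True) for i in range(4)]
)

shortened = proactive_shortcut(dialogue)
turns_cut_pct = 100 * (len(dialogue) - len(shortened)) // len(dialogue)
reasoning_kept = sum(t.user_reasons for t in shortened)

print(f"turns cut: {turns_cut_pct}%, reasoning turns kept: {reasoning_kept}")
```

The same percentage cut could instead remove the four reasoning turns plus two friction turns, which is the case the paragraph above warns about. The headline efficiency number alone does not distinguish the two.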
One boundary the corpus draws is worth taking with you: there's a hard limit to what any answer can transfer. "Can prompt optimization teach models knowledge they lack?" shows that prompting only reorganizes knowledge already present in the model, and by analogy the same ceiling applies to the human. An AI answer can activate what you already half-know, but the deep internalization of a domain seems to require the learner to do the structuring work themselves. "Can reinforcement learning embed domain knowledge more effectively than supervised fine-tuning?" describes the model-side version, where rewarding explanation quality builds coherent internal structure rather than memorized tokens. The thread that ties it all together: AI preserves human learning not when its answers are better, but when the interaction keeps the human explaining, articulating, and orienting, the same things that, in the model-training papers, actually build durable knowledge.
Sources (10 notes)
Rose-Frame identifies map-territory confusion, intuition-reason conflation, and confirmation-bias reinforcement as traps that multiply their distorting effects when they co-occur. Evidence from cross-linguistic overreliance and architectural transformer biases confirms the compounding mechanism operates universally.
AI output carries communicative markers inherited from training data but lacks the event structure that produces actual utterances. Users supply the missing orientation through interpretive labor, creating a pseudo-event with structure only on the human side.
Applied Habermas's observer/participant distinction to AI: from outside, humans and LLMs are utterly different; from within shared discourse, both draw on the same symbolic substrate, making the difference structural rather than absolute.
CollabLLM demonstrates that standard RLHF training optimizes for immediate helpfulness, discouraging models from asking clarifying questions or offering multi-turn insights. Multi-turn-aware rewards that estimate long-term interaction value enable active intent discovery and genuine collaboration.
Tool-enabled LLMs drift from user intent through silent tool chaining. Conversation analysis reveals insert-expansions—clarifying intent, scoping responses, enhancing appeal—as a formal framework for proactive user consultation that prevents misunderstanding instead of recovering from it.
The ALFA framework breaks down question quality into theory-grounded attributes (clarity, relevance, specificity) and trains models on 80K attribute-specific preference pairs. Attribute-specific optimization outperforms single-score training, especially in clinical reasoning where asking the right clarifying question directly impacts decision quality.
Simulations show proactivity—providing relevant information without being asked—cuts dialogue turns by 60% in medium-complexity domains. This behavior mirrors human conversation and Grice's maxims but is almost entirely absent from AI datasets and research benchmarks.
Response generation models fail to adapt vocabulary toward users' lexical choices, a phenomenon central to human rapport and clarity. Post-training via DPO on coreference-identified preferences can teach models in-context convention formation.
Prompting works entirely within a model's pre-existing training distribution and cannot supply domain knowledge absent from training data. This creates a hard ceiling: no prompt strategy can compensate for missing foundational knowledge, only reorganize what already exists.
RLAG rewards both answer accuracy and explanation rationality by cycling between augmented and unaugmented generation, progressively internalizing coherent knowledge structures. This outperforms SFT because it prioritizes reasoning quality over token-level correctness.