Why do large language models produce generic responses to vague queries?

When users fail to specify contextual details in prompts, do LLMs collapse multiple training contexts into a single generic response? Understanding this failure mode could improve how we scaffold user-model interaction.

Synthesis note · 2026-05-01 · sourced from Conversation Topics Dialog

Context collapse, as introduced by Meyrowitz and elaborated by danah boyd, describes how electronic media merge previously separate audiences into a single communicative context, forcing speakers to adopt one register that fully suits none of them. Stokely Carmichael's rhetoric for Black audiences became audible to everyone once television and radio broadcast it, so he could no longer address each audience separately and had to choose one register. The same dynamic appears on social media: posts persist, replicate, and reach audiences the speaker never intended.

Kasirzadeh and Gabriel argue that LLM conversation produces a different form of context collapse. The collapse does not come from audience merging, since there is only one user, but from inadequate scaffolding combined with model defaulting. When a user asks for advice on a "work conflict" without specifying their industry, the model cannot infer situational boundaries, so it blends training-data priors from corporate, academic, and gig-economy contexts into a single generic response. The collapse happens between the contexts the model was trained on, not between the user's actual audiences.
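
The blending described above can be pictured with a toy calculation. The sketch below is only an illustration under made-up assumptions: the three context labels come from the example in the text, but the prior weights, the cue words, and the scoring rule are all hypothetical and are not a claim about how any real model works. It shows that a query with no contextual cues leaves the weights at the priors, while a query with cues concentrates weight on one context.

```python
# Toy illustration only: priors, cue words, and the scoring rule are invented.
PRIORS = {"corporate": 0.5, "academic": 0.3, "gig": 0.2}

# Hypothetical cue words that would signal each context.
CUES = {
    "corporate": {"manager", "hr", "quarterly"},
    "academic": {"advisor", "lab", "thesis"},
    "gig": {"platform", "rating", "client"},
}


def context_weights(query: str) -> dict[str, float]:
    """Return normalized weights over contexts for a query.

    Each matching cue word boosts its context's prior; with no cues,
    the result is just the priors, i.e. a blend of every context.
    """
    words = set(query.lower().split())
    scores = {c: PRIORS[c] * (1 + 4 * len(words & CUES[c])) for c in PRIORS}
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}


if __name__ == "__main__":
    print(context_weights("advice on a work conflict"))
    print(context_weights("conflict with my thesis advisor in the lab"))
```

In this toy setup, the vague query is answered from a mixture that no single workplace matches, which is one way to read the "generic response" in the paragraph above.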

This distinction matters because it locates the failure differently. Social-media context collapse is a property of the platform and its visibility settings. LLM context collapse is a property of the user-model interface: the user mistakenly expects the model to have human-like pragmatic capacities for inferring the situation, and the model falls back on a training-data-driven default when that expectation is not met. Mitigations differ accordingly. Social-media remedies focus on audience controls; LLM remedies focus on context verification, query-back protocols, and user-driven scaffolding tools.
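
A query-back protocol of the kind mentioned above might look roughly like the following. This is a minimal sketch, not a method taken from the source: the slot names, the clarifying questions, and the `Turn`, `missing_slots`, and `respond` helpers are all hypothetical. The idea is that the system checks for required context before answering and asks for whatever is missing rather than defaulting.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical context slots for a "work conflict" request; a real system
# would need its own list, probably per task type.
REQUIRED_SLOTS = {
    "industry": "What industry or kind of workplace is this?",
    "role": "What is your role relative to the other person?",
    "goal": "What outcome are you hoping for?",
}


@dataclass
class Turn:
    query: str
    context: dict[str, str] = field(default_factory=dict)


def missing_slots(turn: Turn) -> list[str]:
    """List required slots the user has not filled in (or left blank)."""
    return [s for s in REQUIRED_SLOTS if not turn.context.get(s, "").strip()]


def respond(turn: Turn, answer_fn: Callable[[str, dict[str, str]], str]) -> str:
    """Ask back for missing context; answer only once every slot is filled."""
    missing = missing_slots(turn)
    if missing:
        questions = "\n".join(f"- {REQUIRED_SLOTS[s]}" for s in missing)
        return "Before I answer, could you tell me:\n" + questions
    return answer_fn(turn.query, turn.context)


if __name__ == "__main__":
    # Stand-in for a model call; any function with this shape would do.
    def stub(query: str, ctx: dict[str, str]) -> str:
        return f"Advice for a {ctx['role']} in {ctx['industry']}: ..."

    print(respond(Turn("How do I handle a work conflict?"), stub))
```

Fixed slots are the simplest possible design choice here. They make the check easy to test, but they cannot tell when a slot is irrelevant to a particular question.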

Inquiring lines that read this note 28

This note is a source for these research framings, grouped by the broader line of inquiry each explores.

How do training priors constrain what context information can override?

Why do language models reinforce false assumptions instead of correcting them?

Do language models understand semantics or rely on pattern matching?

Why do language models struggle with implicit discourse relations?

How do fixed pragmatic templates prevent models from understanding context?

What critical LLM failures do standard benchmarks hide?

How should retrieval systems optimize for multi-step reasoning during inference?

When does optimizing for quality undermine the value of diversity?

How does tokenization toward corpus mean affect downstream output diversity?

Can prompting inject entirely new knowledge into language models?

Can prompting strategies overcome LLM biases without model fine-tuning?

Can we predict when a specific prompt will fail on a given question?

How can models identify insufficient information and respond appropriately without guessing?

When should retrieval-augmented systems decide to fetch new information?

Can context windows and RAG actually change what language models generate?

What makes specific clarifying questions more effective than generic ones?

What prevents language models from reliably adopting diverse personas?

Why do language models prefer certain response styles regardless of what the prompt asks?

How does memorization interact with learning and generalization?

Why does training data not function as a searchable corpus?

Why do large language models produce generic responses to vague queries?
