Does question type determine the right retrieval strategy?

Explores whether different non-factoid question types require distinct retrieval and decomposition approaches. Matters because standard RAG fails when applied uniformly to debate, comparison, and experience questions despite being effective for factoid queries.

Synthesis note · 2026-02-22 · sourced from RAG

Standard RAG treats all queries as factoid: retrieve relevant documents, extract the answer. This is appropriate when there is a definitive answer. It is inappropriate for non-factoid questions (NFQs) that lack definitive answers and require synthesizing multiple perspectives, balancing competing viewpoints, or integrating personal experience.

Typed-RAG classifies NFQs into five types:

Evidence-based: seeks definitions or characteristics of concepts. Single-aspect; standard retrieve-read works.
Comparison: examines differences/similarities between targets. Multi-aspect; requires keyword extraction per comparison target, parallel retrieval, relevance-weighted aggregation.
Experience: seeks advice from personal experience. Multi-aspect; requires experience-keyword extraction, similarity-based reranking, response aligned to question intent.
Reason/Instruction: explains causes or procedures. Multi-aspect; decomposes into single-aspect sub-queries, individual retrieval and generation per sub-query, aggregation into structured response.
Debate: explores multiple perspectives on a topic. Multi-aspect; extracts discussion topic and opposing opinions, generates per-opinion responses, debate mediator combines into balanced synthesis.
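
The five types above can be read as a dispatch table from question type to pipeline steps. The following is a minimal sketch of that reading, not the Typed-RAG implementation: the names `NFQType`, `Strategy`, `STRATEGIES`, and `plan` are hypothetical, and the step strings only paraphrase the list above. How a question gets its type label is left out, since the note does not specify the classifier.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Tuple


class NFQType(Enum):
    """The five non-factoid question types listed in the note."""
    EVIDENCE_BASED = "evidence-based"
    COMPARISON = "comparison"
    EXPERIENCE = "experience"
    REASON_INSTRUCTION = "reason/instruction"
    DEBATE = "debate"


@dataclass(frozen=True)
class Strategy:
    """Hypothetical record: aspect structure plus ordered pipeline steps."""
    multi_aspect: bool
    steps: Tuple[str, ...]


# Step descriptions paraphrase the note; they are labels, not callable stages.
STRATEGIES: Dict[NFQType, Strategy] = {
    NFQType.EVIDENCE_BASED: Strategy(False, ("retrieve", "read")),
    NFQType.COMPARISON: Strategy(True, (
        "extract keywords per comparison target",
        "retrieve in parallel",
        "aggregate weighted by relevance",
    )),
    NFQType.EXPERIENCE: Strategy(True, (
        "extract experience keywords",
        "rerank by similarity",
        "align response to question intent",
    )),
    NFQType.REASON_INSTRUCTION: Strategy(True, (
        "decompose into single-aspect sub-queries",
        "retrieve and generate per sub-query",
        "aggregate into structured response",
    )),
    NFQType.DEBATE: Strategy(True, (
        "extract topic and opposing opinions",
        "generate response per opinion",
        "mediate into balanced synthesis",
    )),
}


def plan(qtype: NFQType) -> Strategy:
    """Look up the strategy for an already-classified question."""
    return STRATEGIES[qtype]
```

Calling `plan(NFQType.DEBATE).steps` returns the three debate stages in order. Keeping the mapping as data makes it visible that only the evidence-based type is single-aspect.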

The key insight: question type determines whether a question's aspects are contrasting (high contrast, opposing directions, as in debate and comparison) or related (lower contrast, aligned direction, as in experience and reason/instruction). Contrasting aspects require a distinct retrieval per aspect. Related aspects allow one shared retrieval with per-aspect filtering.
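
The difference between the two retrieval patterns can be sketched as follows. This is an illustration under stated assumptions: `retrieve` is a toy word-overlap ranker standing in for a real retriever, and the function names `retrieve_contrasting` and `retrieve_related` are hypothetical, not taken from any source.

```python
from typing import Dict, List, Set


def _tokens(text: str) -> Set[str]:
    """Lowercased whitespace tokens; a stand-in for real text processing."""
    return set(text.lower().split())


def retrieve(query: str, corpus: List[str], k: int = 2) -> List[str]:
    """Toy retriever (assumption): rank documents by word overlap with the query."""
    q = _tokens(query)
    scored = [(len(q & _tokens(doc)), doc) for doc in corpus]
    # Drop documents with no overlap; the stable sort keeps corpus order on ties.
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: -s[0])
    return [doc for _, doc in ranked[:k]]


def retrieve_contrasting(aspects: List[str], corpus: List[str],
                         k: int = 2) -> Dict[str, List[str]]:
    """Contrasting aspects: one independent retrieval per aspect."""
    return {aspect: retrieve(aspect, corpus, k) for aspect in aspects}


def retrieve_related(question: str, aspects: List[str], corpus: List[str],
                     k: int = 4) -> Dict[str, List[str]]:
    """Related aspects: one shared retrieval, then a per-aspect filter."""
    shared = retrieve(question, corpus, k)
    return {
        aspect: [doc for doc in shared if _tokens(aspect) & _tokens(doc)]
        for aspect in aspects
    }
```

In the contrasting case each aspect issues its own query, so evidence for one side cannot crowd out evidence for the other. In the related case the corpus is queried once and the pool is only partitioned afterwards.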

Without type classification, a RAG system applies the same strategy to every query. Evidence-based questions succeed because they fit the standard retrieve-read pattern. The other types fail, not because retrieval is poor, but because the generation architecture does not match the structure of the question.

Researchy Questions adds that real-world non-factoid questions involve "unknown unknowns": the questioner doesn't know what information is missing. Characteristic formats include relationship questions ("how does X affect Y"), causal questions ("why does X happen"), comparative questions (pros/cons), and analytical questions ("to what extent does X lead to Y"). A good non-factoid question "can lead to interesting and in-depth analysis" with a "clear and refutable thesis, supported by evidence and analysis." Its 8-dimension scoring rubric (ambiguity, incompleteness, assumptions, multi-facetedness, knowledge-intensity, subjectivity, reasoning-intensity, harmfulness) can inform question type classification beyond simple topic categories. Source: Arxiv/Agentic Research.
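
One way the eight rubric dimensions could feed a type classifier is as a fixed-order feature vector. The sketch below is an assumption throughout: the class `RubricScores`, its methods, and the numeric scale are hypothetical, and the note does not say how scores are produced or what range they take.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class RubricScores:
    """Hypothetical container for the eight rubric dimensions named in the note.

    The numeric scale is an assumption; the note does not specify one.
    """
    ambiguity: float
    incompleteness: float
    assumptions: float
    multi_facetedness: float
    knowledge_intensity: float
    subjectivity: float
    reasoning_intensity: float
    harmfulness: float

    def as_vector(self) -> List[float]:
        """Scores in declaration order, usable as classifier features."""
        return [getattr(self, f.name) for f in fields(self)]

    def salient(self, threshold: float) -> List[str]:
        """Names of dimensions scoring at or above the threshold."""
        return [f.name for f in fields(self) if getattr(self, f.name) >= threshold]
```

A question scoring high on `subjectivity` and `multi_facetedness`, for example, might be a candidate for the debate type, though that mapping is a guess and not something the source states.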

Inquiring lines that read this note 27

This note is a source for these research framings, grouped by the broader line of inquiry each explores.

How should retrieval systems optimize for multi-step reasoning during inference?

When should retrieval-augmented systems decide to fetch new information?

What makes specific clarifying questions more effective than generic ones?

How should iterative research systems allocate reasoning per search step?

How do knowledge graphs enable efficient multi-hop reasoning over alternatives?

Which knowledge structure types best fit different query types?

How should dialogue systems best leverage conversation history for retrieval?

How do comparison and debate questions differ in their aspect retrieval needs?

Why do semantic similarity and task relevance diverge in vector embeddings?

How well does semantic similarity preserve survey response nuance?

Related concepts in this collection 3

Does medical AI need knowledge or reasoning more? Medical and mathematical domains may require fundamentally different AI training priorities. If medical accuracy depends primarily on factual knowledge while math depends on reasoning quality, should we build and evaluate these systems differently?
question type parallels domain type: different queries have different structural requirements, not just different content requirements
How do readers track segments, purposes, and salience together? Can discourse processing actually happen in parallel rather than sequentially? This matters because understanding how readers coordinate multiple layers of meaning at once reveals where AI systems break down in comprehension.
non-factoid responses require tracking multiple discourse segments (one per aspect) with different purposes (describe vs argue vs advise)
How do logic units preserve procedural coherence better than chunks? Can structured retrieval units with prerequisites, headers, bodies, and linkers maintain step-by-step coherence in how-to answers where fixed-size chunks fail? This matters because procedural questions require sequential logic and conditional branching that chunk-based RAG cannot support.
how-to questions are a specific NFQ type (reason/instruction) requiring procedural coherence: logic units' prerequisite-header-body-linker structure directly provides the sequential coherence that reason/instruction questions demand

Original note title

non-factoid question answering requires question type classification because type determines retrieval and decomposition strategy
