Does question type determine the right retrieval strategy?
Explores whether different non-factoid question types require distinct retrieval and decomposition approaches. This matters because standard RAG, although effective for factoid queries, fails when applied uniformly to debate, comparison, and experience questions.
Standard RAG treats all queries as factoid: retrieve relevant documents, extract the answer. This is appropriate when there is a definitive answer. It is inappropriate for non-factoid questions (NFQs) that lack definitive answers and require synthesizing multiple perspectives, balancing competing viewpoints, or integrating personal experience.
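Typed-RAG classifies NFQs by type and handles each type with its own strategy. Treating reason and instruction questions together gives five categories: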
- Evidence-based: seeks definitions or characteristics of concepts. Single-aspect; standard retrieve-read works.
- Comparison: examines differences/similarities between targets. Multi-aspect; requires keyword extraction per comparison target, parallel retrieval, relevance-weighted aggregation.
- Experience: seeks advice from personal experience. Multi-aspect; requires experience-keyword extraction, similarity-based reranking, response aligned to question intent.
- Reason/Instruction: explains causes or procedures. Multi-aspect; decomposes into single-aspect sub-queries, individual retrieval and generation per sub-query, aggregation into structured response.
- Debate: explores multiple perspectives on a topic. Multi-aspect; extracts discussion topic and opposing opinions, generates per-opinion responses, debate mediator combines into balanced synthesis.
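The mapping from type to strategy in the list above can be sketched as a lookup table. This is a minimal illustration, not Typed-RAG's implementation: `NFQType`, `Strategy`, `STRATEGIES`, and `strategy_for` are hypothetical names, the step strings only paraphrase the list, and the classifier that would assign a type to an incoming question is assumed to exist elsewhere.

```python
from dataclasses import dataclass
from enum import Enum


class NFQType(Enum):
    """The five question categories listed in this note."""
    EVIDENCE_BASED = "evidence-based"
    COMPARISON = "comparison"
    EXPERIENCE = "experience"
    REASON_INSTRUCTION = "reason/instruction"
    DEBATE = "debate"


@dataclass(frozen=True)
class Strategy:
    """Hypothetical record: whether the type is multi-aspect, and its steps."""
    multi_aspect: bool
    steps: tuple[str, ...]


# Step labels paraphrase the note; they are descriptions, not real function names.
STRATEGIES: dict[NFQType, Strategy] = {
    NFQType.EVIDENCE_BASED: Strategy(False, ("retrieve", "read")),
    NFQType.COMPARISON: Strategy(True, (
        "extract keywords per comparison target",
        "retrieve in parallel",
        "aggregate weighted by relevance",
    )),
    NFQType.EXPERIENCE: Strategy(True, (
        "extract experience keywords",
        "rerank by similarity",
        "align response to question intent",
    )),
    NFQType.REASON_INSTRUCTION: Strategy(True, (
        "decompose into single-aspect sub-queries",
        "retrieve and generate per sub-query",
        "aggregate into structured response",
    )),
    NFQType.DEBATE: Strategy(True, (
        "extract topic and opposing opinions",
        "generate per-opinion responses",
        "mediate into balanced synthesis",
    )),
}


def strategy_for(qtype: NFQType) -> Strategy:
    """Look up the pipeline outline for an already-classified question."""
    return STRATEGIES[qtype]
```

Keeping the strategies in a table rather than in branching code makes the central claim visible: only the evidence-based row is single-aspect, and every other row needs its own pipeline.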
The key insight: question type determines whether a question's aspects are contrasting (high contrast, opposing directions, as in debate and comparison) or related (lower contrast, aligned direction, as in experience and reason/instruction). Contrasting aspects require distinct retrieval per aspect. Related aspects allow shared retrieval with per-aspect filtering.
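The difference between the two retrieval patterns can be sketched as follows. This is an illustration under stated assumptions: `retrieve_for_aspects` and the `Retriever` type are hypothetical, the retriever is any function from a query string to a list of passages, and the substring match used for per-aspect filtering is a stand-in for whatever relevance filter a real system would use.

```python
from typing import Callable

# Hypothetical retriever signature: query string in, list of passages out.
Retriever = Callable[[str], list[str]]

CONTRASTING_TYPES = {"debate", "comparison"}
RELATED_TYPES = {"experience", "reason/instruction"}


def retrieve_for_aspects(
    question: str,
    aspects: list[str],
    qtype: str,
    retrieve: Retriever,
) -> dict[str, list[str]]:
    """Return passages per aspect, using the pattern the question type calls for."""
    if qtype in CONTRASTING_TYPES:
        # Contrasting aspects: one distinct retrieval call per aspect.
        return {a: retrieve(f"{question} {a}") for a in aspects}
    if qtype in RELATED_TYPES:
        # Related aspects: a single shared retrieval, then filter per aspect.
        # Substring matching is a placeholder filter (an assumption).
        shared = retrieve(question)
        return {a: [p for p in shared if a.lower() in p.lower()] for a in aspects}
    # Single-aspect (evidence-based) questions do not need this function.
    raise ValueError(f"not a multi-aspect question type: {qtype!r}")
```

For a comparison question with two targets the retriever is called twice, while an experience question with two aspects triggers a single call whose results are then split.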
Without type classification, RAG systems apply the same strategy to all queries. Evidence-based questions succeed because they fit standard RAG. The other types fail — not because retrieval is poor but because the generation architecture does not match the question structure.
The Researchy Questions dataset adds that real-world non-factoid questions involve "unknown unknowns": the questioner doesn't know what information is missing. Characteristic formats include relationship questions ("how does X affect Y"), causal questions ("why does X happen"), comparative questions (pros/cons), and analytical questions ("to what extent does X lead to Y"). A good non-factoid question "can lead to interesting and in-depth analysis" with a "clear and refutable thesis, supported by evidence and analysis." Its 8-dimension scoring rubric (ambiguity, incompleteness, assumptions, multi-facetedness, knowledge-intensity, subjectivity, reasoning-intensity, harmfulness) can inform question type classification beyond simple topic categories.
Source: Arxiv/Agentic Research.
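One way to make the eight rubric dimensions usable as classifier input is to hold them in a fixed-order record. The sketch below is only a data structure: `RubricScores` and `as_vector` are hypothetical names, and the numeric scale of the scores is an assumption, since this note does not state how the rubric is scored.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class RubricScores:
    """One score per rubric dimension named in this note.

    The scale is an assumption (for example 0.0 to 1.0); the note does not
    specify it.
    """
    ambiguity: float
    incompleteness: float
    assumptions: float
    multi_facetedness: float
    knowledge_intensity: float
    subjectivity: float
    reasoning_intensity: float
    harmfulness: float

    def as_vector(self) -> list[float]:
        """Return the scores in declaration order, e.g. as classifier features."""
        return [getattr(self, f.name) for f in fields(self)]
```

A question type classifier could then take this vector alongside the question text, so that type prediction draws on structural properties of the question and not only its topic.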
Inquiring lines that use this note as a source (26)
This note is a source for these synthesized inquiries.
- Why does long-form generation need different retrieval than factoid questions?
- How does uncertainty-gated retrieval compare to continuous retrieval efficiency?
- Why does retrieval quality sometimes conflict with final answer quality?
- What makes some clarifying questions more useful than others?
- Should retrieval be triggered always or only for difficult questions?
- How do real search queries reveal what counts as a deep research question?
- What makes web retrieval more effective than static knowledge bases?
- Which knowledge structure types best fit different query types?
- Can attribute-specific preference optimization improve question quality in information-seeking?
- Why does standard RAG succeed for evidence-based but fail for debate questions?
- What distinguishes contrasting aspects from related aspects in question structure?
- How do comparison and debate questions differ in their aspect retrieval needs?
- Can the eight-dimension rubric predict which question types need decomposition?
- Can question quality be trained separately from the decision to ask?
- What makes specific-facet questions outperform generic need-rephrasing requests?
- Why does single-round retrieval fail on multi-step tasks across different domains?
- When do queries fail to capture relevance patterns effectively?
- Why do question types determine retrieval and decomposition strategy in QA?
- Can retrieval strategies drive both draft refinement and new research question generation?
- Why do specific clarifying questions outperform rephrased versions of user needs?
- What makes a clarifying question aligned with user interests versus structurally sound?
- How does reflection-based query refinement differ from single-pass retrieval strategies?
- Can smaller scheme inventories or critical questions replace direct scheme classification?
- How well does semantic similarity preserve survey response nuance?
- Why do external feature triggers outperform uncertainty on complex questions?
- Which types of clarifying questions actually help users versus wasting their time?
Related concepts in this collection (3)
- Does medical AI need knowledge or reasoning more?
  Medical and mathematical domains may require fundamentally different AI training priorities. If medical accuracy depends primarily on factual knowledge while math depends on reasoning quality, should we build and evaluate these systems differently?
  question type parallels domain type: different queries have different structural requirements, not just different content requirements
- How do readers track segments, purposes, and salience together?
  Can discourse processing actually happen in parallel rather than sequentially? This matters because understanding how readers coordinate multiple layers of meaning at once reveals where AI systems break down in comprehension.
  non-factoid responses require tracking multiple discourse segments (one per aspect) with different purposes (describe vs argue vs advise)
- How do logic units preserve procedural coherence better than chunks?
  Can structured retrieval units with prerequisites, headers, bodies, and linkers maintain step-by-step coherence in how-to answers where fixed-size chunks fail? This matters because procedural questions require sequential logic and conditional branching that chunk-based RAG cannot support.
  how-to questions are a specific NFQ type (reason/instruction) requiring procedural coherence: logic units' prerequisite-header-body-linker structure directly provides the sequential coherence that reason/instruction questions demand
Related papers in this collection (8)
Papers most semantically related to this note, ranked by cosine similarity in the embedding space.
- Typed-RAG: Type-aware Multi-Aspect Decomposition for Non-Factoid Question Answering
- A Non-Factoid Question-Answering Taxonomy
- Researchy Questions: A Dataset of Multi-Perspective, Decompositional Questions for LLM Web Agents
- Divide-or-Conquer? Which Part Should You Distill Your LLM?
- Deep Research: A Systematic Survey
- SymAgent: A Neural-Symbolic Self-Learning Agent Framework for Complex Reasoning over Knowledge Graphs
- LongRAG: A Dual-Perspective Retrieval-Augmented Generation Paradigm for Long-Context Question Answering
- Searching for Best Practices in Retrieval-Augmented Generation
Original note title
non-factoid question answering requires question type classification because type determines retrieval and decomposition strategy