Do hierarchical retrieval architectures outperform flat ones on complex queries?

Inquiring lines that read this note (92)

This note is a source for these research framings, grouped by the broader line of inquiry each explores. Scan the bold lines of inquiry; follow any specific question forward.

Can model routing outperform monolithic scaling as an efficiency strategy?

What makes specific clarifying questions more effective than generic ones?

How should iterative research systems allocate reasoning per search step?

Can retrieval improve multi-step reasoning by triggering at each uncertainty?
How does semantic search over research papers guide autonomous architecture proposals?
Do single-step retrieval systems with sophisticated synthesis qualify as deep research?
How do real search queries reveal what counts as a deep research question?
How does query planning as a separate step improve multi-hop retrieval coherence?
Can step-level rewards improve training of agentic retrieval systems?
How do cascaded probabilistic models compare to reinforcement learning for per-query system design?
Can retrieval strategies drive both draft refinement and new research question generation?
Can generator feedback backpropagate through the entire retrieval pipeline?
How does proactive information-gathering capability differ from passive knowledge retrieval?
How does reflection-based query refinement differ from single-pass retrieval strategies?
Do expansion-reflection loops and chain-of-retrieval approaches solve the same problem?
Why do deep research agents outperform retrieval augmented generation systems?
What distinguishes iterative query refinement from pure self-revision loops?
Can stateless multi-step retrieval capture evidence integration as well as dynamic memory?
Why do per-turn reasoning caps improve iterative search quality?

When should retrieval-augmented systems decide to fetch new information?

How should inference compute be adaptively allocated based on prompt difficulty?

How should we allocate compute between reasoning and retrieval iterations?

How should retrieval systems optimize for multi-step reasoning during inference?

Why do semantic similarity and task relevance diverge in vector embeddings?

How do knowledge graphs enable efficient multi-hop reasoning over alternatives?

How should dialogue systems best leverage conversation history for retrieval?

Does decoupling planning from execution improve multi-step reasoning accuracy?

How do knowledge injection methods compare across cost and effectiveness?

How should query augmentation strategies be properly evaluated against baselines?

Can inference-time compute substitute for scaling up model parameters?

How does test-time search budget efficiency benefit from hierarchical architectures?

Can graph structure and relationships fundamentally improve recommendation systems?

Why does Personalized PageRank naturally discover concepts multiple hops from query seeds?

How do neural networks separate factual knowledge from reasoning abilities?

How do retrieval heads interact with layer-level separation of knowledge and reasoning?

Does parallel reasoning outperform sequential thinking under fixed compute budgets?

Can language model RL training avoid reward hacking and misalignment?

Can separating token weighting from query filtering reduce reward hacking?

Does model scaling alone produce compositional generalization without symbolic mechanisms?

How does sequence length affect sparsity tolerance in models?

Can sparse attention methods be designed specifically for multi-hop reasoning tasks?

Does recurrence enable reasoning capabilities that fixed-depth transformers cannot achieve?

How do transformers compare to state-space models on copying and retrieval?

Related concepts in this collection (2)

Does search budget scale like reasoning tokens for answer quality?
Explores whether the test-time scaling law that applies to reasoning tokens also governs search-based retrieval in agentic systems. Understanding this relationship could reshape how we allocate inference compute between thinking and searching.
extends: hierarchical architecture makes the search budget more efficient by reducing interference loss

How do readers track segments, purposes, and salience together?
Can discourse processing actually happen in parallel rather than sequentially? This matters because understanding how readers coordinate multiple layers of meaning at once reveals where AI systems break down in comprehension.
connects: HierSearch solves at system architecture level the same parallel-tracking problem that discourse processing requires at the cognitive level

Related papers in this collection (8)

Papers most semantically related to this note, ranked by cosine similarity in the embedding space.

Chain-of-Retrieval Augmented Generation (0.83 match)
RAG-R1 : Incentivize the Search and Reasoning Capabilities of LLMs through Multi-query Parallelism (0.83 match)
You Don't Need Pre-built Graphs for RAG: Retrieval Augmented Generation with Adaptive Reasoning Structures (0.82 match)
Fact, Fetch, and Reason: A Unified Evaluation of Retrieval-Augmented Generation (0.82 match)
Deep Research: A Systematic Survey (0.81 match)
Towards Agentic RAG with Deep Reasoning: A Survey of RAG-Reasoning Systems in LLMs (0.81 match)
Hybrid LLM: Cost-Efficient and Quality-Aware Query Routing (0.80 match)
GrepSeek: Training Search Agents for Direct Corpus Interaction (0.80 match)
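
The match scores in this list are described as cosine similarities in an embedding space. As a minimal sketch of how such a ranking could be computed, the Python below scores toy vectors against a note vector and sorts them. The function names, the two-dimensional vectors, and the paper labels are all hypothetical illustrations; the embedding model and pipeline actually used by this collection are not stated here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # Convention chosen for this sketch: a zero vector matches nothing.
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(note_vec, paper_vecs):
    """Return (title, score) pairs sorted from most to least similar."""
    scored = [
        (title, cosine_similarity(note_vec, vec))
        for title, vec in paper_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy, made-up embeddings; real embeddings have hundreds of dimensions.
note = [1.0, 0.0]
papers = {
    "paper_a": [1.0, 0.0],
    "paper_b": [0.0, 1.0],
    "paper_c": [1.0, 1.0],
}
ranking = rank_by_similarity(note, papers)
```

Cosine similarity depends only on the direction of the vectors, not their length, which is why a list like the one above can compare papers of very different sizes on one 0 to 1 style scale.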
