SYNTHESIS NOTE

When do graph databases outperform vector embeddings for retrieval?

Vector similarity struggles with aggregate and relational queries that require traversing multiple entity connections. Can graph-oriented databases with deterministic queries solve this failure mode in enterprise domain applications?

Synthesis note · 2026-02-21 · sourced from Domain Specialization

Vector similarity retrieval has a well-known failure mode that becomes critical in enterprise domain applications: aggregate and relational queries generate too many plausible candidates. The GODB paper illustrates this with a business example: "give me the volume of cement or concrete sales lost due to humidity issues in 2023." A cosine similarity search over a database of half a million sales notes will return hundreds of candidate vectors, because every note mentioning humidity, cement, or sales is a plausible match. The standard remedy (take the top-k) does not fit: the answer requires aggregating across all relevant records, not selecting the k most similar ones.
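
To make the top-k problem concrete, here is a minimal sketch on toy data. The similarity scores, relevance flags, and volumes are invented for illustration and do not come from the GODB paper; the function names are hypothetical.

```python
# Hypothetical toy data: (similarity_score, is_humidity_loss_2023, lost_volume_tons).
# All values are invented for illustration only.
notes = [
    (0.91, True, 120.0),
    (0.89, False, 0.0),   # mentions humidity, but no sale was lost
    (0.88, True, 75.0),
    (0.86, True, 40.0),
    (0.85, False, 0.0),   # cement sale lost for an unrelated reason
    (0.84, True, 310.0),
    (0.80, True, 55.0),
]


def top_k_total(records, k):
    """Sum lost volume over only the k most similar notes."""
    ranked = sorted(records, key=lambda r: r[0], reverse=True)
    return sum(vol for _, relevant, vol in ranked[:k] if relevant)


def exhaustive_total(records):
    """Sum lost volume over every note that satisfies the predicate."""
    return sum(vol for _, relevant, vol in records if relevant)


top3 = top_k_total(notes, k=3)
full = exhaustive_total(notes)
print(f"top-3 total: {top3}, exhaustive total: {full}")
```

In this toy set, the top-3 cut includes one irrelevant note and misses three relevant ones, so the aggregate is understated. Raising k only moves the cutoff; it does not guarantee completeness.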

Graph-oriented databases (GODBs) address this by replacing similarity search with graph traversal. Knowledge is stored as entities and labeled relationships, generated by an LLM from the source text. Queries are expressed in a graph query language (Cypher, in the case of Neo4j) that can specify traversal paths precisely: find all records where cement-sales-loss is connected to humidity-cause in 2023, then sum across all matching nodes. Given the graph, the query is deterministic and complete rather than probabilistic and sampled.
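
The traversal described above might look roughly like the following. This is a sketch under stated assumptions: the node labels (`SalesLoss`, `Cause`, `Product`), relationship types (`CAUSED_BY`, `OF_PRODUCT`), and properties are a hypothetical schema, not the one used in the GODB paper. The Cypher text is shown as a string for reference; the Python function evaluates the same pattern over an in-memory graph so the example runs without a database.

```python
# Hypothetical schema: labels, relationship types, and properties are assumptions.
CYPHER_QUERY = """
MATCH (s:SalesLoss)-[:CAUSED_BY]->(c:Cause {name: 'humidity'}),
      (s)-[:OF_PRODUCT]->(p:Product)
WHERE s.year = 2023 AND p.name IN ['cement', 'concrete']
RETURN sum(s.volume) AS lost_volume
"""

# Tiny in-memory stand-in for the graph (invented data).
nodes = {
    "loss1": {"label": "SalesLoss", "year": 2023, "volume": 120.0},
    "loss2": {"label": "SalesLoss", "year": 2023, "volume": 75.0},
    "loss3": {"label": "SalesLoss", "year": 2022, "volume": 60.0},
    "loss4": {"label": "SalesLoss", "year": 2023, "volume": 40.0},
    "loss5": {"label": "SalesLoss", "year": 2023, "volume": 30.0},
    "humidity": {"label": "Cause", "name": "humidity"},
    "pricing": {"label": "Cause", "name": "pricing"},
    "cement": {"label": "Product", "name": "cement"},
    "concrete": {"label": "Product", "name": "concrete"},
    "steel": {"label": "Product", "name": "steel"},
}
edges = [
    ("loss1", "CAUSED_BY", "humidity"), ("loss1", "OF_PRODUCT", "cement"),
    ("loss2", "CAUSED_BY", "humidity"), ("loss2", "OF_PRODUCT", "steel"),
    ("loss3", "CAUSED_BY", "humidity"), ("loss3", "OF_PRODUCT", "cement"),
    ("loss4", "CAUSED_BY", "pricing"), ("loss4", "OF_PRODUCT", "cement"),
    ("loss5", "CAUSED_BY", "humidity"), ("loss5", "OF_PRODUCT", "concrete"),
]


def neighbors(node_id, rel_type):
    """Follow outgoing edges of one relationship type."""
    return [dst for src, rel, dst in edges if src == node_id and rel == rel_type]


def lost_volume(cause, products, year):
    """Sum volume over every SalesLoss node matching the traversal pattern."""
    total = 0.0
    for node_id, props in nodes.items():
        if props["label"] != "SalesLoss" or props["year"] != year:
            continue
        causes = {nodes[n]["name"] for n in neighbors(node_id, "CAUSED_BY")}
        prods = {nodes[n]["name"] for n in neighbors(node_id, "OF_PRODUCT")}
        if cause in causes and prods & set(products):
            total += props["volume"]
    return total


result = lost_volume("humidity", ["cement", "concrete"], 2023)
print(result)
```

Every node that satisfies the pattern contributes to the sum, and rerunning the query on the same graph gives the same answer. That is the sense in which the traversal is deterministic and complete.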

The production architecture: (1) an LLM extracts entities and relationships from domain documents and constructs the knowledge graph; (2) user queries are translated to Cypher expressions by an LLM agent; (3) the graph database executes the traversal and returns structured results; (4) an LLM interprets and synthesizes the results into natural language responses. The LLM's role shifts from primary retrieval to query translation and result interpretation, tasks to which its language capabilities are well suited.
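
The four stages can be sketched as a pipeline with pluggable components. This is an illustrative skeleton, not the paper's implementation: the class name, field names, and the `fake_*` stand-ins are all hypothetical, and in a real system stages 1, 2, and 4 would call an LLM while stage 3 would call a graph database.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)


@dataclass
class GraphRagPipeline:
    extract_triples: Callable[[str], List[Triple]]        # stage 1
    translate_query: Callable[[str], str]                 # stage 2
    run_query: Callable[[str, List[Triple]], List[Dict]]  # stage 3
    summarize: Callable[[str, List[Dict]], str]           # stage 4

    def build_graph(self, documents: List[str]) -> List[Triple]:
        graph: List[Triple] = []
        for doc in documents:
            graph.extend(self.extract_triples(doc))
        return graph

    def answer(self, question: str, graph: List[Triple]) -> str:
        query = self.translate_query(question)
        rows = self.run_query(query, graph)
        return self.summarize(question, rows)


# Stand-ins so the sketch runs without an LLM or a database.
def fake_extract(doc):
    # Expects one "subject|relation|object" triple per line.
    return [tuple(line.split("|")) for line in doc.splitlines() if line.count("|") == 2]


def fake_translate(question):
    return "MATCH (s)-[:CAUSED_BY]->(c {name: 'humidity'}) RETURN s"


def fake_run(query, graph):
    # Ignores the query text and hard-codes the single pattern above.
    return [{"s": s} for s, rel, o in graph if rel == "CAUSED_BY" and o == "humidity"]


def fake_summarize(question, rows):
    return f"{len(rows)} matching record(s): " + ", ".join(r["s"] for r in rows)


pipeline = GraphRagPipeline(fake_extract, fake_translate, fake_run, fake_summarize)
graph = pipeline.build_graph(
    ["loss1|CAUSED_BY|humidity\nloss2|CAUSED_BY|pricing", "loss3|CAUSED_BY|humidity"]
)
answer = pipeline.answer("Which losses were caused by humidity?", graph)
print(answer)
```

Keeping each stage behind its own callable makes it possible to test the query executor with fixed inputs, separately from the LLM-backed stages whose outputs vary.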

The limitation: constructing and maintaining a knowledge graph from domain documents is significantly more expensive than building vector embeddings. The GODB approach pays off when query patterns are relational and the cost of incorrect answers is high, which is the enterprise domain use case. For simple semantic lookup ("find me a document about X"), vector embeddings are faster and cheaper.

The LLM+KG integration landscape: A comprehensive survey identifies three integration paradigms: (1) KG-enhanced LLMs — using KG structure to improve LLM reasoning (entity embeddings, structured pretraining); (2) LLM-augmented KGs — using LLMs for KG construction, completion, and question answering; and (3) Synergized LLM+KG — bidirectional collaboration where each improves the other. The GODB approach falls in paradigm (2); HippoRAG and GraphRAG represent paradigm (3). This taxonomy clarifies that "graph vs vector" is not a binary choice but a design space with distinct integration patterns suited to different query types and domain requirements.

A second, distinct failure mode compounds the relational problem: vector embeddings measure semantic co-occurrence, not task relevance. In the king/queen/ruler example (OpenAI ADA-002), queen scores 92% similarity to king, while ruler scores 83%. Yet for a query about "information about kings in governance," ruler is the more relevant result: kings and rulers are synonyms, while kings and queens are related but play different roles. Embeddings cannot distinguish these cases because they are trained on co-occurrence, not relevance. This failure occurs even on simple single-hop queries, so it arises independently of the aggregate/relational failure that GODB addresses. See "Do vector embeddings actually measure task relevance?"
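
A small sketch shows how a similarity ranking and a task-relevance ranking can disagree. The three-dimensional vectors are invented toy values, not ADA-002 outputs, and the relevance labels are a hypothetical judgment for the governance query.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors, invented for illustration (real embeddings have far more dimensions).
king = (0.90, 0.80, 0.10)
embeddings = {
    "queen": (0.85, 0.75, 0.30),
    "ruler": (0.60, 0.90, 0.50),
}

# Hypothetical task-relevance labels for "information about kings in governance".
task_relevance = {"queen": 0, "ruler": 1}

similarity = {word: cosine(king, vec) for word, vec in embeddings.items()}
by_similarity = sorted(similarity, key=similarity.get, reverse=True)
by_relevance = sorted(task_relevance, key=task_relevance.get, reverse=True)

print("ranked by cosine similarity:", by_similarity)
print("ranked by task relevance:  ", by_relevance)
```

With these toy values the cosine ranking puts queen first while the relevance labels put ruler first, the same kind of inversion as in the example above.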

This connects to "Can organizing knowledge structures beat raw training data volume?": both findings point to structured knowledge organization as a competitive advantage over unstructured volume. In knowledge injection, taxonomy structure improves efficiency. In retrieval, graph structure enables query types that vector search cannot support.

Original note title

graph-oriented databases outperform vector embeddings for domain rag when queries require relational traversal