Do language models make rational strategic decisions in games?
Explores whether LLMs consistently apply game-theoretic reasoning to reach optimal strategies, and whether their performance holds as games become more complex. Understanding this matters for deploying LLMs in negotiation and competitive settings.
Strategic decision-making, meaning choosing the action that maximizes expected utility given others' likely choices, is a demanding test of LLM rationality. Evaluating several frontier LLMs across complete-information games (Prisoner's Dilemma, Stag Hunt, etc.) and incomplete-information games (Deal-No-Deal), this work finds that LLMs frequently deviate from rational strategies and that the deviation grows with game complexity (larger payoff matrices, deeper sequential trees). The fix is procedural: game-theoretic workflows that guide the model's reasoning and decision-making toward computing Nash equilibria. With the workflow, LLMs identify optimal strategies far more reliably, reach near-optimal negotiation allocations, and become less exploitable.
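To make "computing Nash equilibria" concrete, here is a minimal sketch of the kind of check such a workflow might walk a model through for a small complete-information game: enumerate every action pair and keep those where neither player gains by deviating alone. This is not the workflow from the paper. The function name `pure_nash_equilibria` and the payoff numbers are illustrative assumptions, and the sketch covers only pure strategies in two-player matrix games.

```python
from itertools import product

def pure_nash_equilibria(payoffs):
    """Return all pure-strategy Nash equilibria of a two-player matrix game.

    payoffs maps (row_action, col_action) -> (row_payoff, col_payoff).
    Hypothetical helper for illustration; not taken from the source note.
    """
    rows = sorted({r for r, _ in payoffs})
    cols = sorted({c for _, c in payoffs})
    equilibria = []
    for r, c in product(rows, cols):
        row_u, col_u = payoffs[(r, c)]
        # Row player cannot do better by switching rows while the column stays fixed.
        row_best = all(payoffs[(r2, c)][0] <= row_u for r2 in rows)
        # Column player cannot do better by switching columns while the row stays fixed.
        col_best = all(payoffs[(r, c2)][1] <= col_u for c2 in cols)
        if row_best and col_best:
            equilibria.append((r, c))
    return equilibria

# Assumed textbook-style payoffs, chosen only for illustration.
prisoners_dilemma = {
    ("C", "C"): (3, 3), ("C", "D"): (0, 5),
    ("D", "C"): (5, 0), ("D", "D"): (1, 1),
}
stag_hunt = {
    ("Stag", "Stag"): (4, 4), ("Stag", "Hare"): (0, 3),
    ("Hare", "Stag"): (3, 0), ("Hare", "Hare"): (3, 3),
}

print(pure_nash_equilibria(prisoners_dilemma))  # [('D', 'D')]
print(pure_nash_equilibria(stag_hunt))          # [('Hare', 'Hare'), ('Stag', 'Stag')]
```

With these payoffs the Prisoner's Dilemma has a single equilibrium at mutual defection, while Stag Hunt has two, so the second game also poses an equilibrium-selection problem. The enumeration grows with the product of the action counts, which is one concrete sense in which larger payoff matrices make the reasoning harder.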
The takeaway follows a pattern seen elsewhere: raw LLM rationality is unreliable and degrades with complexity, but an external reasoning scaffold that imposes the formal structure (here, game-theoretic computation) recovers it. The capability is latent but needs the workflow to be reliably elicited.
This connects the vault's strategic-reasoning and workflow-scaffolding threads. It complements Do large language models use one reasoning style or many? (rationality isn't a uniform capability) and Why do standard dialogue systems fail at tracking negotiation agreement?, and the workflow-restores-capability pattern echoes Can LLMs actually forecast time series better than we think?.
Related concepts in this collection 3
- Do large language models use one reasoning style or many?
  Explores whether LLMs share a universal strategic reasoning approach or develop distinct styles tailored to specific game types. Understanding this matters for predicting model behavior in competitive versus cooperative scenarios.
  both find strategic rationality is not a uniform general capability
- Why do standard dialogue systems fail at tracking negotiation agreement?
  Standard dialogue state tracking monitors one user's goals, but negotiation requires tracking both parties' evolving positions simultaneously. Why is this bilateral requirement fundamentally different, and what makes existing models insufficient?
  the negotiation-state-tracking demand these workflows must satisfy
- Can LLMs actually forecast time series better than we think?
  Explores whether language models possess stronger forecasting ability than current benchmarks suggest, and what role workflow design plays in revealing or hiding that capability.
  same workflow-restores-latent-capability pattern
Related papers in this collection 8
Papers most semantically related to this note, ranked by cosine similarity in the embedding space.
- Game-theoretic LLM: Agent Workflow for Negotiation Games
- Strategic Reasoning with Language Models
- LLM Strategic Reasoning: Agentic Study through Behavioral Game Theory
- Can Large Language Models Develop Strategic Reasoning? Post-training Insights from Learning Chess
- SDPO: Segment-Level Direct Preference Optimization for Social Agents
- Reinforced Language Models for Sequential Decision Making
- LLMs are Greedy Agents: Effects of RL Fine-tuning on Decision-Making Abilities
- The Decrypto Benchmark for Multi-Agent Reasoning and Theory of Mind
Original note title
LLMs deviate from rational game-theoretic strategies as complexity grows but structured workflows restore near-Nash rationality