Can structural causal models automate social science with language models?
Can we use structural causal models to let LLMs both propose and test social hypotheses systematically? This note explores whether formal causal structure can overcome LLM limitations in social simulation.
This work automates the full cycle of social science, from hypothesis generation to testing, with LLMs; the key enabler is the structural causal model (SCM). The SCM is the connective tissue: it provides a language for stating hypotheses, a blueprint for constructing LLM-based agent subjects, an experimental design, and a data-analysis plan. Once fitted, the SCM becomes an object that can be used for prediction or for planning follow-on experiments. Across negotiation, bail-hearing, job-interview, and auction scenarios, the system both proposes and tests causal relationships, finding support for some and not others.
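To make the four roles of the SCM concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than the system described in the source: the `Hypothesis` class, the choice of a buyer's budget as cause and an agreed price as outcome, the `run_subject` stand-in (a noisy linear rule where a real pipeline would call LLM agents), the assumed true effect of 0.5, and the use of a single least-squares slope in place of fitting a full SCM.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    """One hypothesized causal path, stated in SCM terms (hypothetical type)."""
    cause: str       # variable the experiment manipulates
    outcome: str     # variable the experiment measures
    levels: tuple    # treatment values assigned to the cause


def run_subject(cause_value, rng):
    """Stand-in for one simulated interaction between LLM agents.

    ASSUMPTION: a noisy linear rule with true effect 0.5 replaces the
    LLM call so that the sketch runs offline.
    """
    return 2.0 + 0.5 * cause_value + rng.gauss(0.0, 1.0)


def run_experiment(hypothesis, n_per_level, seed=0):
    """Experimental design: assign each treatment level to n subjects."""
    rng = random.Random(seed)
    return [
        (level, run_subject(level, rng))
        for level in hypothesis.levels
        for _ in range(n_per_level)
    ]


def fit_path(rows):
    """Analysis plan: least-squares slope of outcome on cause."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    return sxy / sxx


if __name__ == "__main__":
    h = Hypothesis(cause="buyer_budget", outcome="agreed_price",
                   levels=(5, 10, 20, 40))
    rows = run_experiment(h, n_per_level=200, seed=1)
    print(f"estimated effect of {h.cause} on {h.outcome}: {fit_path(rows):.3f}")
```

The point of the sketch is that one object, the hypothesis, determines what is manipulated, what is measured, and how the data are analyzed, which is the sense in which the SCM ties the steps together.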
Two findings are worth keeping. First, simulation elicits information that direct elicitation does not: having the LLM run the experiment surfaces structure that asking it the question directly misses. Second, there is a precise capability boundary: given its proposed SCM, the LLM predicts the signs of estimated effects well but cannot reliably predict their magnitudes. So in-silico social science is useful for direction, not effect size.
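The sign-versus-magnitude boundary can be made concrete by scoring predictions on two separate metrics. The sketch below is a hedged illustration: the function names are hypothetical, the numbers are invented for the example and are not results from the source, and mean absolute relative error is just one reasonable choice of magnitude metric.

```python
def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def sign_accuracy(predicted, estimated):
    """Fraction of effects whose predicted sign matches the estimated sign."""
    pairs = list(zip(predicted, estimated))
    return sum(sign(p) == sign(e) for p, e in pairs) / len(pairs)


def mean_abs_relative_error(predicted, estimated):
    """Average of |predicted - estimated| / |estimated| (estimates must be nonzero)."""
    pairs = list(zip(predicted, estimated))
    return sum(abs(p - e) / abs(e) for p, e in pairs) / len(pairs)


if __name__ == "__main__":
    # INVENTED numbers for illustration only; not data from the source.
    llm_predicted = [2.0, -0.5, 1.0, -3.0]    # effects the LLM guesses directly
    scm_estimated = [0.4, -0.1, 0.25, -0.6]   # effects estimated from simulation
    print("sign accuracy:", sign_accuracy(llm_predicted, scm_estimated))
    print("relative error:", mean_abs_relative_error(llm_predicted, scm_estimated))
```

In this invented case every predicted sign matches while every magnitude is off by a multiple of the estimate, which is the pattern the note describes: a predictor can be right about direction and still be unusable for effect sizes.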
This is a strong fit for Adrian's social-simulation thread. It responds to the behaviorist critique in "Can language models simulate belief change in people?": SCMs impose explicit causal structure rather than a demographics-in, behavior-out mapping. It also shares the auditable-causal-structure move of "Can we extract causal belief networks from interview conversations?".
Inquiring lines that use this note as a source (7)
This note is a source for these synthesized inquiries. Follow a line forward into its question, or open it to trace back to all of its sources.
- Can aggregate survey realism coexist with unreliable fine-grained effects?
- Why do LLMs reason fluently about causality but lack causal rigor?
- How can extracted causal belief networks enable intervention simulation?
- How does causal structure avoid behaviorist limitations in LLM social simulation?
- Can modular expert decomposition extend beyond time into other causal dimensions?
- What architectural changes would help LLMs distinguish causal relationships from temporal sequences?
- Do LLMs show stronger reasoning about causality than about temporal ordering?
Related concepts in this collection (3)
- Can language models simulate belief change in people?
Current LLM social simulators treat behavior as input-output mappings without modeling internal belief formation or revision. Can they be redesigned to actually track how people think and change their minds?
SCM structure is a response to the behaviorist critique of LLM social simulation
- Can we extract causal belief networks from interview conversations?
Can natural language interviews be systematically parsed into causal graphs that capture how individuals reason about policy trade-offs? This matters for building auditable belief simulations that go beyond static opinion snapshots.
shared auditable-causal-structure approach to simulating social/belief dynamics
- Can separating causal models from language models improve reasoning?
Can an explicit formal causal model paired with an LLM translator overcome both spurious correlation reasoning and reward-without-explanation problems in RL? This explores whether dividing reasoning labor between systems addresses fundamental weaknesses in each.
same keep-causality-in-a-formal-substrate principle, applied to social science
Related papers in this collection (8)
Papers most semantically related to this note, ranked by cosine similarity in the embedding space.
- Automated Social Science: Language Models as Scientist and Subjects
- Simulating Society Requires Simulating Thought
- Causal Reflection with Language Models
- Can Machines Think Like Humans? A Behavioral Evaluation of LLM-Agents in Dictator Games
- Making Reasoning Matter: Measuring and Improving Faithfulness of Chain-of-Thought Reasoning
- Do Large Language Models Reason Causally Like Us? Even Better?
- Is this the real life? Is this just fantasy? The Misleading Success of Simulating Social Interactions With LLMs
- Large Causal Models From Large Language Models
Original note title
structural causal models let LLMs act as both scientist and subject to generate and test social hypotheses in silico