Automated Social Science: Language Models as Scientist and Subjects

Paper · arXiv 2404.11794 · Published April 17, 2024

We present an approach for automatically generating and testing, in silico, social scientific hypotheses. This automation is made possible by recent advances in large language models (LLMs), but the key feature of the approach is the use of structural causal models. Structural causal models provide a language to state hypotheses, a blueprint for constructing LLM-based agents, an experimental design, and a plan for data analysis. The fitted structural causal model becomes an object available for prediction or the planning of follow-on experiments. We demonstrate the approach with several scenarios: a negotiation, a bail hearing, a job interview, and an auction. In each case, causal relationships are both proposed and tested by the system, finding evidence for some and not others. We provide evidence that the insights from these simulations of social interactions are not available to the LLM purely through direct elicitation. When given its proposed structural causal model for each scenario, the LLM is good at predicting the signs of estimated effects, but it cannot reliably predict the magnitudes of those estimates.
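
To make the role of the structural causal model concrete, the following is a minimal sketch, assuming a linear model with two exogenous causes for a negotiation scenario. The variable names (`buyer_budget`, `seller_attachment`, `deal_price`), the coefficient values, and the `simulate_outcome` function are hypothetical illustrations and are not taken from the paper; in particular, `simulate_outcome` is a noisy linear rule standing in for a simulation run with LLM-based agents. The sketch shows how one model specification can serve as the hypothesis, the experimental design (which causes to randomize), and the analysis plan (which regression to fit).

```python
import numpy as np

# Hypothetical linear SCM for a negotiation (illustrative assumption only):
#   deal_price = b0 + b1 * buyer_budget + b2 * seller_attachment + noise
TRUE_COEFFS = {"intercept": 5.0, "buyer_budget": 0.8, "seller_attachment": 0.3}


def simulate_outcome(buyer_budget, seller_attachment, rng):
    """Stand-in for an LLM-agent simulation: a noisy linear rule."""
    noise = rng.normal(0.0, 1.0, size=buyer_budget.shape)
    return (
        TRUE_COEFFS["intercept"]
        + TRUE_COEFFS["buyer_budget"] * buyer_budget
        + TRUE_COEFFS["seller_attachment"] * seller_attachment
        + noise
    )


def run_experiment(n_runs=2000, seed=0):
    """Randomize the exogenous causes, simulate, and fit the SCM by OLS."""
    rng = np.random.default_rng(seed)

    # Experimental design: draw each cause independently from a fixed grid.
    buyer_budget = rng.choice([5.0, 10.0, 20.0, 40.0], size=n_runs)
    seller_attachment = rng.choice([0.0, 1.0, 2.0], size=n_runs)

    outcome = simulate_outcome(buyer_budget, seller_attachment, rng)

    # Analysis plan: ordinary least squares on the design matrix [1, x1, x2].
    design = np.column_stack([np.ones(n_runs), buyer_budget, seller_attachment])
    coeffs, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    return {name: float(value) for name, value in zip(TRUE_COEFFS, coeffs)}


estimates = run_experiment()
for name, value in estimates.items():
    print(f"{name}: {value:.3f}")
```

Because the causes are randomized independently, the fitted coefficients can be read as estimated causal effects within this toy setup. Comparing the sign and the size of each estimate against a guess made in advance mirrors, in a very simplified way, the distinction the abstract draws between predicting signs and predicting magnitudes.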

Introduction. There is much work on efficiently estimating econometric models of human behavior but comparatively little work on efficiently generating and testing the models to be estimated. Previously, developing such models and hypotheses to test was exclusively a human task. This is changing as researchers have begun to explore automated hypothesis generation through the use of machine learning. But even with novel machine-generated hypotheses, there is still the problem of testing. A potential solution is simulation. Researchers have shown that large language models (LLMs) can simulate humans as experimental subjects with surprising degrees of realism. To the extent that these simulation results carry over to human subjects in out-of-sample tasks, they provide another option for testing (Horton, 2023). In this paper, we combine these two ideas, automated hypothesis generation and automated in silico hypothesis testing, by using LLMs for both purposes. We demonstrate that such automation is possible.

Discussion / Conclusion. This paper demonstrates an approach to automated in silico hypothesis generation and testing made possible through the use of structural causal models (SCMs). We implemented the approach by building a computational system with LLMs and provided evidence that simulations can elicit information from an LLM that was not available to the model ex ante. We also showed that such simulations produce results that are highly consistent with the predictions of the relevant economic theory. In this final section, we discuss why such systems could be useful and identify areas for future research. How might systems like the one presented in this paper be useful for social science research? One view is that these simulations are simple dress rehearsals for “real” social science. A more expansive and exciting view is that these simulations would yield insights that sometimes generalize to the real world.
