Ask, and it shall be given: Turing completeness of prompting
Since the success of GPT, large language models (LLMs) have been revolutionizing machine learning and have initiated the so-called LLM prompting paradigm. In the era of LLMs, people train a single general-purpose LLM and provide the LLM with different prompts to perform different tasks. However, such empirical success largely lacks theoretical understanding. Here, we present the first theoretical study on the LLM prompting paradigm to the best of our knowledge. In this work, we show that prompting is in fact Turing-complete: there exists a finite-size Transformer such that for any computable function, there exists a corresponding prompt following which the Transformer computes the function. Furthermore, we show that even though we use only a single finite-size Transformer, it can still achieve nearly the same complexity bounds as that of the class of all unboundedsize Transformers. Overall, our result reveals that prompting can enable a single finite-size Transformer to be efficiently universal, which establishes a theoretical underpinning for prompt engineering in practice.
Introduction. The mainstream architecture of large language models (LLMs; e.g., OpenAI, 2024; Anthropic, 2024; Meta, 2024; Google, 2024) is Transformers (Vaswani et al., 2017). There has been a series of theoretical studies on Transformers under realistic abstractions (P ́erez et al., 2019; Bhattamishra et al., 2020; Hahn, 2020; P ́erez et al., 2021; Hao et al., 2022; Liu et al., 2023a; Chiang et al., 2023; Merrill & Sabharwal, 2023; Roberts, 2023; Merrill & Sabharwal, 2024a;b; Hou et al., 2024; Li et al., 2024). For example, P ́erez et al. (2021) have shown that the class of all Transformers with hardmax attention is Turing-complete: for any computable function φ ∈ TIME1(t(n)), there exists a Transformer that computes φ using O(t(n)) chain-of-thought (CoT; Wei et al., 2022b) steps and O(log(n + t(n))) precision on length-n inputs; Merrill & Sabharwal (2024a) have later improved the CoT complexity to O(t(n)) for TIME(t(n)) functions. These works have finely characterized the capacities and limits of Transformers under the classic one-model-onetask paradigm.
Discussion / Conclusion. In this work, we have shown that prompting is in fact Turing-complete: there exists a finite-size Transformer such that for any computable function, there exists a corresponding prompt following which the Transformer computes the function. Furthermore, we have shown that even though we use only a single finite-size Transformer, it can still achieve nearly the same complexity bounds as that of the class of all unbounded-size Transformers. Overall, our result reveals that prompting can enable a single finite-size Transformer to be efficiently universal, which establishes a theoretical underpinning for prompt engineering in practice. Our current work focuses on expressive power rather than learnability. While we have shown the existence of a Transformer on which prompting is Turing-complete, it does not necessarily imply that a Transformer effectively learns to simulate any 2-PTMs through CoT steps. Investigating the learnability of such a Transformer is an intriguing direction for future research.