Can a single transformer become universally programmable through prompts?
Explores whether prompts can function as genuine programs that unlock universal computation in fixed-size models, and whether this theoretical possibility translates to practical training outcomes.
"Ask, and it shall be given: Turing completeness of prompting" (2024) proves that there exists a finite-size Transformer such that for any computable function, there exists a corresponding prompt following which the Transformer computes the function. Furthermore, this single finite-size Transformer achieves nearly the same complexity bounds as the class of all unbounded-size Transformers.
The result establishes a theoretical underpinning for prompt engineering: prompts are not merely heuristic nudges that help a model do what it already can — they are, in principle, the mechanism that makes a fixed model universally programmable. The prompt IS the program.
However, the gap between expressiveness and learnability is critical. The proof shows the existence of such a Transformer but does not imply that standard training produces models that learn to implement arbitrary programs through CoT steps. This mirrors the broader pattern: since Can prompt optimization teach models knowledge they lack?, the practical limitation is not what prompts CAN express but what models HAVE learned to respond to.
The result also reframes the "prompts as programs" analogy used by several papers in this space. Promptbreeder treats prompts as self-modifiable programs. APE treats prompt search as program synthesis. The Turing completeness result validates these analogies — prompts genuinely are programs in the formal sense, not just metaphorically. But the practical implication is bounded by the model's training: the space of prompts that a trained model responds to meaningfully is a tiny subset of the theoretically expressible space.
Inquiring lines that use this note as a source 47
This note is a source for these synthesized inquiries. Follow a line forward into its question, or open it to trace back to all of its sources.
- Do transformers learn generalizable algorithms or instance-based patterns?
- Can universal function approximators be expensive to learn in practice?
- Can prompting inject new knowledge into already-trained AI models?
- Can symbolic mechanisms improve transformer compositional abilities?
- Can dynamic instance-specific prompt selection solve the generalization problem across tasks?
- What computational role do intermediate tokens actually play in transformers?
- How does circuit complexity limit which grammatical structures transformers can acquire?
- How does prompt context activation differ from parameter-based knowledge injection?
- Why does joint optimization of prompts and inference strategy outperform separate tuning?
- Can adaptive compute distribution across prompts replace the need for sophisticated reasoning frameworks?
- Why do standard transformers fail on problems requiring serial algorithmic reasoning?
- Can prompt optimization inject genuinely new knowledge into a model?
- Do decoder-only models have inherent architectural limits for non-sequential information?
- What hidden computations happen inside transformer layers during reasoning?
- Can prompting-only specialization hide domain boundaries from users?
- Can any architecture fundamentally solve problems that require inherently sequential computation?
- How much of prompt sensitivity is really just frequency optimization in disguise?
- What formal language complexity level matches transformer computational limits best?
- Can compute-optimal scaling work without co-optimizing the prompt itself?
- How do language agents implement prompts as executable computational graphs?
- What knowledge can prompt optimization actually activate in trained models?
- Can algorithmic control flow over prompts simulate traditional programming languages?
- Why do standard transformers fail to encode recursive structure in their hidden states?
- Can transformers reason beyond fixed architectural depth limits?
- Does Promptbreeder actually escape the generation-verification gap constraints?
- Can bounded-depth transformers solve inherently sequential problems?
- Why does weight space search reduce robustness to prompt perturbations better than prompt engineering?
- Is prompt engineering a workaround rather than a capability fix?
- Can intellectual property law apply to unfixed, context-dependent outputs?
- What explains the contextual variability of knowledge in transformers?
- How do prompting and activation steering relate as compression strategies?
- How does decomposed prompting formalize prompt libraries as reusable software modules?
- Can sub-task handlers be swapped between neural and symbolic systems?
- What makes protocols better than free-form prompting for tool coordination?
- Why does sandboxed execution matter more than monolithic prompting?
- Can deterministic computation actually create new information in data?
- How do external prompt artifacts improve agent behavior compared to inline instructions?
- Can energy-based transformers achieve deep reasoning without supervision?
- Why does prompting discover capabilities that need reward-driven refinement?
- Can trained models encode programs more complex than their data-generating process?
- Why does prompt optimization alone fail to inject genuinely new knowledge?
- Does joint optimization of prompts and parameters outperform separate tuning?
- Does the Chinchilla balance apply equally across all data types or only language?
- Can recurrent transformers learn genuinely new computations beyond inference stages?
- Can decoding-time prompting strategies fully replace diversity-focused training methods?
- How does scaling and training data enable compositional behavior without symbolic mechanisms?
- Why does reapplying the same transformer block work better than computing new layers?
Related concepts in this collection 3
This note in its neighbourhood — explore the map, then jump to a related concept in the list below.
Click a node to walk · click center to open · click Open in graph to see this note in the full knowledge graph
-
Can prompt optimization teach models knowledge they lack?
Explores whether sophisticated prompting techniques can inject new domain knowledge into language models, or if they're limited to activating existing training knowledge.
practical ceiling on the Turing completeness result: the model must have learned the relevant patterns
-
Can we automatically optimize both prompts and agent coordination?
This explores whether language agents can be represented as computational graphs whose structure and content adapt automatically. Why it matters: current agent systems require hand-engineered orchestration; automatic optimization could unlock more capable multi-agent systems.
agents as computational graphs is the practical instantiation of prompts-as-programs
-
Can algorithms control LLM reasoning better than LLMs alone?
Explores whether embedding LLMs within algorithmic control flow—where programs manage state and context filtering—enables complex task decomposition beyond what LLMs achieve through self-managed reasoning chains.
algorithmic control flow over prompts as practical programming
Related papers in this collection 8
Papers most semantically related to this note, ranked by cosine similarity in the embedding space.
- Ask, and it shall be given: Turing completeness of prompting
- Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization
- Between Circuits and Chomsky: Pre-pretraining on Formal Languages Imparts Linguistic Biases
- Large Language Model Programs
- Performative Thinking? The Brittle Correlation Between CoT Length and Problem Complexity
- Hierarchical Reasoning Model
- Faith and Fate: Limits of Transformers on Compositionality
- TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters
Original note title
prompting is Turing complete — a single finite-size transformer can compute any computable function given the right prompt