Arithmetic Without Algorithms: Language Models Solve Math With a Bag of Heuristics
Do large language models (LLMs) solve reasoning tasks by learning robust generalizable algorithms, or do they memorize training data? To investigate this question, we use arithmetic reasoning as a representative task. Using causal analysis, we identify a subset of the model (a circuit) that explains most of the model’s behavior for basic arithmetic logic and examine its functionality. By zooming in on the level of individual circuit neurons, we discover a sparse set of important neurons that implement simple heuristics. Each heuristic identifies a numerical input pattern and outputs corresponding answers. We hypothesize that the combination of these heuristic neurons is the mechanism used to produce correct arithmetic answers. To test this, we categorize each neuron into several heuristic types—such as neurons that activate when an operand falls within a certain range—and find that the unordered combination of these heuristic types is the mechanism that explains most of the model’s accuracy on arithmetic prompts. Finally, we demonstrate that this mechanism appears as the main source of arithmetic accuracy early in training.
Introduction. Do large language models (LLMs) implement robust reusable algorithms to solve tasks, or are they merely memorizing aspects of the training distribution? This distinction is crucial (Tänzer et al., 2022; Henighan et al., 2023): while memorization might suffice for limited problem sets, true algorithmic comprehension allows for generalization and efficient scaling to new problems. Arithmetic reasoning provides a lens for this investigation, as it can be solved using various methods: learning known algorithms, developing novel approaches, or memorizing vast quantities of input-output pairs. Thus, we ask the following: Do LLMs implement robust algorithms to correctly complete arithmetic prompts, similar to children learning vertical addition to add two numbers, or do LLMs merely memorize the arithmetic prompts that appear in their vast training data? Previous studies have made progress in identifying arithmetic mechanisms in LLMs. Stolfo et al. (2023) and Zhang et al.
Discussion / Conclusion. Do LLMs rely on a robust algorithm or on memorization to solve arithmetic tasks? Our analysis suggests that the mechanism behind the arithmetic abilities of LLMs is somewhere in the middle: LLMs implement a bag of heuristics—a combination of many memorized rules—to perform arithmetic reasoning. To reach this conclusion, we performed a set of causal analysis experiments to locate a circuit, i.e., a subset of model components, responsible for arithmetic calculations. We examined the circuit at the level of individual neurons and pinpointed the arithmetic calculations to a sparse set of MLP neurons. We showed that each neuron acts as a memorized heuristic, activating for a specific pattern of inputs, and that the combination of many such neurons is required to correctly answer the prompts. In addition, we found that this mechanism gradually evolves over the course of training, emerging steadily rather than appearing abruptly or replacing other mechanisms.
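To make the bag-of-heuristics idea concrete, the following is a minimal toy sketch, not the circuit found in any model. It assumes two-operand addition with operands in 0..99 and two hypothetical heuristic types: one that fires on an operand-range pattern and one that fires on a units-digit pattern. Each heuristic only boosts a set of candidate answers, and the prediction is the candidate with the highest summed boost. All names (`range_heuristic`, `units_heuristic`, `predict`, `accuracy`), the bucket width of 5, and the uniform boost of 1.0 are illustrative assumptions.

```python
from collections import Counter

MAX_OPERAND = 99                 # assumption: operands are integers in 0..99
NUM_ANSWERS = 2 * MAX_OPERAND + 1  # candidate answers are 0..198


def range_heuristic(a, b, width=5):
    """Hypothetical range-pattern heuristic: fires on the pair of buckets
    that a and b fall into and boosts every sum those buckets allow."""
    lo = (a // width) * width + (b // width) * width
    hi = lo + 2 * (width - 1)
    return {r: 1.0 for r in range(lo, hi + 1)}


def units_heuristic(a, b):
    """Hypothetical units-digit heuristic: fires on the pair of units digits
    and boosts every candidate answer that ends in the matching digit."""
    digit = (a % 10 + b % 10) % 10
    return {r: 1.0 for r in range(digit, NUM_ANSWERS, 10)}


def predict(a, b, heuristics):
    """Sum the boosts of all heuristics (order is irrelevant) and return
    the best-scoring candidate, breaking ties toward the smallest one."""
    scores = Counter()
    for heuristic in heuristics:
        scores.update(heuristic(a, b))  # Counter.update adds values
    best = max(scores.values())
    return min(r for r, s in scores.items() if s == best)


def accuracy(heuristics):
    """Fraction of all operand pairs for which the prediction equals a + b."""
    pairs = [(a, b) for a in range(MAX_OPERAND + 1)
             for b in range(MAX_OPERAND + 1)]
    correct = sum(predict(a, b, heuristics) == a + b for a, b in pairs)
    return correct / len(pairs)
```

In this toy setting, neither heuristic type identifies the answer on its own: the range pattern leaves nine candidates and the units-digit pattern leaves about twenty. Only their unordered combination narrows the candidates to a single value, so removing either type from the list passed to `accuracy` sharply lowers the score. This mirrors, in a highly simplified way, the claim that many heuristic neurons must combine to answer a prompt correctly.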