SYNTHESIS NOTE
Model Architecture and Internals

Can deep learning theory unify around training dynamics?

Is learning mechanics—focused on average-case predictions and training dynamics rather than worst-case bounds—the emerging framework that finally unifies fragmented deep learning theory?

Synthesis note · 2026-05-18 · sourced from Foundation Models

Deep learning is the most powerful and most inscrutable member of the machine learning pantheon. Decades of attempts to put rigorous theoretical backing behind it have produced fragments (solvable toy models, scaling laws, hyperparameter limits, universal behaviors) but no unified frame. The argument in "There Will Be a Scientific Theory of Deep Learning" is that these fragments are not isolated; they are converging into a single discipline that the authors call learning mechanics.

Five strands point at the unification: (1) solvable idealized settings provide intuition for realistic systems, (2) tractable limits reveal fundamental phenomena, (3) simple mathematical laws capture macroscopic observables, (4) hyperparameter theories disentangle which parameters drive behavior, and (5) universal behaviors across systems clarify which phenomena need explanation. Each of these mirrors a move that classical, continuum, statistical, or quantum mechanics made for physical systems. The analogy is structural, not rhetorical: both fields develop libraries of solvable settings, both work with aggregate statistics rather than per-particle motion, both treat system parameters as first-class objects, and both encounter universality across regimes.
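As a rough illustration of strand (3), the sketch below fits a power law to a macroscopic observable (loss as a function of model size) by ordinary least squares in log-log space. It is not taken from the paper: the function name `fit_power_law`, the functional form `y = a * x**(-b)`, and the noise-free synthetic data are all assumptions chosen to keep the example self-contained.

```python
import math


def fit_power_law(xs, ys):
    """Least-squares fit of y = a * x**(-b), done as a straight line in log-log space."""
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    # Slope and intercept of the best-fit line log(y) = intercept + slope * log(x).
    slope = (
        sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y))
        / sum((u - mean_x) ** 2 for u in log_x)
    )
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


# Synthetic, noise-free data generated from a known law (hypothetical values,
# not measurements from any real model).
sizes = [10 ** k for k in range(3, 8)]
losses = [5.0 * n ** -0.3 for n in sizes]

a, b = fit_power_law(sizes, losses)
print(f"fitted prefactor a = {a:.3f}, fitted exponent b = {b:.3f}")
```

Because the data are generated from the assumed law, the fit simply recovers the parameters that were put in. With real measurements the interesting questions are how well a two-parameter law describes the observable and over what range it holds.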

The methodological consequence is sharp. Learning mechanics aims at average-case predictions rather than rigorous worst-case bounds. This makes it a distinct epistemic project from both learning theory's PAC-style guarantees and interpretability's per-circuit causal accounts. It is concerned with what happens during training, with dynamics rather than endpoints, and with phenomena that are robust across architecture and dataset choices.
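One way to see the difference between an average-case prediction and a worst-case bound is a toy solvable setting. The sketch below is an assumption-laden illustration, not the paper's method: gradient descent on a diagonal quadratic loss, with a hypothetical curvature spectrum `EIGS`, learning rate `LR`, and standard-normal initialization. For this setting the mean loss over random initializations has a closed form at every step, which can be compared against both a Monte Carlo simulation and a bound that uses only the slowest contraction rate.

```python
import random


def predict_mean_loss(eigs, lr, steps):
    """Closed-form mean loss over w0 ~ N(0, I): 0.5 * sum_i lam_i * (1 - lr*lam_i)**(2t)."""
    return [
        0.5 * sum(lam * (1.0 - lr * lam) ** (2 * t) for lam in eigs)
        for t in range(steps + 1)
    ]


def simulate_mean_loss(eigs, lr, steps, n_runs, seed=0):
    """Monte Carlo estimate of the same curve by running gradient descent n_runs times."""
    rng = random.Random(seed)
    totals = [0.0] * (steps + 1)
    for _ in range(n_runs):
        w = [rng.gauss(0.0, 1.0) for _ in eigs]
        for t in range(steps + 1):
            totals[t] += 0.5 * sum(lam * wi * wi for lam, wi in zip(eigs, w))
            # Gradient of 0.5 * lam * w**2 with respect to w is lam * w.
            w = [wi - lr * lam * wi for lam, wi in zip(eigs, w)]
    return [total / n_runs for total in totals]


def worst_case_bound(eigs, lr, steps, initial_loss):
    """Bound using only the slowest mode: loss_t <= max_i |1 - lr*lam_i|**(2t) * loss_0."""
    rate = max(abs(1.0 - lr * lam) for lam in eigs)
    return [initial_loss * rate ** (2 * t) for t in range(steps + 1)]


# Hypothetical settings chosen for illustration only.
EIGS = [1.0, 0.5, 0.1]
LR = 0.5
STEPS = 20

predicted = predict_mean_loss(EIGS, LR, STEPS)
simulated = simulate_mean_loss(EIGS, LR, STEPS, n_runs=20000)
bound = worst_case_bound(EIGS, LR, STEPS, initial_loss=predicted[0])

# Largest relative gap between the closed-form curve and the simulated mean.
max_rel_gap = max(abs(p - s) / p for p, s in zip(predicted, simulated))
print(f"max relative gap, prediction vs simulation: {max_rel_gap:.4f}")
print(f"step 5: predicted mean loss {predicted[5]:.4f}, worst-case bound {bound[5]:.4f}")
```

In this toy setting the closed-form curve describes the whole trajectory of the typical run, while the bound is valid but tracks only the slowest-decaying mode. Real networks are not diagonal quadratics, so this is meant only to show the shape of the two kinds of statement.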

The paper anticipates a complementary relationship with mechanistic interpretability: "where mechanistic interpretability aims to be the biology of deep learning, learning mechanics should aspire to be its physics." Mechanistic interpretability dissects specific circuits in specific models; learning mechanics characterizes the dynamics any sufficiently large network exhibits during training. Both are necessary; neither is sufficient alone.

Original note title

learning mechanics is the emerging unifying frame for deep learning theory — concerned with training dynamics and average-case predictions not worst-case bounds