INQUIRING LINE


AI can now out-design the architectures humans build for it — and the gap grows the more it searches.

Why do human-designed neural architectures eventually get replaced by learned ones?

This explores whether AI systems are starting to out-design the architectures humans hand-build — and why search, scaling, and learned structure keep eating away at the hand-crafted designs we started with.

This reads the question as being about a quiet shift in how neural architectures get made: less by human intuition, more by automated search and emergent structure. The corpus points to a few overlapping reasons. The most direct is that machines can now search the design space far more thoroughly than people. Genesys, a multi-agent LLM system using genetic programming, generated 1,062 novel architectures, and its best designs beat GPT-2 and Mamba-2 on 6 of 9 benchmarks (see "Can AI systems discover better neural architectures than humans?"). The lesson is that structure in the search mattered more than raw generation: a structured genetic-programming representation raised design success from 14% to nearly 100% compared with direct LLM generation. Human designers explore a tiny, biased slice of what's possible; automated search covers far more of it.
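To make "structured search" concrete, here is a minimal sketch of an evolutionary loop over a toy design space. It is not the Genesys system: the block vocabulary `BLOCKS`, the `TARGET` design, the `fitness` function, and every parameter are hypothetical stand-ins, and a real system would train and evaluate each candidate rather than compare it to a known answer.

```python
import random

# Hypothetical vocabulary of building blocks; not taken from the source.
BLOCKS = ["attention", "conv", "ssm", "mlp", "gate"]
# Stand-in for an unknown good design, used only to score candidates.
TARGET = ["ssm", "gate", "attention", "mlp"]


def fitness(design):
    """Toy score: positions that match TARGET (a real system would train the model)."""
    return sum(a == b for a, b in zip(design, TARGET))


def mutate(design, rng):
    """Replace one randomly chosen block with a random block."""
    child = list(design)
    child[rng.randrange(len(child))] = rng.choice(BLOCKS)
    return child


def evolve(generations=200, population_size=20, seed=0):
    """Evolve a population of designs and return the fittest one found."""
    rng = random.Random(seed)
    population = [[rng.choice(BLOCKS) for _ in TARGET] for _ in range(population_size)]
    for _ in range(generations):
        # Keep the better half unchanged (elitism), refill with mutated copies.
        population.sort(key=fitness, reverse=True)
        survivors = population[: population_size // 2]
        children = [mutate(rng.choice(survivors), rng) for _ in survivors]
        population = survivors + children
    return max(population, key=fitness)


best = evolve()
print(best, fitness(best))
```

Because the better half of each generation survives unchanged, the best score never decreases from one generation to the next. Representing a design as a list of typed blocks, rather than free-form text, is what keeps every mutated candidate well-formed in this sketch.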

There's a second, subtler reason: scale sometimes makes architectural cleverness unnecessary. Plain MLPs can achieve compositional generalization with no special architectural machinery, as long as the training data covers enough combinations of task modules (see "Can neural networks learn compositional skills without symbolic mechanisms?"). And networks left to train freely grow their own internal structure: they decompose tasks into modular subnetworks without anyone designing those modules in (see "Do neural networks naturally learn modular compositional structure?"), and they develop dense representations for familiar data and sparse ones for unfamiliar inputs purely from exposure (see "Is representational sparsity learned or intrinsic to neural networks?"). The structure we used to hand-engineer turns out to be something learning discovers on its own.

But the more interesting part of the corpus is where hand-designed architectures hit walls that learned ones don't. Fixed-depth transformers are stuck under a computational ceiling (the AC0/TC0 complexity classes); a hierarchical recurrent model with just 27M parameters solves Sudoku and mazes on which chain-of-thought methods fail completely (see "Can recurrent hierarchies achieve reasoning that transformers cannot?"). Energy-based transformers replace single-forward-pass prediction with iterative energy minimization and get better scaling and out-of-distribution generalization without any domain-specific scaffolding (see "Can energy minimization unlock reasoning without domain-specific training?"). Even the workhorse transformer carries a baked-in flaw: soft attention structurally over-weights repeated and prominent tokens, a bias that feeds sycophancy before RLHF ever acts (see "Does transformer attention architecture inherently favor repeated content?"). These aren't bugs to patch; they're consequences of choices a human made, and replacing the architecture is sometimes the only fix.
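The attention bias described above follows in part from plain softmax arithmetic: each copy of a repeated token contributes its own term to the normalizer, so the token's total weight grows with its count even when relevance is unchanged. The sketch below illustrates this under simplifying assumptions (a single query, identical scores for duplicate tokens, no positional effects); the tokens and scores are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_mass(tokens, score_of):
    """Total attention weight per distinct token for one query."""
    weights = softmax([score_of[t] for t in tokens])
    mass = {}
    for token, w in zip(tokens, weights):
        mass[token] = mass.get(token, 0.0) + w
    return mass


# Hypothetical query-key scores: both tokens are equally relevant to the query.
score_of = {"claim": 1.0, "evidence": 1.0}

once = attention_mass(["claim", "evidence"], score_of)
repeated = attention_mass(["claim", "claim", "claim", "evidence"], score_of)

print(once)      # one copy each: the mass splits evenly
print(repeated)  # three copies of "claim": it holds 3/4 of the mass
```

Nothing about the relevance of "claim" changed between the two contexts; only its count did.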

Here's the thing you might not have expected to care about: "replaced by learned ones" cuts both ways. A striking line of work shows that networks trained by gradient descent can reproduce a target's outputs exactly while harboring fractured, entangled internal representations that are radically messier than those of evolved networks and that resist transfer and creative recombination (see "Can identical outputs hide broken internal representations?"). A model can pass every benchmark and still have no coherent structure inside (see "Can AI pass every test while understanding nothing?"). So learned architectures win on performance and search efficiency, but they can win in ways we can't inspect. That is exactly why one strand of the corpus argues we need a human-parseable theory of deep learning regardless of how capable the systems become: oversight depends on our being able to reason about the structures, not just measure their scores (see "Can humans understand deep learning before AI does?"). The replacement of human design by learned design isn't only a story of progress; it's a trade of legibility for capability.

Sources 10 notes

Can AI systems discover better neural architectures than humans?

Genesys, a multi-agent LLM system using genetic programming and a Ladder of Scales verification process, discovered 1,062 novel architectures, with top designs outperforming GPT-2 and Mamba-2 on 6 of 9 benchmarks. Structured GP representation proved critical, improving design success from 14% to nearly 100% versus direct LLM generation.

Can neural networks learn compositional skills without symbolic mechanisms?

Standard MLPs achieve compositional generalization through data and model scaling alone, without architectural modifications, provided the training distribution sufficiently covers combinations of task modules. Linear decodability of constituents from hidden activations reliably predicts success.

Do neural networks naturally learn modular compositional structure?

Pruning experiments reveal that neural networks implement compositional subroutines in isolated subnetworks, with ablations affecting only their corresponding function. Pretraining substantially increases the consistency and reliability of this modular structure across architectures and domains.

Is representational sparsity learned or intrinsic to neural networks?

During pretraining, neural networks develop dense activations for familiar training data and default to sparse representations for unfamiliar inputs. This trend emerges without task-specific fine-tuning and reflects how models consolidate knowledge through exposure.

Can recurrent hierarchies achieve reasoning that transformers cannot?

The Hierarchical Reasoning Model couples slow abstract planning with fast detailed computation across two timescales, achieving near-perfect performance on Sudoku and mazes where chain-of-thought methods fail completely. With only 27M parameters and 1,000 samples, HRM escapes the AC0/TC0 complexity ceiling that constrains fixed-depth transformers.


Can energy minimization unlock reasoning without domain-specific training?

Energy-Based Transformers assign energy values to input-prediction pairs and use gradient descent minimization for inference, yielding 35% higher training scaling rates and 29% more inference-compute gains than Transformer++, while generalizing better on out-of-distribution data without domain-specific scaffolding.
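As a rough illustration of inference by energy minimization, the sketch below refines a prediction by gradient descent on an energy function instead of emitting it in one pass. The quadratic `energy` here is a hypothetical stand-in; in an Energy-Based Transformer the energy is computed by a learned model, and the step size and step counts are arbitrary choices for this toy.

```python
def energy(x, y):
    """Hypothetical energy: low when the prediction y is compatible with input x."""
    return (y - 2.0 * x) ** 2


def energy_grad_y(x, y):
    """Analytic gradient of the toy energy with respect to the prediction y."""
    return 2.0 * (y - 2.0 * x)


def infer(x, y_init=0.0, step_size=0.1, steps=50):
    """Refine a prediction by gradient descent on the energy."""
    y = y_init
    for _ in range(steps):
        y -= step_size * energy_grad_y(x, y)
    return y


x = 3.0
for steps in (1, 5, 50):
    y = infer(x, steps=steps)
    # More refinement steps spend more inference compute and lower the energy.
    print(steps, round(y, 4), round(energy(x, y), 6))
```

In this toy, each step shrinks the remaining error by a constant factor, so spending more steps at inference time yields a lower-energy prediction.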

Does transformer attention architecture inherently favor repeated content?

Transformer soft attention systematically over-weights repeated and context-prominent tokens regardless of relevance, creating a positive feedback loop that amplifies opinions and framing before RLHF acts. System 2 Attention—regenerating context to remove irrelevant material—can interrupt this mechanism.

Can identical outputs hide broken internal representations?

Networks trained with SGD reproduce outputs perfectly while having radically different internal structure than evolved networks, with weight perturbations revealing fractured, entangled representations that prevent transfer to novel contexts or creative recombination.

Can AI pass every test while understanding nothing?

The Fractured Entangled Representation hypothesis holds that SGD-trained networks can match another network's outputs on all inputs while maintaining radically different internal representations. Standard benchmarks cannot detect this structural difference.

Can humans understand deep learning before AI does?

Deep learning theory must be developed in forms humans can reason about and evaluate, because human oversight of AI systems depends on frameworks for identifying failure modes and validating explanations—not on whether AI can self-explain.

Papers this line draws on 8

The research behind the notes this line reads — ranked by how closely each paper relates.

Break It Down: Evidence for Structural Compositionality in Neural Networks (3.53 match, arXiv)
Scaling can lead to compositional generalization (2.66 match, arXiv)
Questioning Representational Optimism in Deep Learning: The Fractured Entangled Representation Hypothesis (2.59 match, arXiv)
From Frege to chatGPT: Compositionality in language, cognition, and deep neural networks (2.56 match, arXiv)
Hierarchical Reasoning Model (2.54 match, arXiv)
Toward Transparent AI: A Survey on Interpreting the Inner Structures of Deep Neural Networks (2.42 match, arXiv)
Open Problems in Mechanistic Interpretability (2.42 match, arXiv)
Emergent Introspective Awareness in Large Language Models (2.40 match, arXiv)

Research prompt for your LLM

Copy into ChatGPT or Claude to take this line of inquiry further — it asks the model to find newer work and re-test which earlier constraints still hold.

You are a researcher tracking the shift from human-designed to learned neural architectures. The question: are human-designed architectures truly being *replaced*, or are we entering a hybrid regime where learned search + human oversight co-evolve?

What a curated library found — and when (dated claims, not current truth):
Findings span 2022–2026; treat these as perishable snapshots:
- Automated search (genetic programming + multi-agent LLMs) now explores architecture space far more thoroughly than humans; a structured genetic-programming representation achieved ~100% design success vs. 14% for direct LLM generation (~2025).
- Scaling and data alone induce compositional generalization and modular subnetwork decomposition without explicit architectural priors; networks learn their own internal structure (~2025).
- Learned architectures overcome fixed constraints: hierarchical recurrent models solve Sudoku/mazes that fixed-depth transformers cannot; energy-based transformers achieve better OOD generalization than standard transformers (~2025).
- BUT: networks with identical outputs harbor fractured, entangled internal representations — messier and less interpretable than evolved designs, masking incoherent understanding despite benchmark success (~2025–2026).
- A human-parseable theory of deep learning remains essential for oversight, regardless of capability gains (~2026).

Anchor papers (verify; mind their dates):
- arXiv:2505.11581 (2025-05) — Fractured Entangled Representations
- arXiv:2507.02092 (2025-07) — Energy-Based Transformers
- arXiv:2604.21691 (2026-04) — Scientific Theory of Deep Learning
- arXiv:2511.13653 (2025-11) — Weight-Sparse Transformer Circuits

Your task:
(1) RE-TEST EACH CONSTRAINT. For learned vs. hand-designed: has the interpretability gap closed? Can we now inspect learned architectures as rigorously as we reason about human designs? Does the "fractured representation" finding still hold under recent scaling laws or new training regimes (DPO, mixture-of-experts, mechanistic interventions)? Separate the durable tension (learned = opaque) from what may be relaxed (e.g., sparse lottery tickets, circuit discovery, SAE-based decomposition).
(2) Surface the strongest work from the last 6 months arguing that human design + learned refinement (hybrid, not replacement) is the stable frontier, or that interpretability breakthroughs have made learned architectures legible.
(3) Propose 2 research questions assuming the regime may have moved: (a) Can curriculum or scaffolded search produce learned architectures that are *both* high-performing *and* interpretable? (b) If learned architectures are now dominant, what role remains for human intuition—architecture-level veto, initialization, loss design?

Cite arXiv IDs; flag anything you cannot ground in a real paper.
