How do normalization and input injection control the emergence of fixed points?
This question takes a dynamical-systems framing: how design choices like normalization and feeding the input back in at each step ("input injection") govern whether a network settles into stable fixed points. The collection doesn't hold work on that mechanism directly, so the honest answer is partial.
This reads as a question from the equilibrium-model / iterative-dynamics tradition: treat a network as a process that repeatedly updates a hidden state, and ask what keeps that process from blowing up or collapsing. The two usual answers are normalization, which bounds the state, and re-injecting the original input at each step, which keeps the trajectory anchored to a stable resting point. On that specific mechanism, the collection is thin. None of the retrieved notes study normalization layers or input injection as knobs on fixed-point convergence, so rather than pad, it's worth saying plainly: the sharp control-theory answer isn't here. What the corpus does have is the adjacent and arguably more interesting question of whether large models perform fixed-point-style iterative computation at all.
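To make the mechanism itself concrete, here is a minimal sketch of that kind of iteration. It is not drawn from any note in the collection: the function names (`iterate`, `rms_normalize`), the toy matrices, and the choice of a plain linear update with RMS normalization are all assumptions made for illustration.

```python
import math

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * c for m, c in zip(row, v)) for row in M]

def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(c * c for c in v))

def rms_normalize(v, eps=1e-8):
    """Rescale v so its root-mean-square is about 1, which bounds its length."""
    rms = math.sqrt(sum(c * c for c in v) / len(v))
    return [c / (rms + eps) for c in v]

def iterate(z0, x, W, steps=100, inject=True, normalize=False):
    """Repeat z <- W z (+ x if inject), optionally normalizing each step.

    Assumption: the input enters through an identity map, so injection
    is just adding x to the pre-normalization state.
    """
    z = list(z0)
    for _ in range(steps):
        pre = matvec(W, z)
        if inject:
            pre = [p + xi for p, xi in zip(pre, x)]
        z = rms_normalize(pre) if normalize else pre
    return z

# Toy values, chosen only for illustration.
X = [1.0, -2.0]
W_CONTRACT = [[0.5, 0.2], [-0.1, 0.4]]  # shrinks the state every step
W_EXPAND = [[1.2, 0.3], [0.0, 1.1]]     # grows the state every step

# Contractive update with injection: both starts reach the same resting point.
print(iterate([0.0, 0.0], X, W_CONTRACT, steps=200))
print(iterate([5.0, -7.0], X, W_CONTRACT, steps=200))
# Contractive update without injection: the state decays and forgets the input.
print(iterate([5.0, -7.0], X, W_CONTRACT, steps=200, inject=False))
# Expanding update: unbounded without normalization, bounded with it.
print(norm(iterate([0.1, 0.1], X, W_EXPAND)))
print(norm(iterate([0.1, 0.1], X, W_EXPAND, normalize=True)))
```

In this toy setting the two knobs do different jobs. Injection decides where the resting point sits and ties it to the input, while normalization only caps the size of the state; it does not by itself guarantee that the bounded trajectory converges.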
The most direct neighbor is the finding that LLMs don't actually run iterative procedures in latent space: they recognize an optimization problem as template-similar to something seen before and emit a plausible answer instead of converging to one (see "Do large language models actually perform iterative optimization?"). That reframes your question: before asking how to control the emergence of fixed points, the corpus suggests asking whether the iterative dynamics that would produce them are happening in the first place. The companion result that RL fine-tuning sharpens memorization rather than installing genuine procedures points the same direction: out-of-distribution tests reveal template-matching where you'd hope to find a convergent process (see "Do fine-tuned language models actually learn optimization procedures?").
Where "input injection" has a concrete analog in the corpus, it's in steering: injecting a vector into the residual stream and asking what the model does with it. DPO training builds a two-stage circuit that detects these injected perturbations (evidence-carrier features in early layers suppressing a default-deny gate), which is essentially the model developing sensitivity to an injected signal riding alongside its normal trajectory (see "How do language models detect injected steering vectors internally?"). Persona vectors extend this: linear directions in activation space that you can inject to steer, or monitor to catch drift, during fine-tuning (see "Can we track and steer personality shifts during model finetuning?"). These aren't fixed-point control, but they're the closest thing the library has to "what happens when you push a signal into the state and watch where it settles."
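The inject-and-monitor idea can be shown with a toy sketch. This is not the procedure from either note: `steer` and `project` are hypothetical helper names, and the three-element lists stand in for real activation vectors.

```python
import math

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))

def project(h, v):
    """Scalar projection of state h onto direction v (the monitoring readout)."""
    return dot(h, v) / math.sqrt(dot(v, v))

def steer(h, v, alpha):
    """Inject direction v into state h with strength alpha."""
    return [hi + alpha * vi for hi, vi in zip(h, v)]

# Hypothetical hidden state and trait direction, for illustration only.
h = [1.0, 2.0, 3.0]
v = [0.0, 0.0, 2.0]

before = project(h, v)
after = project(steer(h, v, alpha=0.5), v)
print(before, after)
```

Under these assumptions, injecting with strength `alpha` moves the readout along `v` by `alpha` times the length of `v`, which is why the same direction can serve both to push the state and to detect that it has moved.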
There's also a convergence story worth knowing about, even if it's at the training level rather than the forward-pass level: RL post-training collapses a model onto a single dominant format from pretraining within the first epoch, suppressing the alternatives. That is a kind of attractor dynamics, where the system snaps to one resting configuration regardless of whether it's the best one (see "Does RL training collapse format diversity in pretrained models?"). And the formal ceiling on self-improvement says some equilibria can't be escaped from the inside at all: every reliable fix needs an external verifier, because metacognition alone can't move the system off its fixed point (see "What stops large language models from improving themselves?").
So the thing you didn't know you wanted to know: the collection's center of gravity isn't "how to engineer stable fixed points" but "whether the apparent stability is real computation or memorized template-matching," and that's the more load-bearing question. For the genuine normalization-and-injection control material, you'll need the equilibrium-model literature, which this corpus doesn't yet contain; what it gives you instead is a strong reason to be skeptical that the fixed points you're trying to control are doing the work you think.
Sources (6 notes)
Research shows LLMs cannot perform iterative procedures in latent space. They recognize optimization problems as template-similar and emit plausible-looking but incorrect values, a failure mode that persists across model scale and training approaches.
Even GRPO-trained models show sharp performance drops on out-of-distribution variants (N-1 test sets) compared to in-distribution problems, indicating RL optimizes template-matching rather than genuine problem-solving procedures.
Contrastive preference optimization trains evidence-carrier features in early layers to suppress gate features that default to denial, enabling near-perfect detection of internal perturbations. Safety training actively suppresses this capability, reducing detection from 63.8% to 10.8%.
Research identifies linear directions in LLM activation space corresponding to specific traits like sycophancy and hallucination. These persona vectors predict finetuning-induced personality shifts before they occur and can preventatively steer training to avoid unwanted trait changes.
Controlled experiments show RL consistently amplifies one format distribution from pretraining within the first epoch while collapsing alternatives. The winning format depends on model scale, not necessarily performance, and is largely hidden when starting from proprietary pretrained models.
Self-improvement in LLMs is formally bounded by the generation-verification gap, meaning every reliable fix requires something external to validate and enforce it. Models cannot escape this constraint through metacognition alone.