Can fixed points replace learned halt tokens in reasoning models?
Does stopping inference when a looped transformer's internal state stabilizes provide a better halting signal than training a dedicated token predictor? This matters for building adaptive compute without expensive special training.