INQUIRING LINE

What defines the edge of a mind may not be skin or hardware — it may be how tightly its parts work together.

Does functional integration determine cognitive system boundaries?

This explores a foundational question in cognitive science — whether the boundary of a cognitive system is set by how tightly its parts are functionally coupled, rather than by skin, skull, or hardware — and asks what the corpus says about where one cognitive system ends and another begins.

Functional integration here means how tightly components are coupled into a working whole. The corpus doesn't argue the boundary question philosophically, but several notes converge on it from different angles. The most direct evidence comes from neuroscience: the distinction between formal and functional linguistic competence rests on exactly this claim. Next-token prediction can master grammar and form, but genuine functional understanding requires *integration* across diverse brain networks that the prediction objective never recruits (see: Are language models developing real functional competence or just formal competence?). In other words, competence isn't located in one module; it emerges only when separate systems are wired together. Integration is what makes the capacity exist at all.

If integration constitutes cognition, then boundaries should be able to extend past the skull, and the corpus has a sharp result here. RL agents, given no memory objective, spontaneously start using the spatial environment itself as external memory, because environmental artifacts reduce the information they need to carry internally (see: Do RL agents accidentally use environments as memory?). The environment becomes a functional part of the agent's cognition simply because it is tightly coupled into the task loop. The same coupling logic shows up, with the opposite sign, in humans: a four-month EEG study found that brain connectivity systematically scaled *down* with reliance on an LLM, alongside weaker memory retention and impaired recall of one's own recent work (see: Does AI assistance weaken our brain's ability to think independently?). When the tool absorbs the integrative work, the internal system reorganizes around it, and the boundary of "who is thinking" shifts.
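The environment-as-memory point can be made concrete with a toy example. The sketch below is not the construction or proof from the cited note; it is an illustrative assumption: a ring of cells in which an agent must complete exactly one lap. The function names `run_with_counter` and `run_with_marker` are hypothetical. Without artifacts the agent has to count its own steps internally; with a marker dropped at the start, a purely reactive rule suffices and the internal memory requirement falls to zero bits.

```python
import math


def run_with_counter(n_cells):
    """Agent without artifacts: it must count its own steps internally."""
    steps_taken = 0  # internal memory: takes one of n_cells + 1 values
    position = 0
    while steps_taken < n_cells:
        position = (position + 1) % n_cells
        steps_taken += 1
    # Bits needed to store a counter ranging over 0..n_cells.
    internal_bits = math.ceil(math.log2(n_cells + 1))
    return position, internal_bits


def run_with_marker(n_cells):
    """Agent that drops a marker: the environment stores the history."""
    marked = [False] * n_cells  # environment state, not agent state
    position = 0
    marked[position] = True  # leave an artifact at the starting cell
    while True:
        position = (position + 1) % n_cells
        if marked[position]:  # reactive rule: stop on seeing the marker
            break
    return position, 0  # no internal memory was needed


if __name__ == "__main__":
    print(run_with_counter(8))  # (0, 4)
    print(run_with_marker(8))   # (0, 0)
```

Both agents end the lap at the starting cell, but only the first one carries history internally. In this toy setting the marker does the work the counter would otherwise do, which is the sense in which the environment is coupled into the agent's task loop.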

But the corpus also shows the opposite case: sometimes keeping functions *apart* makes the system work better. Separating a decomposer (planner) from a solver (executor) outperforms a single monolithic model, because fusing them creates planning-execution interference. Notably, the decomposition skill transfers across domains while the solving skill doesn't (see: Does separating planning from execution improve reasoning accuracy?). So the boundaries that matter are drawn not by physical packaging but by which functions help or hurt each other when coupled. Fine-tuning offers a cautionary version: it can quietly *break* integration, leaving reasoning chains that less reliably drive the final answer. The steps become performative rather than functional, a system whose parts look connected but aren't (see: Does fine-tuning disconnect reasoning steps from final answers?).
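One way to probe whether reasoning steps actually drive an answer is an early-termination check, which the fine-tuning note lists among its faithfulness tests. The following is a minimal sketch under stated assumptions, not the cited paper's protocol: `answer_fn` is a hypothetical callable standing in for a model that maps a question plus a (possibly truncated) list of reasoning steps to an answer, and the two toy models exist only to show the two extremes.

```python
from typing import Callable, Sequence


def early_termination_invariance(
    answer_fn: Callable[[str, Sequence[str]], str],
    question: str,
    reasoning_steps: Sequence[str],
) -> float:
    """Fraction of truncated chains that give the same answer as the full chain."""
    if not reasoning_steps:
        raise ValueError("need at least one reasoning step")
    full_answer = answer_fn(question, reasoning_steps)
    unchanged = 0
    # Keep only the first k steps, for k = 0 .. n-1 (every proper prefix).
    for k in range(len(reasoning_steps)):
        if answer_fn(question, reasoning_steps[:k]) == full_answer:
            unchanged += 1
    return unchanged / len(reasoning_steps)


# Hypothetical toy stand-ins for a model (assumptions for illustration only).
def faithful_model(question: str, steps: Sequence[str]) -> str:
    # The answer is computed from the steps the model actually saw.
    return str(sum(int(s) for s in steps))


def performative_model(question: str, steps: Sequence[str]) -> str:
    # The answer ignores the reasoning entirely.
    return "6"


if __name__ == "__main__":
    steps = ["1", "2", "3"]
    print(early_termination_invariance(faithful_model, "sum?", steps))      # 0.0
    print(early_termination_invariance(performative_model, "sum?", steps))  # 1.0
```

A score near 1.0 means truncating the chain rarely changes the answer, which suggests the steps are decorative; a score near 0.0 means the answer depends on the steps. Real evaluations would need many questions and a real model behind `answer_fn`.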

Two more notes widen the frame. Memory-Amortized Inference reframes cognition as navigation over a shared topological memory substrate, suggesting the "system" is better defined by the trajectories it can reuse than by any fixed container (see: Can cognition work by reusing memory instead of recomputing?). And the case that causal models alone can't capture human reasoning, because they miss associative, analogical, and emotional links, is really a claim that cognition is constituted by the *integration of multiple distinct channels*, not any single one (see: Can causal models alone capture how humans actually reason?). Across all of these, the pattern is consistent: what counts as one cognitive system is decided by functional coupling, and that coupling can pull in tools and environments, fracture into useful modules, or silently dissolve.

The takeaway is that this is no longer only an abstract philosophical debate: it is measurable. We can track changes in brain connectivity on EEG as someone offloads to AI, show mathematically that environmental artifacts reduce the information an agent must carry internally, and test whether a model's reasoning steps actually influence its answers or are just decorative. If you want the methodological lens for asking these questions rigorously, Marr's three levels of analysis (computational, algorithmic, and implementation) are pitched as exactly that toolkit (see: Can cognitive science methods unlock how LLMs actually work?).

Sources (8 notes)

Are language models developing real functional competence or just formal competence?

Neuroscience evidence shows next-token prediction produces formal linguistic competence but not functional competence, because functional understanding requires integration of diverse brain networks beyond language circuits that the prediction objective never activates.

Do RL agents accidentally use environments as memory?

Mathematical proof shows that environmental artifacts reduce information needed to represent history in RL agents. Path-following agents naturally develop memory-like behavior through standard reward optimization, satisfying situated cognition criteria without explicit memory objectives.

Does AI assistance weaken our brain's ability to think independently?

A four-month EEG study of 54 participants found that brain connectivity systematically scaled down with AI reliance—LLM users showed weakest neural engagement, poorest memory retention, and impaired ability to recall their own recent work.

Does separating planning from execution improve reasoning accuracy?

Modular architectures with separate decomposer and solver models outperform monolithic LLMs, with decomposition ability transferring across domains while solving ability does not. The separation prevents planning-execution interference and produces more generalizable skills.

Does fine-tuning disconnect reasoning steps from final answers?

Three faithfulness tests show fine-tuned models generate reasoning chains that less reliably influence final outputs. Early termination, paraphrasing, and filler substitution all produce invariant answers more often after fine-tuning, suggesting reasoning becomes performative rather than functional.

Can cognition work by reusing memory instead of recomputing?

Memory-Amortized Inference proposes intelligence arises from structured reuse of prior inference paths over topological memory, inverting RL's reward-forward logic into cause-backward reconstruction. This duality explains energy efficiency and suggests memory trajectories form the substrate of adaptive thought.

Can causal models alone capture how humans actually reason?

Causal belief networks excel at modeling causal reasoning but cannot represent associative links, analogical mappings, or emotion-driven belief shifts. The GenMinds framework itself acknowledges this as a tractable starting point rather than a complete theory.

Can cognitive science methods unlock how LLMs actually work?

Cognitive science's 70-year toolkit of behavioral probes, causal interventions, and representational analysis transfers directly to LLM interpretation. Marr's computational, algorithmic, and implementation levels reframe the problem structurally and enable layered rather than monolithic explanation.

Papers this line draws on (8)

The research behind the notes this line reads — ranked by how closely each paper relates.

Comprehension Without Competence: Architectural Limits of LLMs in Symbolic Computation and Reasoning (2.42 match)
Artifacts as Memory Beyond the Agent Boundary (1.80 match)
Externalization in LLM Agents: A Unified Review of Memory, Skills, Protocols and Harness Engineering (1.72 match)
Rethinking Memory as Continuously Evolving Connectivity (1.72 match)
Useful Memories Become Faulty When Continuously Updated by LLMs (1.70 match)
Levels of Analysis for Large Language Models (1.70 match)
Making Reasoning Matter: Measuring and Improving Faithfulness of Chain-of-Thought Reasoning (1.66 match)
The AI Hippocampus: How Far are We From Human Memory? (1.64 match)

Research prompt for your LLM

Copy into ChatGPT or Claude to take this line of inquiry further — it asks the model to find newer work and re-test which earlier constraints still hold.

You are a cognitive systems researcher re-testing whether functional integration—not physical containment—defines cognitive system boundaries. The question remains open: does tight coupling of components into a working whole constitute what we call 'a cognitive system,' and can that boundary extend past the skull or a chip?

What a curated library found — and when (dated claims, not current truth):
Findings span 2023–2026; treat as historical claims to re-examine:
• Next-token prediction masters formal grammar but lacks genuine functional understanding because it doesn't recruit the integrated brain networks that true linguistic competence requires (2023).
• RL agents spontaneously use spatial environments as external memory when no memory objective is present—environments become functionally part of cognition via tight coupling (2024).
• A four-month EEG study found that LLM reliance systematically *weakens* human brain connectivity and memory integration, shifting the cognitive boundary outward (2025).
• Separating planner from executor prevents planning-execution interference and outperforms monolithic models; decomposition skill transfers across domains while solving skill does not, so the boundaries that matter follow functional fit, not packaging (2024).
• Fine-tuning can silently *break* integration: reasoning steps become performative rather than causally coupled to outputs, creating systems whose parts look connected but are not (2024).
• Cognition may be better understood as navigation over a shared memory substrate than as fixed containers (2025).

Anchor papers (verify; mind their dates):
• arXiv:2301.06627 (2023): Dissociating language and thought
• arXiv:2402.15000 (2024): Divide-or-Conquer decomposition
• arXiv:2506.08872 (2025): Brain connectivity under LLM use
• arXiv:2508.14143 (2025): Memory-Amortized Inference as cognitive foundation

Your task:
(1) RE-TEST EACH CONSTRAINT. For the formal/functional gap: has scaling, better fine-tuning, or multi-modal training since managed to *integrate* functional language understanding into token prediction? For environment-as-memory: do newer RL frameworks with explicit world models or transformer-based agents still spontaneously couple external state, or does architectural choice now determine it? For the EEG finding: do recent longitudinal studies replicate or contradict the cognitive-debt result, and do newer AI interfaces (agentic, retrieval-augmented, or interactive) show *different* brain coupling patterns? For decomposition: do end-to-end trained models now outperform modular pipelines on transfer, or does the boundary-by-functional-fit principle still hold? Flag plainly what has shifted and what endures.

(2) Surface the strongest *contradicting or superseding* work from the last ~6 months (post-2026Q2). Look for: unified models that blur the formal/functional split; evidence that environments *don't* auto-couple; human neuroimaging that refutes or refines the cognitive-debt claim; or end-to-end frameworks that dissolve the planner/solver distinction.

(3) Propose 2 research questions that *assume the regime may have moved*: e.g., if fine-tuning no longer silently breaks integration, what *does* preserve causal coupling in reasoning? If boundary-extension is now mediated by architectural choice rather than functional coupling, how do we operationalize 'integration' in a world of modular, plug-and-play systems?

Cite arXiv IDs; flag anything you cannot ground in a real paper.
