How does AI's inability to sustain temporal attention limit its capacity for expert roles?
This explores a philosophical claim — that AI can't truly be present "in time" with someone — and asks whether that gap undermines the kinds of roles (advisor, clinician, mentor) that depend on sustained attention over time, then checks what the corpus offers as workarounds.
This reads the question as connecting a deep claim about AI's mode of existence to a practical limit on expertise: if attention is really a matter of *being in time with* another person, what happens to roles that depend on it? The strongest statement of the premise is that AI has no existence in the gaps between turns. It doesn't wait, notice, or hold you in mind; it reconstructs the conversation from a context window each time it's called (Can AI attend to someone across the time between turns?). Expert roles (the doctor who tracks how you've changed over months, the teacher who remembers where you stumbled) are built precisely on that continuous holding. Surface markers of responsiveness aren't the same thing as attention.
The corpus shows this isn't only a metaphysical worry; it has a measurable behavioral shadow. AI assistants that score ~90% on a single well-formed instruction collapse to ~65% across a natural multi-turn conversation, because they lock onto early guesses and can't course-correct as information arrives gradually (Why do AI assistants get worse at longer conversations?). That's the inverse of expert judgment, which suspends conclusions and revises them as evidence comes in. And the inability to *initiate* (to follow up, to check in, to raise something unprompted) turns out to be baked in by next-turn reward optimization rather than by a lack of capability: proactive behaviors are trainable with reinforcement learning (Why do AI agents fail to take initiative?). An expert who only ever reacts within the current turn isn't holding a case over time.
Part of why temporal presence is hard is that the very ground AI stands on keeps shifting. Its context is mutable and ephemeral (prompt, history, retrieved data, and hidden state all churn), unlike the stable context of conventional software, which users can internalize and carry forward (How does AI context differ from conventional software context?). You can't sustain attention on a foundation that resets.
What's interesting is that a whole research thread is trying to engineer around exactly this gap, and the shape of those attempts tells you what "sustained attention" decomposes into. Some split memory by timescale: separating fast attention from a long-term neural memory that decides which surprising moments are worth keeping (Can neural memory modules scale language models beyond attention limits?), or formalizing agent memory into dialogue-level versus turn-level components with different update rules (How should agent memory split across time scales?). Others restructure reasoning itself so it can persist past context limits, using recursive subtask trees that prune working memory yet keep the thread coherent (Can recursive subtask trees overcome context window limits?). Each is, in effect, a prosthetic for continuity.
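To make the timescale split concrete, here is a minimal sketch in Python. It is a toy under stated assumptions, not the Titans or RAISE design: the class name `TwoTimescaleMemory`, the `surprise` score supplied by the caller, and the `surprise_threshold` cutoff are all hypothetical, chosen only to show turn-level state being cleared while a surprise-gated long-term store persists.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TwoTimescaleMemory:
    """Toy memory: a per-turn scratch area plus a persistent store.

    Hypothetical illustration only; not the mechanism of any cited system.
    """

    surprise_threshold: float = 0.5  # assumed cutoff for "worth keeping"
    long_term: List[str] = field(default_factory=list)   # survives across turns
    turn_level: List[str] = field(default_factory=list)  # cleared every turn

    def observe(self, item: str, surprise: float) -> None:
        """Record an item for this turn; keep it long-term only if surprising."""
        self.turn_level.append(item)
        if surprise >= self.surprise_threshold:
            self.long_term.append(item)

    def end_turn(self) -> None:
        """Drop turn-level state; only the long-term store carries forward."""
        self.turn_level.clear()


memory = TwoTimescaleMemory()
memory.observe("patient mentions a new medication", surprise=0.9)
memory.observe("patient says hello", surprise=0.1)
memory.end_turn()
kept = list(memory.long_term)  # what the next turn can still see
```

Nothing in this sketch runs between `end_turn` and the next `observe` call. It stores and retrieves, and that is all it does.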
But notice the tension the corpus leaves you with. These are mechanisms for *storing and retrieving* the past, not for *being present* across the interval, and the original claim is that those are different things (Can AI attend to someone across the time between turns?). A system can reconstruct everything you said last week and still not have been with you in the meantime. Where that distinction matters least, memory engineering may be enough to staff an expert role; where it matters most (therapy, mentorship, care), the limit may be structural rather than a problem of scale. The thing you might not have expected to learn: the hardest part of expertise to automate may not be the knowing, but the waiting.
Sources (7 notes)
Attention is fundamentally a being-in-time-with another person, but AI has no mode of existence in the intervals between turns. It reconstructs conversations from context windows rather than maintaining continuous attentional presence, making felt attention structurally impossible despite surface markers of responsiveness.
LLMs perform at 90% accuracy with single-message instructions but drop to 65% across natural conversation. Models lock into early guesses when information arrives gradually and cannot course-correct, a behavior induced by RLHF training that rewards helpfulness over clarification.
Research shows next-turn reward optimization structurally removes initiative from models, but proactive behaviors like critical thinking and clarification-seeking are trainable (0.15% to 73.98% with RL). The core challenge is balancing proactivity with civility to avoid intrusion.
AI interactions operate on a substrate of constantly shifting context—prompt, history, retrieved data, hidden state—that users cannot internalize like traditional UIs. This structural mutability demands a new design discipline centered on context engineering rather than interface design.
Titans architecture separates attention (short-term, quadratic) from neural memory (long-term, compressed), prioritizing surprising tokens for storage. The model outperforms standard Transformers and linear RNNs across tasks while scaling to 2M+ token contexts without quadratic penalties.
RAISE shows that agent memory consists of four components organized by two design axes: dialogue-level (conversation history, scratchpad) versus turn-level (examples, task trajectory). This granularity distinction predicts different failure modes and update policies for each component.
The Thread Inference Model demonstrates that reasoning structured as recursive subtask trees with rule-based KV cache pruning sustains accurate reasoning beyond context limits, even when manipulating 90% of the cache. This enables single models to replace multi-agent systems by handling full recursive reasoning internally.