How do time gaps shape what people discuss across conversation sessions?
Do AI systems account for how elapsed time between conversations changes the way people reference and discuss past events? Current models mostly handle single sessions, but real interactions span days, weeks, and months.
Most chatbot research focuses on single-session dialogue, generating responses from the current conversation alone. Real-world interactions are multi-session: people return to the same AI system over long periods. Two factors shape these cross-session dynamics, and current models largely ignore both:
Time intervals between sessions influence how past events are discussed. A conversation about yesterday's meeting differs from one about a meeting three months ago in specificity, emotional tone, and relevance. Previous multi-session datasets covered relatively short time ranges, which limited the kinds of temporal transitions they could capture.
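One way to make elapsed time visible to a model is to turn the gap between two sessions into a coarse label before building the prompt. The sketch below is illustrative only: the function name `describe_gap`, the bucket boundaries, and the label wording are all assumptions, not taken from any dataset or model discussed in this note.

```python
from datetime import datetime, timedelta


def describe_gap(previous: datetime, current: datetime) -> str:
    """Map the time between two sessions to a coarse, human-readable label."""
    gap = current - previous
    if gap < timedelta(0):
        raise ValueError("current session precedes previous session")
    # Bucket boundaries are hypothetical; tune them to the application.
    if gap < timedelta(days=1):
        return "a few hours later"
    if gap < timedelta(days=7):
        return "a few days later"
    if gap < timedelta(days=30):
        return "a few weeks later"
    if gap < timedelta(days=365):
        return "a few months later"
    return "a year or more later"
```

A label like this can be placed between session summaries so the model sees temporal distance explicitly rather than having to infer it.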
Speaker relationships evolve across sessions. The degree of formality, assumed shared knowledge, and topical expectations shift as interactions accumulate. Capturing this requires fine-grained relationship modeling: not just "stranger" versus "friend", but the specific history of a given relationship.
The Conversation Chronicles dataset (1M dialogues) addresses both gaps, using LLM generation with human evaluation to ensure coherent and consistent interactions across sessions. The REBOT model introduces chronological summarization, which processes past-session context in temporal order before dialogue generation, with a model of about 630M parameters.
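The idea of feeding a generator a time-ordered digest of earlier sessions can be sketched as follows. This is not REBOT's implementation: the `SessionSummary` type, the `build_chronological_context` function, and the prompt layout are hypothetical names and choices made for illustration, and the summaries themselves are assumed to come from a separate summarization step.

```python
from dataclasses import dataclass


@dataclass
class SessionSummary:
    summary: str      # short recap of one past session
    gap_to_next: str  # elapsed time until the following session, e.g. "a few days"


def build_chronological_context(relationship: str, history: list[SessionSummary]) -> str:
    """Lay out past sessions in time order, with the gap after each one."""
    lines = [f"Relationship: {relationship}"]
    for index, session in enumerate(history, start=1):
        lines.append(f"Session {index}: {session.summary}")
        lines.append(f"[{session.gap_to_next} later]")
    # The dialogue model would continue generating from this point.
    lines.append("Current session:")
    return "\n".join(lines)
```

Keeping the relationship and each time gap as explicit lines means both signals survive even when the session summaries are heavily condensed.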
This connects to the broader context management challenge. Models already struggle when information is revealed gradually within a single conversation (see "Why do language models fail in gradually revealed conversations?"), and the multi-session case is even harder: the model must track not just within-session context but also cross-session continuity, temporal distance, and relationship evolution.
The finding that current models have "limited ability that only understands short-term dialogue context" points to a structural gap, not a parameter gap. Adding more parameters or longer context windows does not by itself create sensitivity to temporal dynamics or relationship evolution.
COMEDY's compressive memory as implementation (2402.11975): The COMEDY framework directly addresses these temporal dynamics through compressive memory that tracks three dimensions across sessions: (1) concise event recaps forming a historical narrative, (2) detailed user portraits derived from conversational events, and (3) dynamic relationship changes between user and chatbot. This three-dimensional compression mirrors the temporal dynamics problem: event recaps capture what happened when, user portraits capture evolving preferences, and relationship dynamics capture the interpersonal evolution. Because COMEDY reprocesses and condenses all past memories rather than retrieving from a memory bank, it inherently prioritizes salient information, a structural advantage over retrieval-based approaches for multi-session continuity.
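The three dimensions described above map naturally onto a small record type. The sketch below is a minimal illustration, not COMEDY's actual code: `CompressedMemory`, `refresh_memory`, `render_memory`, and the `Compressor` callable are hypothetical names, and the compressor stands in for whatever model performs the condensation.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class CompressedMemory:
    event_recap: str    # what happened, as a short historical narrative
    user_portrait: str  # traits and preferences inferred from past events
    relationship: str   # current state of the user-chatbot relationship


# A compressor condenses all past session memories into one record (assumed to be a model call).
Compressor = Callable[[list[str]], CompressedMemory]


def refresh_memory(session_memories: list[str], compress: Compressor) -> CompressedMemory:
    """Recompress every stored session memory instead of retrieving a subset."""
    if not session_memories:
        return CompressedMemory("", "", "")
    return compress(session_memories)


def render_memory(memory: CompressedMemory) -> str:
    """Format the compressed memory as text that can be prepended to a prompt."""
    return (
        f"Events: {memory.event_recap}\n"
        f"User: {memory.user_portrait}\n"
        f"Relationship: {memory.relationship}"
    )
```

The design point is that `refresh_memory` always sees the full list of past memories, so deciding what is salient is left to the compressor rather than to a retrieval step.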
LOCOMO benchmark (2402.17753): The LOCOMO dataset provides the evaluation infrastructure for very long-term conversations: dialogues averaging 300 turns and 9K tokens across up to 35 sessions, grounded on personas and temporal event graphs. It goes beyond the Conversation Chronicles dataset by adding image sharing and reaction capabilities and human verification for long-range consistency.
Inquiring lines that use this note as a source 5
This note is a source for these synthesized inquiries.
- What makes human discourse fundamentally temporal in structure?
- How do time gaps between conversations change what chatbots should remember?
- What role do time intervals play in shaping conversation responses?
- What makes two conversation turns the same thread rather than different threads?
- How should AI systems model relationship evolution within a specific ongoing conversation history?
Related concepts in this collection 3
- Do chatbot relationships lose their appeal as novelty wears off?
  Explores whether the positive social dynamics observed in one-time chatbot studies persist or fade through repeated interactions. Critical for designing systems intended for sustained engagement over weeks or months.
  temporal dynamics of relationship formation across sessions
- Why do language models fail in gradually revealed conversations?
  Explores why LLMs perform 39% worse when instructions arrive incrementally rather than upfront, and whether they can recover from early mistakes in multi-turn dialogue.
  multi-session amplifies the multi-turn problem
- How should chatbot design vary by relationship duration?
  Do chatbots serving one-time users need different design than those supporting long-term relationships? This matters because applying the same design to all temporal profiles creates usability mismatches.
  the three temporal archetypes create different demands on cross-session dynamics: ad-hoc supporters have no temporal continuity, temporary assistants need medium-term consistency, and persistent companions require the full temporal modeling this note describes
Related papers in this collection 8
Papers most semantically related to this note, ranked by cosine similarity in the embedding space.
- Conversation Chronicles: Towards Diverse Temporal and Relational Dynamics in Multi-Session Conversations
- From speaking like a person to being personal: The effects of personalized, regular interactions with conversational agents
- Toward Conversational Agents with Context and Time Sensitive Long-term Memory
- Compress to Impress: Unleashing the Potential of Compressive Memory in Real-World Long-Term Conversations
- Evaluating Very Long-Term Conversational Memory of LLM Agents
- Conversational DNA: A New Visual Language for Understanding Dialogue Structure in Human and AI
- The Emotion-Memory Link: Do Memorability Annotations Matter for Intelligent Systems?
- Summaries, Highlights, and Action items: Design, implementation and evaluation of an LLM-powered meeting recap system
Original note title
time intervals between conversation sessions create dynamics that single-session models miss — responses about past events vary based on elapsed time and speaker relationships