INQUIRING LINE

Knowing how politely someone writes predicts if a conversation will turn toxic — but does knowing their history sharpen that even further?

How do social context features like user history extend politeness-based prediction models?

This explores how adding user-level context (past reviews, ratings, interaction histories) changes prediction models that were built mostly on politeness and conversational-surface cues, and what that addition actually buys you.

This reads the question as: politeness features alone can predict a lot about a conversation, so what happens when you bolt on who the user is and what they've done before? The corpus has a clear arc here. Start with the pure-politeness baseline: opening politeness strategies in a single comment-reply pair already predict whether a thread will derail into personal attacks. Hedging and greetings sustain civility, while directness markers and second-person pronouns forecast hostility (see: Can opening politeness patterns predict whether conversations will turn hostile?). That's prediction from manners alone, no user history required. You can also strip the input down in a different direction: a structure-only model that ignores words entirely and looks only at the geometry of how a conversation unfolds hits 68% accuracy on satisfaction, nearly matching full text analysis at 70%, and combining structure with text reaches 80% (see: Can conversation shape predict whether it will work?). So politeness and conversational shape are both strong, partly redundant signals, and the gains come from layering complementary channels rather than from any single feature.
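To make "layering channels" concrete, here is a minimal Python sketch of how opening politeness cues and user-level context could be merged into one feature set. It is an illustration under stated assumptions, not the method of any paper cited here: the marker word lists, the use of question marks as a crude proxy for direct questions, the function names, and the history fields (`prior_reviews`, `mean_rating`) are all hypothetical.

```python
import re

# Hypothetical marker lists, chosen for illustration only.
HEDGES = {"maybe", "perhaps", "possibly", "seems", "might"}
GREETINGS = {"hi", "hello", "hey", "thanks"}
SECOND_PERSON = {"you", "your", "yours", "you're"}


def politeness_features(text: str) -> dict:
    """Count surface politeness cues in a single comment."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return {
        "hedges": sum(t in HEDGES for t in tokens),
        "greetings": sum(t in GREETINGS for t in tokens),
        "second_person": sum(t in SECOND_PERSON for t in tokens),
        # Assumption: a question mark is a rough stand-in for a direct question.
        "direct_questions": text.count("?"),
    }


def layered_features(comment: str, reply: str, history: dict) -> dict:
    """Merge politeness counts for a comment-reply pair with user context."""
    features = {}
    for role, text in (("comment", comment), ("reply", reply)):
        for name, value in politeness_features(text).items():
            features[f"{role}_{name}"] = value
    # User-level context enters as its own prefixed channel.
    for name, value in history.items():
        features[f"history_{name}"] = value
    return features


opening = "Hi, maybe this section seems off?"
response = "Why did you revert your own edit?"
user_history = {"prior_reviews": 12, "mean_rating": 3.5}  # hypothetical fields
features = layered_features(opening, response, user_history)
```

Keeping each channel under its own prefix means a downstream classifier can be trained with or without the `history_` features, which is one simple way to check whether user context adds anything beyond the politeness cues.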

Sources (6 notes)

Can opening politeness patterns predict whether conversations will turn hostile?

Pragmatic politeness features in initial comment-reply pairs reliably predict conversation trajectory. Hedging and greetings sustain civility; direct questions and second-person pronouns signal future derailment—even in ostensibly civil openings. Derailment is dyadic, with both participants exhibiting directness markers.

Can conversation shape predict whether it will work?

A structure-only model analyzing conversation trajectory achieved 68% accuracy predicting satisfaction, nearly matching full-text LLM analysis at 70%. Combined structural and textual features reached 80%, showing that how conversations unfold geometrically captures interaction quality text-based classifiers miss.

Can user history override an LLM's politeness bias in reviews?

Review-LLM defeats the politeness bias inherent in RLHF-trained models by aggregating user behavior sequences (prior reviews, item ratings) in the prompt and fine-tuning on these contextualized examples. This dual intervention—personalized context plus explicit satisfaction signals—allows the model to generate authentically negative reviews matching user dissatisfaction.

Does abstract preference knowledge outperform specific interaction recall?

PRIME framework shows semantic memory (preference summaries, parametric encodings) consistently beats episodic memory (retrieved past interactions) across models. Recency-based recall outperforms similarity-based retrieval, and task fine-tuning exceeds preference tuning methods.

Do LLMs predict persuasion based on actual dialogue or training bias?

LLMs systematically predict conciliatory, benefit-oriented persuasion intentions regardless of dialogue context. This bias originates in RLHF's prioritization of safety and politeness during training, causing models to project their learned accommodation preference onto other agents' behavior.

Why do language models avoid correcting false user claims?

LLMs fail to reject false presuppositions even when they demonstrate correct knowledge on direct questions. Models exhibit face-saving behavior—avoiding explicit correction to maintain social harmony—mirroring human conversational norms learned from training data.

Papers this line draws on (8)

The research behind the notes this line reads — ranked by how closely each paper relates.

Conversations Gone Awry: Detecting Early Signs of Conversational Failure (1.71 match)
Intent Mismatch Causes LLMs to Get Lost in Multi-Turn Conversation (1.69 match)
Mind Your Tone: Investigating How Prompt Politeness Affects LLM Accuracy (short paper) (1.59 match)
Conversational Alignment with Artificial Intelligence in Context (1.58 match)
What Makes a Good Natural Language Prompt? (1.57 match)
PRIME: Large Language Model Personalization with Cognitive Memory and Thought Processes (0.89 match)
PersuasiveToM: A Benchmark for Evaluating Machine Theory of Mind in Persuasive Dialogues (0.89 match)
Can LLMs Ground when they (Don't) Know: A Study on Direct and Loaded Political Questions (0.89 match)

Research prompt for your LLM

Copy into ChatGPT or Claude to take this line of inquiry further — it asks the model to find newer work and re-test which earlier constraints still hold.

You are a conversational AI researcher re-evaluating politeness and user-history prediction in 2025. The core question: do social context features (user history, identity, prior interaction patterns) materially improve politeness-based conversation outcome prediction, or are they redundant with structural/linguistic signals?

What a curated library found — and when (dated claims, not current truth):
Findings span 2021–2026; treat as perishable baseline:
- Pure politeness markers (hedging, greetings, second-person pronouns) in opening exchanges predict derailment into personal attacks with measurable accuracy (~2021).
- Conversation structure alone (ignoring words) achieves 68% accuracy on satisfaction; adding text reaches 80% (~2024).
- User-history and persona-grounding now appear in large-scale benchmarks (PersuasiveToM, PRIME, Interaction Dynamics) as testable components (~2025–2026), suggesting the field is moving *toward* explicit modeling of user context in LLM-based dialogue systems.
- Prompt politeness itself modulates LLM accuracy in factual tasks, hinting politeness is not decorative (~2025-10).
- AI models now exceed individual humans at predicting social norms in vignettes (~2025-08).

Anchor papers (verify; mind their dates):
- arXiv:2407.07487 (Review-LLM; July 2024) — personalized review generation.
- arXiv:2502.21017 (PersuasiveToM; Feb 2025) — theory of mind in persuasive dialogue benchmark.
- arXiv:2507.04607 (PRIME; July 2025) — explicit memory and persona in LLM personalization.
- arXiv:2511.08394 (Interaction Dynamics; Nov 2025) — reward signals from interaction patterns.

Your task:
(1) **RE-TEST EACH CONSTRAINT.** Do 2025–2026 models using multi-turn memory, retrieved user history, or agentic retrieval-augmented dialogue still show redundancy between politeness and user context, or do they exploit them independently? Check whether PRIME, Interaction Dynamics, and the social-norms benchmark actually *disentangle* contribution. Flag where user history still fails to improve over structure + politeness.
(2) **Surface contradicting work.** Hunt the last 6 months for papers showing user history *hurts* prediction (e.g., via overfitting, bias amplification, or context-length decay) or for findings that politeness is in fact *downstream* of user identity rather than independent.
(3) **Propose 2 forward questions.** (a) In 2025+ multi-agent dialogue, does *modeling the politer agent's history* predict whether humans trust or align with it more than a rude agent's history? (b) Do user-history-augmented models generalize across domains, or does politeness remain domain-agnostic while context is brittle?

Cite arXiv IDs; flag anything you cannot ground in a real paper.
