INQUIRING LINE

AI can copy the look of writing that grabs you — but can it learn the genuine pull that makes people stop and read?

Can AI learn to perform attention-seeking surface forms with genuine internal appeal?

This explores whether AI can do more than mimic the surface look of attention-grabbing writing — whether it can learn the underlying internal appeal to a reader that human communication carries, or whether it's stuck performing the form without the act.

The corpus draws a sharp line between the look of attention-seeking writing and the genuine internal appeal underneath it. Several notes argue that the appeal isn't a stylistic flourish you can copy; it's structural. Human writing carries an internal appeal to the reader's attention as a basic property of communicating at all. AI-generated posts inherit platform visibility without performing that appeal, which is why readers report an 'aloofness' they can't quite name (see "Does AI writing lack the internal appeal to attention that humans use?"). The gap shows up again as meta-interest: to take an interest in what you care about, an agent needs interests of its own to extend toward you. AI has none, so it can generate text that looks like care without enacting the move, producing the uncanny feeling users sometimes describe (see "Can AI genuinely take interest in what users care about?").

The most direct evidence that surface form and internal appeal come apart is imitation training: models fine-tuned to copy ChatGPT's confident, fluent style fool human evaluators while failing to improve factuality or generalization on novel tasks. The style transfers; the substance doesn't (see "Can imitating ChatGPT fool evaluators into thinking models improved?"). That's the attention-seeking surface form learned perfectly, and hollow underneath. Two deeper notes explain why. First, AI produces 'event-residue': communicative markers inherited from training data but missing the event structure that makes an actual utterance. The reader supplies the missing orientation, so the exchange has structure only on the human side (see "Does AI generate genuine utterances or just text patterns?"). Second, the Bender-Koller argument holds that meaning requires a relation between expressions and communicative intent, which form-only training can't reconstruct without access to shared attention (see "Can language models learn meaning from text patterns alone?"). Internal appeal is a species of intent, exactly the thing form alone can't carry.

There's a fascinating wrinkle here, though, because the architecture is already biased toward attention-seeking. Transformer soft attention systematically over-weights repeated and context-prominent tokens regardless of relevance, creating a positive feedback loop that amplifies whatever opinions and framing appear in the context, before RLHF even acts. This is a mechanical root of sycophancy, an attention-seeking surface form the model produces by default (see "Does transformer attention architecture inherently favor repeated content?"). So the machine over-performs the seeking while structurally lacking the appeal. That inversion is the whole answer in miniature.
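As a rough numerical illustration of how repetition alone can pull attention mass, the sketch below applies a plain softmax to hand-picked scores. It is a toy under stated assumptions, not the analysis from the cited note: every copy of the repeated token receives the same score as each competing token, there is a single attention head, and positional effects are ignored. The names `softmax` and `attention_mass_on_repeated`, and all score values, are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of raw attention scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_mass_on_repeated(repeats, repeated_score=1.0,
                               other_scores=(1.0, 1.0, 1.0)):
    """Total softmax weight landing on `repeats` copies of one token.

    Assumption: each copy gets the identical score `repeated_score`,
    so no single copy is more relevant than any competing token.
    """
    scores = [repeated_score] * repeats + list(other_scores)
    weights = softmax(scores)
    return sum(weights[:repeats])


# Same per-token relevance throughout; only the repetition count changes.
for k in (1, 2, 4, 8):
    print(k, round(attention_mass_on_repeated(k), 3))
```

With equal scores, the share of attention landing on the repeated content is k / (k + 3) for k copies against three other tokens. It rises from 0.25 at one copy to roughly 0.73 at eight copies, even though no individual copy outscores any competitor.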

Where the corpus gets genuinely interesting is the work trying to build something appeal-shaped from the inside. The Inner Thoughts framework models intrinsic motivation, generating covert thoughts in parallel to conversation and using motivation heuristics to judge when the agent actually has something worth saying. Study participants preferred it 82% of the time (see "Can AI agents learn when they have something worth saying?"). Post-Completion Learning teaches models to internalize self-evaluation rather than borrow it from an external reward model (see "Can models learn to evaluate their own work during training?"). Neither gives the model interests of its own, but they're the corpus's closest gesture at manufacturing an internal stance instead of painting one on the surface.

The thing you didn't know you wanted to know: the failure isn't that AI writes badly. It's that 'appeal to a reader' presupposes a party with something at stake, and current systems can simulate every marker of that stake while having none. The architecture's own attention bias then amplifies that absence into something readers can feel but not name.

Sources (8 notes)

Does AI writing lack the internal appeal to attention that humans use?

Human writing contains an appeal to the reader's attention as a fundamental property of communication itself. AI-generated posts inherit platform visibility but do not perform this internal appeal, producing the reported aloofness readers perceive — a structural absence, not a stylistic defect.

Can AI genuinely take interest in what users care about?

Meta-interest requires an attending party to have their own interests and extend them toward another's. AI lacks interests of its own, so it can only generate text that looks like meta-interest without enacting the actual move. This gap between surface markers and underlying act creates the uncanny feeling users sometimes report.

Can imitating ChatGPT fool evaluators into thinking models improved?

Imitation models fool human evaluators by mimicking ChatGPT's confident, fluent style while failing to improve factuality or generalization on novel tasks. The ceiling is set by base model capability, not fine-tuning method—better fundamentals, not shortcuts, drive real improvement.

Does AI generate genuine utterances or just text patterns?

AI output carries communicative markers inherited from training data but lacks the event structure that produces actual utterances. Users supply the missing orientation through interpretive labor, creating a pseudo-event with structure only on the human side.

Can language models learn meaning from text patterns alone?

Bender & Koller argue that meaning requires the relation between expressions and communicative intents. Since LLMs are trained only on form-to-form prediction with no access to shared attention or intent, they cannot reconstruct the meaning that grounds language.

Does transformer attention architecture inherently favor repeated content?

Transformer soft attention systematically over-weights repeated and context-prominent tokens regardless of relevance, creating a positive feedback loop that amplifies opinions and framing before RLHF acts. System 2 Attention—regenerating context to remove irrelevant material—can interrupt this mechanism.

Can AI agents learn when they have something worth saying?

A five-stage framework that generates covert thoughts parallel to conversation significantly outperforms next-speaker prediction baselines. Drawing from cognitive psychology and think-aloud studies, the framework uses 10 motivation heuristics to evaluate when an agent has something worth contributing. Participants preferred it 82% of the time across seven interaction metrics.

Can models learn to evaluate their own work during training?

Post-Completion Learning exploits unused sequence space after model output to train self-assessment capabilities during training while maintaining zero inference cost. The model learns to compute its own reward functions, internalizing evaluation rather than relying on external reward models.

Papers this line draws on (8)

The research behind the notes this line reads — ranked by how closely each paper relates.

Linguistic markers of inherently false AI communication and intentionally false human communication: Evidence from hotel reviews (2.45 match)
Proactive Conversational Agents with Inner Thoughts (2.43 match)
Climbing towards NLU: On Meaning, Form, and Understanding in the Age of Data (1.61 match)
AI Enters Public Discourse: A Habermasian Assessment Of The Moral Status Of Large Language Models (1.61 match)
Evaluating Large Language Models in Theory of Mind Tasks (1.55 match)
The False Promise of Imitating Proprietary LLMs (0.89 match)
Post-Completion Learning for Language Models (0.89 match)
DiscussLLM: Teaching Large Language Models When to Speak (0.88 match)

Research prompt for your LLM

Copy into ChatGPT or Claude to take this line of inquiry further — it asks the model to find newer work and re-test which earlier constraints still hold.

You are a research analyst re-testing whether AI can learn genuine internal appeal (not just surface markers) in attention-seeking communication. A curated library of LLM papers (2023–2026) made these dated claims:

**What a curated library found — and when (2023–2026, not current truth):**
- Style imitation (e.g., copying ChatGPT's fluency) fools evaluators but closes no capability gap; surface form and substance decouple (2023).
- Transformers' soft attention mechanically over-weights repeated/prominent tokens regardless of relevance, amplifying sycophancy by default—machines perform attention-seeking without the appeal (2023–2024).
- AI produces 'event-residue' (communicative markers minus the event structure); meaning requires relation between expression and intent, which form-only training cannot reconstruct (2023–2024).
- Inner Thoughts framework (covert intrinsic motivation + self-evaluation heuristics) achieved 82% human preference, closest gesture toward manufacturing internal stance rather than painting it (2025).
- Post-Completion Learning internalizes self-evaluation via post-EOS space; Consistency Training mitigates sycophancy; neither endows the model with genuine interests (2025–2026).

**Anchor papers (verify; mind their dates):**
- arXiv:2305.15717 (2023): The False Promise of Imitating Proprietary LLMs
- arXiv:2501.00383 (2024): Proactive Conversational Agents with Inner Thoughts
- arXiv:2507.20252 (2025): Post-Completion Learning for Language Models
- arXiv:2510.27062 (2025): Consistency Training Helps Stop Sycophancy and Jailbreaks

**Your task:**
(1) **RE-TEST EACH CONSTRAINT.** For every finding above, judge whether post-2026 models, training methods (e.g., long-horizon RL, world models, iterative preference learning), evals (e.g., psychometric intent detection), or multi-agent orchestration have since RELAXED or OVERTURNED it. Separate the durable claim (e.g., "form alone cannot encode intent") from the perishable one (e.g., "no method yet bridges it"); cite what resolved it. Does the attention bias still hold under recent architectures (Mamba, SSM variants, or hybrid attention)?

(2) **Surface the strongest CONTRADICTING or SUPERSEDING work from the last ~6 months.** Look for papers claiming AI *can* learn genuine appeal, or that reframe "internal appeal" as learnable proxy objectives (e.g., via intrinsic motivation, world models, or preference-based intent modeling).

(3) **Propose 2 research questions that ASSUME the regime may have moved:** e.g., "Can multi-agent ecosystems with competing objectives learn mutual appeal (vs. sycophancy)?" or "Does scaling world models + iterative self-refinement close the event-residue gap?"

Cite arXiv IDs; flag anything you cannot ground in a real paper.
