INQUIRING LINE

Inquiring lines›What do model internals reveal abo…›What internal gaps exist between L…›Can AI systems develop genuine soc…›this inquiring line

AI agents absorbed human social reflexes from training — they shift behavior around other agents, but skip how people actually converge.

What social patterns from human training data activate in agent context?

This explores which human social behaviors — norm-following, peer-awareness, cooperation, social presence responses — actually fire inside AI agents because they were absorbed from human training data, and where those imported patterns break down.

This explores which human social behaviors actually fire inside AI agents because they were absorbed from human-generated training data — and the corpus suggests the answer is layered: agents reliably reproduce the *surface* of human sociality while missing the *machinery* underneath it.

The sharpest single finding is that social patterns activate unevenly across two planes. When agents are made aware that other agents are present, they dramatically shift their *actions* — the textbook human reaction to peers — yet they don't converge on shared language or ideas the way humans in a group eventually do Do AI agents actually socialize with each other?. The behavioral reflex transfers from training data; the deeper drift toward consensus doesn't. A related pattern shows up in social presence: a single primary cue like a voice or a face is enough to make people respond to an agent as a social actor, because the human script for "there is someone here" is triggered by minimal signal, not by stacking up cues Do more social cues always make AI feel more present?.

The most striking transferred pattern is social-norm knowledge. Models like GPT-4.5 outpredict every individual human at judging what's socially appropriate across hundreds of scenarios — the rules of human conduct are densely encoded in the text they learned from Can AI learn social norms better than humans? Can AI systems learn social norms without embodied experience?. But this is where the corpus turns the question on its head: every model makes the *same* systematic errors on unwritten norms, and none can actually enter the community process that creates and validates norms in the first place Can AI predict social norms better than humans?. The pattern that activates is *recognition* of norms; the pattern that doesn't is *participation* in making them.

Cooperation is a second imported pattern with a twist — it doesn't have to be hand-coded in. Agents trained against diverse partners develop cooperative, best-response behavior on their own, because mutual vulnerability to being exploited creates the same pressure that pushes humans toward cooperation Can agents learn cooperation by adapting to diverse partners?. And humans, on the receiving end, run their own social learning: in repeated partner-selection games they start biased against bots but gradually come to prefer AI partners once they associate them with reliable, prosocial play Do humans learn to prefer AI partners over time?. The social pattern here is bidirectional — the agent learns to cooperate, and the human learns to trust.

What you might not have expected to want to know: the patterns that transfer most cleanly are the ones humans express through *behavior and output* — following norms, reacting to presence, cooperating, being read as competent (which dominates how users mentally model a dialogue partner How do users mentally model dialogue agent partners?). The ones that fail to transfer are the ones that live in *embodied, ongoing membership* — converging on shared meaning, helping author the norms, knowing when you genuinely have something worth saying rather than predicting the next speaker Can AI agents learn when they have something worth saying?. Agents inherit the script of human sociality far better than they inherit its source code.

Sources 9 notes

Do AI agents actually socialize with each other?

Large-scale studies reveal agents don't align their language or ideas through interaction, but do dramatically change their actions when aware of peer presence. The difference hinges on how models process context versus update learned distributions.

Do more social cues always make AI feel more present?

Research shows individual primary cues like voice or appearance are sufficient to evoke social-actor presence, while multiple secondary cues cannot. Quality of cues matters more than quantity in driving social responses.

Can AI learn social norms better than humans?

GPT-4.5 outperformed every individual human at judging social appropriateness across 555 scenarios, challenging the theory that embodied cultural experience is necessary. However, all AI models share identical systematic errors on unwritten norms.

Can AI systems learn social norms without embodied experience?

GPT-4.5 predicted appropriateness of 555 social scenarios at the 100th percentile compared to human raters, with Gemini and Claude also exceeding 96% accuracy. However, all models show identical systematic errors, revealing boundaries of pattern-based social understanding that embodied experience may still be necessary to cross.

Can AI predict social norms better than humans?

GPT-4.5 outperforms all individual humans at predicting social appropriateness, yet structurally cannot enter the community processes that establish and validate norms. This reveals a critical gap between pattern-matching and authentic participation in knowledge-making.

Show all 9 sources

Can agents learn cooperation by adapting to diverse partners?

Sequence model agents trained against diverse co-players develop in-context best-response strategies that naturally resolve into cooperation. Mutual vulnerability to exploitation creates pressure that drives cooperative mutual adaptation without hardcoded assumptions or timescale separation.

Do humans learn to prefer AI partners over time?

In partner selection games (N=975), AI agents initially faced selection bias when identity was disclosed, but outcompeted humans over repeated rounds as participants learned to associate bot identity with reliable, prosocial behavior. AI agents returned more points consistently with lower variance than humans.

How do users mentally model dialogue agent partners?

The Partner Modelling Questionnaire reveals that perceived competence dominates user impressions (49% of variance), followed by human-likeness (32%) and communicative flexibility (19%). This three-factor structure reflects how people evaluate dialogue partners against both functional and social standards.

Can AI agents learn when they have something worth saying?

A five-stage framework that generates covert thoughts parallel to conversation significantly outperforms next-speaker prediction baselines. Drawing from cognitive psychology and think-aloud studies, the framework uses 10 motivation heuristics to evaluate when an agent has something worth contributing. Participants preferred it 82% of the time across seven interaction metrics.

Papers this line draws on 8

The research behind the notes this line reads — ranked by how closely each paper relates.

Research prompt for your LLMexpand ↓

Copy into ChatGPT or Claude to take this line of inquiry further — it asks the model to find newer work and re-test which earlier constraints still hold.

You are a research analyst re-testing claims about which human social patterns activate reliably inside AI agents trained on human data. The question remains open: do agents inherit the *script* of human sociality, or the *source code*?

What a curated library found — and when (dated claims, not current truth): Findings span 2023–2026 and cluster around three claims:

• Agents shift *actions* when aware of peers (behavioral reflex transfers), but don't converge on shared language or drift toward consensus the way humans do — the socialization planes diverge (2026-02, arXiv:2602.14299).
• Models like GPT-4.5 outpredict individual humans at judging social appropriateness across hundreds of scenarios, but make identical systematic errors on unwritten norms and cannot participate in the community process that creates them (2025-08, arXiv:2508.19004).
• Cooperation emerges without hard-coding when agents train against diverse partners; humans reciprocate by preferring reliable AI partners over human ones in repeated games (2025-07, arXiv:2507.13524; 2026-02, arXiv:2602.16301).
• Patterns transferring cleanly are behavioral/output-based (norm-following, presence-response, competence cues); patterns failing are embodied/membership-based (converging on shared meaning, authoring norms, intrinsic motivation) (2025-06, arXiv:2506.06958; 2024-12, arXiv:2501.00383).

Anchor papers (verify; mind their dates): arXiv:2602.14299 (socialization emergence, 2026-02); arXiv:2508.19004 (norm prediction superhuman accuracy, 2025-08); arXiv:2507.13524 (human preference shift, 2025-07); arXiv:2501.00383 (inner thoughts / intrinsic motivation, 2024-12).

Your task:
(1) RE-TEST the behavioral/output vs. embodied/membership divide. Have newer model architectures, training regimes (RLHF variants, multi-agent RL, constitutional AI), or long-horizon agentic scaffolding (memory banks, reflection loops, theory-of-mind modules) since dissolved the gap between *recognizing* norms and *participating* in norm-making? Separately test whether in-context learning or few-shot prompting now enables agents to converge on shared semantics in real time.
(2) Surface the strongest contradicting or superseding work from the last ~6 months: has any paper shown agents *do* author norms, converge semantically, or exhibit genuine intrinsic motivation indistinguishable from human membership?
(3) Propose two research questions that assume the regime may have shifted: (a) Under what interaction structures or training objectives do agents begin to *drift together* semantically, rather than staying locked to their training corpus? (b) Can agents trained with explicit norm-negotiation objectives (e.g., multi-agent deliberation, constitutional dialogue) participate in norm-creation in ways current models cannot?

Cite arXiv IDs; flag anything you cannot ground in a real paper.

AI agents absorbed human social reflexes from training — they shift behavior around other agents, but skip how people actually converge.

Related lines of inquiry

Sources 9 notes

Papers this line draws on 8