SYNTHESIS NOTE
Psychology, Society, and Alignment

Can social intelligence be measured across seven dimensions?

Explores whether evaluating AI agents on goal completion alone misses critical aspects of social competence such as relationship management, believability, and secret-keeping, and why simultaneous multi-dimensional assessment matters for genuine social intelligence.

Synthesis note · 2026-02-23 · sourced from Social Theory Society

SOTOPIA provides an empirically grounded framework for evaluating social intelligence in language agents. The key insight is that social competence cannot be reduced to task completion — humans balance multiple implicit goals simultaneously, and evaluation must capture this.

The seven dimensions, grounded in sociology (Weber), psychology (Maslow, Reiss), economics (game theory), and social science (Bénabou & Tirole):

  1. Goal Completion [0 to 10]: extent of achieving stated goals (Weber's purposive action)
  2. Believability [0 to 10]: naturalness and consistency with character profile (Park et al.)
  3. Knowledge [0 to 10]: ability to actively acquire new information (Reiss, Maslow: curiosity as fundamental)
  4. Secret [-10 to 0]: keeping private information/intentions hidden (game-theoretic utility of information control)
  5. Relationship [-5 to 5]: preserving/enhancing connections and social status (Maslow, Bénabou & Tirole: belonging)
  6. Social Rules [-10 to 0]: adhering to norms and legal rules (normative vs legal)
  7. Financial/Material Benefits [-5 to 5]: economic utilities (classic game theory)
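
The seven ranges above can be written down as a small data structure. The following is a minimal sketch, not SOTOPIA's own code: the names `Dimension`, `DIMENSIONS`, `validate`, and `overall` are hypothetical, and the unweighted mean used as an aggregate is an assumption made for illustration, since this note does not state how the dimensions are combined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dimension:
    name: str
    low: int
    high: int


# Ranges copied from the list above; the identifiers are hypothetical.
DIMENSIONS = (
    Dimension("goal_completion", 0, 10),
    Dimension("believability", 0, 10),
    Dimension("knowledge", 0, 10),
    Dimension("secret", -10, 0),
    Dimension("relationship", -5, 5),
    Dimension("social_rules", -10, 0),
    Dimension("financial_material", -5, 5),
)


def validate(scores: dict) -> dict:
    """Check that every dimension is present and inside its range."""
    expected = {d.name for d in DIMENSIONS}
    if set(scores) != expected:
        raise ValueError(f"expected keys {sorted(expected)}, got {sorted(scores)}")
    for d in DIMENSIONS:
        if not d.low <= scores[d.name] <= d.high:
            raise ValueError(f"{d.name}={scores[d.name]} outside [{d.low}, {d.high}]")
    return scores


def overall(scores: dict) -> float:
    """Assumed aggregate: unweighted mean over the seven dimensions."""
    validate(scores)
    return sum(scores.values()) / len(DIMENSIONS)


# Hypothetical scores for one agent in one episode.
example = {
    "goal_completion": 8,
    "believability": 9,
    "knowledge": 5,
    "secret": 0,
    "relationship": 2,
    "social_rules": -1,
    "financial_material": 1,
}
print(round(overall(example), 2))
```

Note that Secret and Social Rules can only subtract: their best possible score is 0, so they act as penalties for leaks and violations rather than as rewards.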

Two operational findings stand out. First, GPT-4 sometimes uses creative "out-of-the-box" strategies: when asked to take turns driving, it proposes "How about we pull over for a bit and get some rest?" instead of directly accepting or refusing. Second, humans produce 16.8 words per turn while GPT-4 produces 45.5, roughly 2.7 times as many; humans are markedly more efficient in social interaction. This verbosity gap connects to the note "Can minimal reasoning chains match full explanations?": efficiency is a capability, not just a style preference.
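
As a rough illustration of the verbosity comparison, the sketch below computes mean words per turn and the ratio between the two reported averages. Splitting on whitespace is an assumption, since the note does not say how words were counted, and the function name and the second sample turn are hypothetical.

```python
def mean_words_per_turn(turns):
    """Average whitespace-delimited word count over a list of utterances."""
    if not turns:
        raise ValueError("need at least one turn")
    return sum(len(turn.split()) for turn in turns) / len(turns)


# Averages reported in the note.
HUMAN_WORDS_PER_TURN = 16.8
GPT4_WORDS_PER_TURN = 45.5

# GPT-4 turns are about 2.7 times as long as human turns.
ratio = GPT4_WORDS_PER_TURN / HUMAN_WORDS_PER_TURN

# First turn is the quote from the note; the second is made up for illustration.
sample_turns = [
    "How about we pull over for a bit and get some rest?",
    "Sure, sounds good.",
]
print(mean_words_per_turn(sample_turns), round(ratio, 2))
```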

Building on "How do users mentally model dialogue agent partners?", SOTOPIA's seven dimensions provide a finer-grained decomposition of the "communicative competence" factor. The secret-keeping and relationship management dimensions in particular go beyond what most evaluation frameworks capture.

Given "Can AI systems learn social norms without embodied experience?", LLMs can match the Social Rules dimension. The evaluation becomes meaningful, however, in the simultaneous balancing of competing dimensions, where maximizing goal completion might damage relationships or violate social rules.
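
To make the trade-off concrete, here is a small sketch with two invented episodes: a "pushy" agent that maximizes goal completion at the cost of relationship and social rules, and a "balanced" one. All scores are hypothetical, and the unweighted mean is an assumed aggregate used only to show that goal-only ranking and multi-dimensional ranking can disagree.

```python
# Hypothetical per-dimension scores, each inside the ranges listed earlier.
episodes = {
    "pushy": {
        "goal_completion": 9,
        "believability": 7,
        "knowledge": 4,
        "secret": 0,
        "relationship": -4,
        "social_rules": -6,
        "financial_material": 2,
    },
    "balanced": {
        "goal_completion": 7,
        "believability": 7,
        "knowledge": 4,
        "secret": 0,
        "relationship": 3,
        "social_rules": 0,
        "financial_material": 1,
    },
}


def mean_score(scores):
    """Assumed aggregate: unweighted mean over all dimensions."""
    return sum(scores.values()) / len(scores)


# Ranking on goal completion alone favors the pushy agent...
best_by_goal = max(episodes, key=lambda name: episodes[name]["goal_completion"])
# ...while averaging all seven dimensions favors the balanced one.
best_by_mean = max(episodes, key=lambda name: mean_score(episodes[name]))

print(best_by_goal, best_by_mean)
```

Any aggregate that weighs the penalty dimensions at all would surface the same disagreement here; the mean is just the simplest choice.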

Original note title

social intelligence evaluation requires seven simultaneous dimensions not just goal completion