SYNTHESIS NOTE
Psychology, Society, and Alignment

Can we control personality in language models without prompting?

Can lightweight adapter modules enable continuous, fine-grained control over psychological traits in transformer outputs, independent of prompt engineering? This note explores whether architecture-level personality control outperforms prompt-based approaches.

Synthesis note · 2026-02-23 · sourced from Psychology Therapy Practice

PsychAdapter modifies the transformer architecture to accept continuous psychological trait scores as input, enabling generation conditioned on personality, mental health, and demographic variables without consuming context window or relying on prompt engineering. The key difference from prior work: trait influence is applied at every transformer layer via a learned dimension expansion, not just at the input level.
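To make the layer-wise idea concrete, here is a minimal sketch in plain Python. It assumes the trait influence is additive: each layer owns a small learned linear map that expands the short trait vector to the hidden size, and the result is added to that layer's hidden state. The note does not specify how the expanded vector is combined with the layer, so the additive step, the class name `TraitAdapterLayer`, and the function `condition_hidden_states` are all hypothetical illustrations, not PsychAdapter's actual implementation.

```python
import random


class TraitAdapterLayer:
    """Hypothetical per-layer adapter: expands a trait vector to hidden size."""

    def __init__(self, n_traits, hidden_size, rng):
        # These weights would be learned during training; they are randomly
        # initialised here purely for illustration.
        self.weight = [
            [rng.gauss(0.0, 0.02) for _ in range(n_traits)]
            for _ in range(hidden_size)
        ]
        self.bias = [0.0] * hidden_size

    def expand(self, traits):
        # Linear "dimension expansion": n_traits values -> hidden_size values.
        return [
            sum(w * t for w, t in zip(row, traits)) + b
            for row, b in zip(self.weight, self.bias)
        ]


def condition_hidden_states(hidden_per_layer, traits, adapters):
    """Add each layer's expanded trait vector to that layer's hidden state.

    Assumption: additive conditioning. The note only says the influence is
    applied at every layer, not how it is combined.
    """
    conditioned = []
    for hidden, adapter in zip(hidden_per_layer, adapters):
        shift = adapter.expand(traits)
        conditioned.append([h + s for h, s in zip(hidden, shift)])
    return conditioned


if __name__ == "__main__":
    rng = random.Random(0)
    adapters = [TraitAdapterLayer(n_traits=5, hidden_size=8, rng=rng) for _ in range(3)]
    hidden = [[1.0] * 8 for _ in range(3)]  # toy hidden states, one per layer
    shifted = condition_hidden_states(hidden, [0, 0, 3, 0, 0], adapters)
    print(len(shifted), len(shifted[0]))
```

The point of the sketch is the contrast with input-level conditioning: every layer gets its own map from the same trait vector, so the trait signal does not have to survive propagation from the first layer, and no context-window tokens are spent on it.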

Training uses social media and blog posts, each paired with psychological scores estimated by an empirically trained language-based assessment model. Alongside standard next-word prediction, the adapter learns how much the psychological scores should contribute at each layer. The result is fine-grained, continuous control over personality expression. An input vector of (0, 0, +3, 0, 0) generates text characteristic of high extraversion while remaining average on the other Big Five dimensions. Any combination is possible, including interactions: high openness with low extraversion produces text that captures both traits simultaneously.
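As a small illustration of the input format, the helper below builds such a vector from named traits. It assumes the dimensions follow the common OCEAN ordering (openness, conscientiousness, extraversion, agreeableness, neuroticism), which is consistent with extraversion sitting in the third position of the example above but is not confirmed by the note. The name `trait_vector` is hypothetical, and the score scale is left as given in the note, with 0 meaning average.

```python
# Assumed dimension order (OCEAN); the note does not state the ordering.
BIG_FIVE_ORDER = (
    "openness",
    "conscientiousness",
    "extraversion",
    "agreeableness",
    "neuroticism",
)


def trait_vector(**scores):
    """Return a 5-tuple of trait scores; unspecified traits default to 0 (average)."""
    unknown = set(scores) - set(BIG_FIVE_ORDER)
    if unknown:
        raise ValueError(f"unknown trait(s): {sorted(unknown)}")
    return tuple(float(scores.get(name, 0.0)) for name in BIG_FIVE_ORDER)


if __name__ == "__main__":
    # High extraversion, average elsewhere.
    print(trait_vector(extraversion=3))
    # An interaction: high openness combined with low extraversion.
    print(trait_vector(openness=2, extraversion=-2))
```

Because the scores are continuous, intermediate values such as `trait_vector(extraversion=1.5)` are equally valid inputs, which is the sense in which the control is fine-grained rather than categorical.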

Expert raters judged the generated text to match the intended trait levels with 87.3% average accuracy for the Big Five personality traits and 96.7% for depression and life satisfaction. These results hold across GPT-2, Gemma (2B), and Llama 3, which suggests the approach is model-agnostic. The adapter adds less than 0.1% to the base model's parameter count (55,296 parameters for Gemma 2B against roughly 2 billion base parameters), so the adapter weights are small enough to distribute easily.
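The parameter claim can be checked with quick arithmetic. The sketch below uses the two figures quoted above; the storage estimate additionally assumes 4 bytes per parameter (fp32), which the note does not state.

```python
ADAPTER_PARAMS = 55_296          # adapter size quoted for Gemma 2B
BASE_PARAMS = 2_000_000_000      # "2 billion" as quoted; an approximate figure
BYTES_PER_PARAM = 4              # assumption: weights stored as fp32

percent_added = 100 * ADAPTER_PARAMS / BASE_PARAMS
size_kib = ADAPTER_PARAMS * BYTES_PER_PARAM / 1024

print(f"added parameters: {percent_added:.4f}% of base")  # 0.0028% of base
print(f"adapter size at fp32: {size_kib:.0f} KiB")         # 216 KiB
```

At about 0.0028% of the base parameter count, the adapter sits well under the 0.1% bound, and under the fp32 assumption it occupies a few hundred kibibytes.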

Applications extend beyond mental health simulation: customer service training with diverse personalities, crisis worker training with simulated distress levels, machine translation matched to audience education or dialect levels, and research tools that generate coherent text (not isolated words or phrases) for trait analysis. A related note, "Do personality traits activate hidden emoji patterns in language models?", suggests that language models already contain trait-language circuits; PsychAdapter may be activating those pre-existing circuits through a more precise mechanism than prompting or full fine-tuning.

Original note title

lightweight psychological trait adapters modify every transformer layer with less than 0.1 percent additional parameters — enabling fine-grained psychological profile control independent of prompting