How do personalization granularity levels trade precision against scalability?
LLM personalization operates at user, persona, and global levels, each with different tradeoffs. Understanding these tradeoffs helps determine when to invest in individual user data versus broader patterns.
The survey "Personalization of Large Language Models: A Survey" consolidates a taxonomy of personalization granularity that cuts across all implementation approaches:
User-level personalization: targets individual users through their specific history, interactions, and preferences, often keyed by user ID. It offers the highest precision and engagement but faces data sparsity: new users have no history, and infrequent users have thin profiles. Scaling is hard because each user must supply enough data to personalize on.
Persona-level personalization: targets groups of users who share characteristics or preferences. It is more scalable, since groups pool more data, and more representative, since it captures common patterns. It is also less granular: individual deviations from group norms are missed, and defining meaningful personas requires domain knowledge. This connects to the note "Why do LLM judges fail at predicting sparse user preferences?": persona-level information is often too sparse to predict specific preferences.
Global preference personalization: targets widely shared norms and standards. It has the broadest applicability and is the simplest to implement, but it is the least specific: individual and group differences are flattened, and aggregating over diverse populations introduces noise. The note "Should AI alignment target preferences or social role norms?" critiques this level: aggregation constitutes epistemic injustice when it silences minority perspectives.
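One way to read the three levels is as a fallback chain: use the most specific level that has enough data, and back off otherwise. The sketch below illustrates that reading only; it is not taken from the survey. The names `PreferenceStore`, `select_level`, and the threshold `MIN_USER_SIGNALS` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class PreferenceStore:
    """Hypothetical container holding preference signals at each granularity."""
    user_profiles: Dict[str, List[str]] = field(default_factory=dict)
    persona_profiles: Dict[str, List[str]] = field(default_factory=dict)
    global_defaults: List[str] = field(default_factory=list)
    user_personas: Dict[str, str] = field(default_factory=dict)  # user_id -> persona


# Assumed cutoff for "enough" user history; a real system would tune this.
MIN_USER_SIGNALS = 3


def select_level(store: PreferenceStore, user_id: str) -> Tuple[str, List[str]]:
    """Return the most specific level with usable data, plus its signals."""
    history = store.user_profiles.get(user_id, [])
    if len(history) >= MIN_USER_SIGNALS:
        return "user", history  # highest precision, needs per-user data
    persona = store.user_personas.get(user_id)
    if persona is not None and persona in store.persona_profiles:
        return "persona", store.persona_profiles[persona]  # pooled group data
    return "global", store.global_defaults  # always available, least specific
```

A new user with no history and no assigned persona falls through to the global defaults, which is the cold-start case described under user-level personalization.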
Four technique categories map to these levels:
- RAG — retrieves user-specific information from an external knowledge base via embedding similarity
- Prompting — incorporates user context into prompts for in-context learning
- Representation learning — encodes user information into model embeddings or parameters
- RLHF — uses user-specific feedback as a reward signal for alignment
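The first two categories often work together: retrieve the user's most relevant records, then place them in the prompt. The sketch below assumes a toy bag-of-words vector in place of a learned embedding model, and the helper names (`embed`, `cosine`, `retrieve`, `build_prompt`) are hypothetical.

```python
import math
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a real system would use a learned encoder."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, user_docs: List[str], k: int = 2) -> List[str]:
    """Return the k user records most similar to the query."""
    q = embed(query)
    ranked = sorted(user_docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, user_docs: List[str], k: int = 2) -> str:
    """Prepend retrieved user context to the request for in-context learning."""
    context = retrieve(query, user_docs, k)
    lines = ["User context:"] + [f"- {c}" for c in context] + ["", f"Request: {query}"]
    return "\n".join(lines)
```

Representation learning and RLHF change model parameters or rewards rather than the prompt, so they do not reduce to a few lines in the same way.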
The survey reveals that direct personalized text generation and downstream task personalization appear distinct but share underlying components. Both retrieve user data, construct personalized prompts/embeddings, and leverage these for output. The key difference is evaluation: text generation evaluates against user-written ground truth; downstream tasks evaluate specific task metrics.
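The evaluation split can be made concrete with two stand-in metrics. Both are assumptions chosen for illustration, not metrics prescribed by the survey: a unigram-overlap F1 against the user's own text for generation, and plain accuracy for a downstream classification task.

```python
from collections import Counter
from typing import List


def unigram_f1(generated: str, reference: str) -> float:
    """Overlap F1 between generated text and user-written ground truth."""
    gen = Counter(generated.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((gen & ref).values())  # multiset intersection of tokens
    if overlap == 0:
        return 0.0
    precision = overlap / sum(gen.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def accuracy(predictions: List[str], labels: List[str]) -> float:
    """Task metric for a downstream task such as personalized classification."""
    if not labels:
        return 0.0
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)
```

The upstream pipeline that produces `generated` or `predictions` can be identical in both cases; only the scoring function changes.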
The formalization defines four distinct data types that personalization systems consume differently: user documents (written content), user attributes (static demographics), user interactions (dynamic behaviors), and pair-wise preferences (explicit feedback). As the note "Does chatbot personalization build trust or expose privacy risks?" discusses, each data type carries different privacy implications: behavioral data is less visible to users than explicit preference queries.
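The four data types can be written down as a small schema. This is a minimal sketch: the class and field names are hypothetical rather than the survey's notation, and the split between implicitly and explicitly collected data is one reading of the privacy point above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class UserDocument:
    text: str  # content the user wrote


@dataclass
class UserInteraction:
    item_id: str
    action: str       # e.g. "click" or "purchase"; dynamic behavior
    timestamp: float


@dataclass
class PairwisePreference:
    prompt: str
    chosen: str       # response the user preferred
    rejected: str     # response the user rejected


@dataclass
class UserProfile:
    user_id: str
    attributes: Dict[str, str] = field(default_factory=dict)  # static demographics
    documents: List[UserDocument] = field(default_factory=list)
    interactions: List[UserInteraction] = field(default_factory=list)
    preferences: List[PairwisePreference] = field(default_factory=list)

    def implicit_signal_count(self) -> int:
        """Behavioral records, gathered without asking the user directly."""
        return len(self.interactions)

    def explicit_signal_count(self) -> int:
        """Feedback the user knowingly provided."""
        return len(self.preferences)
```

Keeping the types separate makes it possible to apply different retention or consent rules to each.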
Related concepts in this collection 4
- Does chatbot personalization build trust or expose privacy risks?
  Explores whether personalization features that increase user trust and social connection simultaneously heighten privacy concerns and create rising behavioral expectations over time.
  privacy implications differ by personalization data type
- Should AI alignment target preferences or social role norms?
  Current AI alignment approaches optimize for individual or aggregate human preferences. But do preferences actually capture what matters morally, or should alignment instead target the normative standards appropriate to an AI system's specific social role?
  global preference level faces aggregation critique
- Does any single persuasion technique work for everyone?
  Can fixed persuasion strategies like appeals to authority or social proof be reliably applied across different people and situations, or do they require adaptation to individual traits and context?
  user-level matters because individuals differ on strategy effectiveness
- Can user preferences be learned from just ten questions?
  Explores whether adaptive question selection can efficiently infer user-specific reward coefficients without historical data or fine-tuning. This matters for scaling personalization without per-user model updates.
  PReF bridges persona-level and user-level via factored rewards
Related papers in this collection 8
Papers most semantically related to this note, ranked by cosine similarity in the embedding space.
- Personalization of Large Language Models: A Survey
- Understanding the Role of User Profile in the Personalization of Large Language Models
- PersonaAgent: When Large Language Model Agents Meet Personalization at Test Time
- Can LLM be a Personalized Judge?
- PRIME: Large Language Model Personalization with Cognitive Memory and Thought Processes
- Personalisation within bounds: A risk taxonomy and policy framework for the alignment of large language models with personalised feedback
- Personalized Language Modeling from Personalized Human Feedback
- Enhancing personalized multi-turn dialogue with curiosity reward
Original note title
three granularity levels of LLM personalization — user-level persona-level and global preference — involve distinct precision-scalability-data trade-offs