SYNTHESIS NOTE
Training, RL, and Test-Time Scaling · Reasoning, Retrieval, and Evaluation · Language, Text, and Discourse

Does preference tuning always reduce diversity the same way?

Explores whether the standard narrative that RLHF reduces model diversity holds equally across different task domains, or if the effect varies by what the domain rewards.

Synthesis note · 2026-05-18 · sourced from Evaluations

The paper "Evaluating the Diversity and Quality of LLM Generated Content" reports a clean finding that the standard "RLHF reduces diversity" narrative cannot accommodate: the direction of the effect depends on the domain. In programming tasks, preference tuning consistently reduces lexical and syntactic diversity while preserving semantic diversity. In open-ended creative writing, preference tuning increases lexical and syntactic diversity, including stylistic variety.
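
To make the lexical/semantic distinction concrete for code, here is a minimal Python sketch. It is not the paper's metric. It assumes, purely for illustration, that lexical diversity can be approximated by the mean pairwise Jaccard distance between token sets, and that semantic diversity can be approximated by counting distinct input/output behaviours. The names `lexical_diversity`, `distinct_behaviours`, `solve`, `SAMPLES`, and `INPUTS` are all hypothetical, and the sample programs are invented.

```python
import io
import itertools
import tokenize


def token_set(source: str) -> set:
    """Set of non-whitespace token strings in a Python snippet."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return {tok.string for tok in tokens if tok.string.strip()}


def lexical_diversity(sources) -> float:
    """Mean pairwise Jaccard distance between token sets (0 = same vocabulary)."""
    sets = [token_set(s) for s in sources]
    pairs = list(itertools.combinations(sets, 2))
    return sum(1 - len(a & b) / len(a | b) for a, b in pairs) / len(pairs)


def distinct_behaviours(sources, inputs, func_name="solve") -> int:
    """Number of distinct input/output behaviours among the programs."""
    signatures = set()
    for source in sources:
        namespace = {}
        exec(source, namespace)  # assumption: toy, trusted snippets only
        func = namespace[func_name]
        signatures.add(tuple(func(x) for x in inputs))
    return len(signatures)


# Invented toy samples: the first two differ on the surface but behave identically.
SAMPLES = [
    "def solve(n):\n    return sum(range(n + 1))\n",
    "def solve(n):\n    total = 0\n    for i in range(n + 1):\n        total += i\n    return total\n",
    "def solve(n):\n    return n * n\n",
]
INPUTS = [0, 1, 2, 3, 10]

print(lexical_diversity(SAMPLES), distinct_behaviours(SAMPLES, INPUTS))
```

The two measurements can move independently: a set of samples can lose surface variety while keeping the same number of distinct behaviours, which is the shape of the programming-task result described above.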

The pattern makes sense in retrospect. Code has a sharp, narrow definition of "correct" — semantically equivalent programs converge on a small set of valid syntactic forms. Preference tuning pushes models toward correctness, which in code means pushing toward a smaller surface lexicon. Creative writing has the opposite property: "good" creative writing rewards distinctive word choice, varied sentence structure, stylistic range. Preference tuning pushes models toward those rewards, which manifests as broader lexical and syntactic variety.

This breaks the assumption that diversity is a single property of the model. A model that has been preference-tuned is not "less diverse" in the absolute sense — it is differently shaped depending on what the domain rewards. For code-heavy applications, the lexical compression is a feature (consistent style) or a bug (less exploration of solution space) depending on what you want. For creative applications, the lexical expansion is a clear win.

The implication for evaluation is that benchmarks that measure diversity in a domain-agnostic way will report misleading aggregate numbers. A model that scores at the 60th percentile on "creative writing diversity" and at the 90th percentile on "code diversity" averages to a middling number (the 75th percentile here) that hides both ends of the actual capability distribution. Domain-stratified diversity evaluation is necessary to characterize what preference tuning has done to a model.
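
A minimal sketch of what domain-stratified evaluation could look like, in Python. It assumes distinct-n (unique n-grams divided by total n-grams, over whitespace tokens) as a stand-in lexical diversity score; the source does not say which metric to use. The names `distinct_n`, `stratified_diversity`, and `TOY_SAMPLES` are hypothetical, and the samples are invented for illustration.

```python
from collections import defaultdict


def distinct_n(texts, n=2):
    """Unique n-grams divided by total n-grams across a set of texts."""
    ngrams = []
    for text in texts:
        tokens = text.split()
        ngrams.extend(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return len(set(ngrams)) / len(ngrams) if ngrams else 0.0


def stratified_diversity(samples, n=2):
    """Score each domain separately; samples is an iterable of (domain, text)."""
    by_domain = defaultdict(list)
    for domain, text in samples:
        by_domain[domain].append(text)
    return {domain: distinct_n(texts, n) for domain, texts in by_domain.items()}


# Invented toy generations, two per domain.
TOY_SAMPLES = [
    ("code", "return a + b"),
    ("code", "return a + b"),
    ("creative", "the moon hummed softly"),
    ("creative", "rain stitched the dusk"),
]

pooled = distinct_n([text for _, text in TOY_SAMPLES])
per_domain = stratified_diversity(TOY_SAMPLES)
print(pooled, per_domain)
```

On this toy data the pooled score is 0.75, while the per-domain scores are 0.5 for code and 1.0 for creative writing. The pooled number sits between the two and describes neither domain, which is the failure mode of domain-agnostic reporting.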

For builders, this dissolves part of the "should we preference-tune for creativity?" debate. The answer depends on whether the desired creativity is the convergent kind (programs that work) or the divergent kind (stories that distinguish themselves) — and on those terms, preference tuning is well-aligned with the second.

Original note title

preference tuning diversity effects are domain-dependent — RLHF reduces lexical-syntactic diversity in code while increasing it in creative writing