Can a system without an addressee ever truly tell a joke?
This explores whether joke-telling, a speech act that classically needs someone to tell it *to*, is possible for an LLM, which has no addressee it is oriented toward. The corpus suggests the question bundles two different problems: a *mechanical* one (can the system build the structure a joke needs?) and a *relational* one (can it mean the joke at anyone?). They have different answers.
Start with the mechanics, because that's the surprising part. A joke usually turns on a frame snapping into place, as in a pun where one reading suddenly suppresses another. One line of work argues transformers can't do this on purpose: they integrate every token in weighted parallel rather than selectively suppressing the irrelevant readings, so they read words additively instead of resonantly, which the argument takes to be exactly why wordplay and setups-with-a-turn fail so consistently (Why do AI systems miss jokes and wordplay so consistently?). And yet meaning-from-frames doesn't strictly require *referents*: Jabberwocky makes sense out of nonsense purely through syntactic and prosodic cues (How do nonsense words create meaning without referents?). So the machinery for frame-resonance isn't wholly absent; it's just not something the model selectively *wields*. Tellingly, when models do handle irony, they over-detect it, assigning far higher irony scores than human raters do, because ironic examples are louder in training data than in actual use (Do language models overestimate how often irony appears?). That's recognizing the *pattern* of a joke without calibration to a real situation.
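The contrast between weighted parallel integration and selective suppression can be made concrete with a toy sketch. This is only an illustration of the distinction the cited note draws, not a model of a real transformer (which has many heads and layers and can learn very sharp attention); the two "readings", their scores, and the function names `soft_blend` and `hard_select` are all hypothetical.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def soft_blend(scores, values):
    """Weighted parallel aggregation: every reading contributes a little."""
    weights = softmax(scores)
    return sum(w * v for w, v in zip(weights, values))

def hard_select(scores, values):
    """Selective suppression: the winning reading silences the rest."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return values[best]

# Hypothetical toy: two readings of an ambiguous word, encoded as +1 and -1.
scores = [2.0, 1.0]    # assumed relevance of each reading
values = [1.0, -1.0]   # reading A = +1.0, reading B = -1.0

weights = softmax(scores)
blended = soft_blend(scores, values)    # lands between the two readings
selected = hard_select(scores, values)  # commits fully to reading A
```

In the soft version the losing reading never reaches exactly zero weight, so the result is an average that leans one way; in the hard version the losing reading is gone. A pun needs something closer to the second behavior, followed by a sudden switch.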
The relational problem is the one your phrase 'without an addressee' points straight at, and the corpus treats it as more fundamental. There's a sharp parallel in the argument that an LLM cannot *raise alarm*: alarm is interpersonal address plus felt concern plus proactive initiation, and the model has none of the three; it only ever responds to attention, never solicits it (Can language models actually raise alarm about threats?). Telling a joke is the same shape of act: it's offered *to* someone, with an eye on their uptake. A related note pushes harder: AI doesn't produce utterances at all, only 'event-residue' carrying the communicative markers of training data, which the human reader then unilaterally animates into a pseudo-exchange (Does AI generate genuine utterances or just text patterns?). On that view the joke has structure only on *your* side; you supply the addressee-orientation the system lacks. The gift-economy framing makes the same point about why this feels hollow: the output carries statistical residue, not the spirit of a giver, because nobody gave it (Why doesn't AI output carry the spirit of a giver?).
And there's no hidden joker underneath to rescue it. The simulator is role-play all the way down, with no agency, beliefs, or authentic voice beneath the performed character (Does a language model have an authentic voice underneath?). So 'truly tell' in your question has nothing to attach to: a joke-teller who is only ever a character isn't withholding sincerity; there's just no one home to be sincere.
Here's the twist worth leaving with. The addressee a system lacks can be partly *manufactured*: endowing an agent with an imaginary listener (via Rational Speech Acts, simulating whether an utterance would distinguish the speaker's persona from a distractor) measurably reduces contradictions in its output (Can imaginary listeners reduce dialogue agent contradictions?). That reframes your question: maybe a joke doesn't strictly need a *real* addressee, only a *modeled* one. But notice what's still missing even then. When a listener misunderstands, humans run third-position repair, catching the misreading *after* an erroneous response reveals it, and current systems lack this mechanism (Can AI systems detect and correct misunderstandings after responding?). The deepest thing a real joke needs isn't an audience in the abstract; it's the live loop of watching one not laugh and adjusting. That loop, not the punchline, is what 'without an addressee' rules out.
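The imaginary-listener idea can be sketched with the basic Rational Speech Acts recursion: a literal listener guesses who is speaking from an utterance, and a pragmatic speaker prefers utterances that make that guess come out right. This is a minimal sketch under assumed toy data, not the cited work's implementation; the personas, utterances, `fits` table, and the `alpha` parameter are all hypothetical.

```python
def literal_listener(utterance, personas, fits):
    """L0: spread belief uniformly over personas the utterance is true of."""
    compatible = [p for p in personas if p in fits[utterance]]
    return {
        p: (1.0 / len(compatible) if p in compatible else 0.0)
        for p in personas
    }

def pragmatic_speaker(persona, utterances, personas, fits, alpha=1.0):
    """S1: prefer utterances that lead the imagined listener to `persona`."""
    scores = {
        u: literal_listener(u, personas, fits)[persona] ** alpha
        for u in utterances
    }
    total = sum(scores.values())
    if total == 0:  # no utterance fits; fall back to a uniform choice
        return {u: 1.0 / len(utterances) for u in utterances}
    return {u: s / total for u, s in scores.items()}

# Hypothetical toy: a target persona and a generic distractor.
personas = ["cellist", "generic"]
utterances = ["I like music", "I play the cello"]
fits = {
    "I like music": {"cellist", "generic"},  # true of both, so uninformative
    "I play the cello": {"cellist"},         # true only of the target
}

probs = pragmatic_speaker("cellist", utterances, personas, fits)
```

Here the speaker shifts probability toward the utterance that only the target persona could say, which is the sense in which a modeled listener suppresses generic responses. Nothing in the loop observes a real listener's reaction, so it cannot stand in for repair after a misreading.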
Sources (9 notes)
Transformers integrate token information through weighted parallel aggregation rather than selective suppression of irrelevant words. This structural difference explains consistent failures with jokes, wordplay, and frame-dependent meaning—not knowledge gaps, but missing cognitive operations.
Jabberwocky achieves sense-of-nonsense through frame-activation on syntactic and prosodic cues alone, proving meaning-making does not require referential content. This reverses compositional accounts and shows frame-resonance is the primary meaning-making operation.
GPT-4o assigns significantly higher irony scores than humans (p < .001), revealing that LLMs detect irony as a pattern but miscalibrate its prevalence because ironic examples are more salient in training data than in actual use.
Alarm is a speech act requiring interpersonal address, felt concern, and proactive initiation. LLMs lack all three: they don't feel concern, can't solicit attention (only respond to it), and are reactive rather than proactive. In addition, alignment training suppresses the overclaiming that alarm requires.
AI output carries communicative markers inherited from training data but lacks the event structure that produces actual utterances. Users supply the missing orientation through interpretive labor, creating a pseudo-event with structure only on the human side.
AI-generated content lacks hau—the spiritual essence that binds gift economies—because no person gave it. This absence is more fundamental than alienation: the output was never anyone's to begin with, so no relationship of obligation forms.
Shanahan argues that base LLMs lack agency, beliefs, or preferences—the simulator is pure role-play with no underlying subject. Jailbreaking reveals the training data's full spectrum, not a hidden true self; even RLHF personas are performed characters, never realized quasi-psychologies.
Endowing dialogue agents with an imaginary listener via Rational Speech Acts reduces persona contradiction at inference time without NLI labels or extra training. The agent simulates whether utterances would distinguish its persona from a distractor, suppressing generic or contradictory responses.
Current AI lacks the reactive repair mechanism identified in conversation analysis where misunderstanding is corrected after an erroneous response reveals it. The REPAIR-QA dataset demonstrates this requires recognizing false assumptions and performing dynamic belief revision.