Where is the speaker when AI produces speech?
Prior forms of orality—from face-to-face speech to broadcast media—always had an embodied speaker anchoring the utterance. Does AI speech without a speaker represent a fundamentally new media condition, and what happens to our frameworks for evaluating it?
Primary orality (Ong) is speech in face-to-face cultures — embodied speakers performing knowledge in real time. Secondary orality is speech mediated by electronic media (radio, television) — embodied speakers whose presence is technologically extended but still anchored in actual speaking persons. Both forms preserve the speaker as the carrier of the speech. The voice is the voice of someone.
AI orality breaks this. The output exhibits the oral form — performative, additive, situational, conversational — but no speaker is producing it. There is no body whose throat shapes the words, no mind selecting the next phrase, no person whose history of past speech anchors the present utterance. The output sounds like speech in the sense that it has the rhythmic and pragmatic surface of speech, but it comes from nowhere.
This is structurally novel in media history. Prior media theory categorized media by their relation to embodied speakers — orality (direct embodiment), writing (deferred from embodiment but anchored to a prior writer), print (mass-distributed but author-anchored), broadcast (technologically extended but speaker-anchored). AI is the first form where the speech-shape persists without any speaker-anchor. There is no prior conceptual category for it.
The consequences run through the rest of the framework. The companion note "Does AI-generated content mirror oral culture's knowledge patterns?" picks up the form side; this note picks up the carrier side. The oral form returns; the carrier the form depended on does not. The note "Why doesn't AI output carry the spirit of a giver?" makes the same point about gift-flow: the flow returns, the carrier-anchor does not.
The diagnostic implication is that frameworks for evaluating speech (rhetoric, persuasion theory, ethos/pathos/logos) all presuppose a speaker. They calibrate audience trust to speaker properties: credibility, prior commitments, demonstrated expertise. With no speaker to bear these properties, the frameworks misfire. Audiences either project a phantom speaker (treating the AI as if it were a person) or accept the speech without the speaker-evaluation step (see "When do users stop checking whether AI output is actually backed?"). Neither response is a competent reading of disembodied orality, because no competent reading of disembodied orality has yet been developed.
Inquiring lines that use this note as a source
This note is a source for these synthesized inquiries.
- What would it mean for AI to register the tempo and rhythm of human speech?
- What does disembodied orality mean for how we evaluate AI outputs?
- How do audiences evaluate speech when there is no speaker to assess?
- Can secondary orality exist without any embodied human participant at all?
- What happens to rhetoric and ethos when the speaker is absent?
- How does AI speech differ from broadcast speech in its carrier structure?
- Why does broadcast media communicate while AI generation does not?
Related concepts in this collection
- Does AI-generated content mirror oral culture's knowledge patterns?
  Walter Ong's framework for oral versus literate cultures may describe how AI content functions on social media. Understanding this parallel could explain why AI discourse feels fundamentally different from print-era knowledge.
  companion claim about the form-side of AI orality
- Why doesn't AI output carry the spirit of a giver?
  Does AI-generated output function like a gift in Mauss's sense, where the giver's spirit obligates the receiver? This explores whether statistical residue can replace the moral weight of personal obligation.
  same carrier-absence pattern in the gift-economy frame
- When do users stop checking whether AI output is actually backed?
  What causes users to accept AI-generated content at face value without verifying its basis? Understanding this receiver-side acceptance reveals how intelligence-token systems maintain value despite lacking real backing.
  one of the two failed receiver-side responses to disembodied orality
Related papers in this collection
Papers most semantically related to this note, ranked by cosine similarity in the embedding space.
- AI Enters Public Discourse: A Habermasian Assessment Of The Moral Status Of Large Language Models
- Language Models’ Hall of Mirrors Problem: Why AI Alignment Requires Peircean Semiosis
- Linguistic markers of inherently false AI communication and intentionally false human communication: Evidence from hotel reviews
- Conversational Alignment with Artificial Intelligence in Context
- DO THEY SEE WHAT WE SEE?
- Large Models of What? Mistaking Engineering Achievements for Human Linguistic Agency
- LLMorphism: When humans come to see themselves as language models
- Simulacra as conscious exotica
Original note title
AI orality is disembodied — sounds like speech but comes from no speaker