SYNTHESIS NOTE

Can language models communicate without human-readable text?

Do instruction-tuned LLMs have the capacity to generate and decode compressed, non-standard text that preserves semantic meaning while sacrificing human readability? What would be the tradeoffs?

Synthesis note · 2026-06-27 · sourced from Memory

The default assumption in LLM systems is that text should be natural language even when both writer and reader are models. This paper (BabelTele) treats that assumption as an empirical question rather than a constraint, and finds it is mostly habit: instruction-tuned models can generate and re-interpret highly compressed, non-standard textual forms that a human cannot read, while preserving 99.5% semantic fidelity at 27.9% of the original length. Crucially, this works zero-shot across proprietary and open-weight families, which suggests the capacity to decode such representations is a general property of instruction-tuned LLMs, not an artifact of one model.
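To make the headline numbers concrete, here is a minimal back-of-envelope sketch of what a 27.9% length ratio would mean for a fixed context budget. It assumes, for illustration only, that "length" is counted in tokens and that the ratio applies uniformly to every message; the function names, the 8000-token budget, and the 400-token message size are hypothetical and do not come from the paper.

```python
import math

# Figures reported in the note; everything else below is an illustrative assumption.
REPORTED_LENGTH_RATIO = 0.279  # compressed text is 27.9% of the original length
REPORTED_FIDELITY = 0.995      # 99.5% semantic fidelity


def compression_factor(length_ratio: float) -> float:
    """How many times shorter the compressed form is than the original."""
    if not 0 < length_ratio <= 1:
        raise ValueError("length_ratio must be in (0, 1]")
    return 1.0 / length_ratio


def messages_per_budget(budget_tokens: int, avg_message_tokens: int,
                        length_ratio: float) -> tuple[int, int]:
    """Whole messages that fit in a budget: (natural language, compressed).

    Assumes every message shrinks by the same ratio, which real text will not.
    """
    plain = budget_tokens // avg_message_tokens
    compressed_size = max(1, math.ceil(avg_message_tokens * length_ratio))
    compact = budget_tokens // compressed_size
    return plain, compact


if __name__ == "__main__":
    print(f"compression factor: {compression_factor(REPORTED_LENGTH_RATIO):.2f}x")
    plain, compact = messages_per_budget(8000, 400, REPORTED_LENGTH_RATIO)
    print(f"messages in an 8000-token budget: {plain} plain vs {compact} compressed")
```

Under these assumptions the same memory budget holds roughly three and a half times as many messages, at a reported fidelity cost of about half a percentage point per round trip.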

The reasoning matters more than the compression number. Natural language is dense with redundancy (full syntax, discourse markers, narrative coherence), and that redundancy exists to help humans follow, remember, and disambiguate. Strip the human from the loop and the redundancy becomes pure overhead, because the model never needed the scaffolding humans do. This is the same insight, seen from the other side, as "Why do language models need so much more text than humans?": humans pay a decompression cost that models route around. It also rhymes with "Do LLMs compress concepts more aggressively than humans do?": BabelTele is that compression bias turned into an interface design choice.

The strongest counterargument is auditability. The paper's own framing, which places BabelTele in agent memory and multi-agent communication, points at exactly the settings where opaque inter-model channels become dangerous: if agents converse in text humans cannot read, oversight and debugging collapse. This is the textual cousin of "Can agents share thoughts directly without using language?", and it inherits the same governance tension. The honest reading is that readability is not overhead to be shed for free but a deliberate tax we may want to keep paying for interpretability, even where the model does not require it.

Original note title

human-readable language is a tax LLMs can drop when no human is reading — model-native text holds 99.5% of meaning at 28% of the length