TOPIC

Speech and Voice

4 synthesis notes · 10 source papers
Why do dialogue systems need probabilistic reasoning?

Explores whether deterministic, flowchart-based dialogue systems can handle realistic speech recognition error rates of 15 to 30 percent, and what alternative approaches might be necessary.

Can skipping transcription make voice assistants faster?

Voice assistants traditionally convert speech to text before responding. Does eliminating that middle step reduce latency enough to matter for real-time conversation?

What speech tasks remain without standardized benchmarks?

Speech evaluation has strong benchmarks for transcription and translation, but broader comprehension and reasoning tasks over audio lack standardized measurement. This gap may constrain which capabilities researchers prioritize building.

Do speech models learn language-specific sounds or universal physics?

Explores whether self-supervised speech models encode phonetic categories tied to specific languages or instead capture the underlying vocal-tract physics common to all humans. This matters for understanding why these models transfer across languages without retraining.

Source papers 10

The arXiv papers behind this sub-topic. Links may take you off-site to arxiv.org.