Workplace Applications

Why does AI default to coaching instead of doing?

In workplace conversations, users often want AI to execute tasks like writing or gathering information, but AI tends to explain and advise instead. What drives this systematic mismatch between what users need and what AI provides?

Does concentrated AI exposure enable workers to adapt and reallocate?

When AI exposure is concentrated in a few specific tasks rather than spread across many, workers may shift effort to the non-displaced tasks within their occupation. Does this reallocation mechanism actually offset employment losses?

Can governance rules embedded in runtime memory actually protect autonomous agents?

Explores whether safeguards woven into an agent's operating loop, rather than documented separately, remain durable and retrievable when most needed. Tests whether runtime governance is an engineering solution or a false assurance.

What happens to human wages in an AGI economy?

Does human labor retain economic value when AGI can replicate most work? Explores whether wages would reflect the computational cost of replacing a worker rather than the value that worker actually produces.

Do LLM research ideas actually hold up when experts try to execute them?

Explores whether LLM-generated ideas maintain their apparent novelty advantage when expert researchers spend 100+ hours implementing them. Matters because ideation-stage evaluation may not capture real-world feasibility barriers.

Can LLMs efficiently generate taxonomies and label training data?

Explores whether large language models can automate both taxonomy generation and data labeling to reduce the manual effort and domain expertise traditionally required for text mining tasks.

Do persistent agents really cost less per token?

When AI agents reuse cached context across tasks, does the standard cost-per-token metric still reveal true economic efficiency? A case study suggests the answer may be no.

Should we evaluate deployed agents as whole environments instead?

Conventional LLM evaluation focuses on models or individual episodes, but what if the right unit of measurement is the entire coupled human-agent system, including its memory, tools, and protocols, observed over time?

What collaboration level do workers actually want with AI?

Explores whether workers prefer full automation, equal partnership, or continuous human control across different tasks. Understanding worker preferences could reshape how organizations deploy AI systems.