Cognitive Models and Latent Representations

How do language models encode syntactic relations geometrically?

Do LLM embeddings use distance alone, or also direction, to represent syntax? This matters for understanding whether neural networks can spontaneously develop geometric structures that are compatible with symbolic representations.

Can a single regularizer prevent JEPA representation collapse?

JEPAs traditionally need complex loss stacks and auxiliary tricks to avoid collapse. Can a single Gaussian-distribution constraint on latent embeddings do the same stabilization work, and would that simplify training?
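
To make the idea of a single Gaussian constraint concrete, here is a minimal sketch of a penalty that pushes a batch of latent embeddings toward zero mean and unit variance in each dimension. This is a simple moment-matching stand-in chosen for illustration, not the regularizer used in any particular JEPA work; the function name `gaussian_moment_penalty` and the toy batches are assumptions.

```python
def gaussian_moment_penalty(batch):
    """Penalize per-dimension deviation from zero mean and unit variance.

    `batch` is a list of embeddings (each a list of floats). This is a
    hypothetical moment-matching penalty, not a published JEPA loss.
    """
    n = len(batch)
    d = len(batch[0])
    penalty = 0.0
    for j in range(d):
        column = [row[j] for row in batch]
        mean = sum(column) / n
        var = sum((v - mean) ** 2 for v in column) / n
        # Zero mean and unit variance give zero penalty for this dimension.
        penalty += mean ** 2 + (var - 1.0) ** 2
    return penalty / d


# A collapsed batch: every input maps to the same embedding.
collapsed = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]
# A spread batch whose dimensions already have mean 0 and variance 1.
spread = [[-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0], [1.0, 1.0]]

collapsed_penalty = gaussian_moment_penalty(collapsed)
spread_penalty = gaussian_moment_penalty(spread)
```

In this toy setting the collapsed batch is penalized because its variance is zero, while the spread batch incurs no penalty. A full Gaussian constraint would also need to account for correlations between dimensions, which this sketch ignores.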

Do autoencoders learn hidden attractors in latent space?

When you repeatedly apply an autoencoder's encode-decode cycle, do the trajectories in latent space converge to specific points? If so, what creates these attractors and what do they reveal about what the network learned?
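
The repeated encode-decode cycle can be illustrated with a toy example. The sketch below uses a hypothetical one-dimensional "autoencoder" (a `tanh` decoder and a linear encoder, both chosen only for illustration) and iterates the latent map z -> encode(decode(z)) from several starting points. It is not a trained network, only a demonstration of what convergence to an attractor looks like.

```python
import math


def decode(z):
    # Hypothetical toy decoder: squashes the latent into (-1, 1).
    return math.tanh(z)


def encode(x):
    # Hypothetical toy encoder: a fixed linear map back to latent space.
    return 2.0 * x


def cycle(z):
    # One decode-then-encode pass, expressed as a map on latent space.
    return encode(decode(z))


def iterate_cycle(z0, steps=200):
    # Apply the cycle repeatedly and return the final latent value.
    z = z0
    for _ in range(steps):
        z = cycle(z)
    return z


starts = [-1.5, -0.2, 0.1, 0.7, 3.0]
endpoints = [iterate_cycle(z0) for z0 in starts]
```

For this toy map, every positive start ends at the same latent value and every negative start ends at its mirror image, so the latent space has two attracting fixed points. The point z = 0 is also a fixed point, but it is unstable: any small perturbation moves away from it.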

Can communication pressure drive agents to learn shared abstractions?

Under what conditions do AI agents develop compact, efficient shared languages? This explores whether cooperative task pressure, rather than explicit optimization, naturally drives abstraction formation, mirroring human collaborative communication.

Can we probe foundation models without any input data?

Can we understand what foundation models have learned by sampling noise through their encode-decode dynamics instead of analyzing their response to real inputs? This matters for auditing models whose training data is proprietary or inaccessible.

Can continuous thoughts have tractable likelihoods for sampling and scoring?

Most latent-reasoning methods discard the likelihood and sampling properties that made textual chain-of-thought trainable. Can normalizing flows recover those affordances in continuous thought space while preserving efficiency?
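
The two affordances named above, scoring and sampling, follow from the change-of-variables rule that normalizing flows rely on. The sketch below shows this for the simplest possible case, a one-dimensional affine flow over a scalar "thought" value; the function names and the parameters `shift` and `log_scale` are illustrative assumptions, and real methods would use far richer invertible maps over vectors.

```python
import math
import random


def forward(h, shift, log_scale):
    # Map a continuous thought h to base noise z, returning log|dz/dh|.
    z = (h - shift) * math.exp(-log_scale)
    log_det = -log_scale
    return z, log_det


def inverse(z, shift, log_scale):
    # Map base noise z back to a continuous thought h.
    return z * math.exp(log_scale) + shift


def log_likelihood(h, shift, log_scale):
    # Change of variables: log p(h) = log N(z; 0, 1) + log|dz/dh|.
    z, log_det = forward(h, shift, log_scale)
    log_base = -0.5 * (z * z + math.log(2.0 * math.pi))
    return log_base + log_det


def sample(rng, shift, log_scale):
    # Sampling: draw standard normal noise and push it through the inverse.
    return inverse(rng.gauss(0.0, 1.0), shift, log_scale)


rng = random.Random(0)
thought = sample(rng, shift=1.0, log_scale=0.5)
score = log_likelihood(thought, shift=1.0, log_scale=0.5)
```

Because the map is invertible and its log-determinant is known in closed form, the same parameters give both an exact log-likelihood for any thought value and a way to draw new ones.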

Do larger models actually learn simpler functions?

Can we measure whether bigger neural networks discover simpler underlying functions despite having more parameters? This matters because it challenges the assumption that model size directly correlates with learned complexity.

Do language models learn differently from good versus bad outcomes?

Do LLMs update their beliefs asymmetrically when learning from their own choices versus observing others? This matters for understanding whether agentic AI systems might inherit human cognitive biases.

Why does asking models to think first hurt performance?

Prompting a model to generate internal thoughts before answering initially degrades instruction-following performance. What reverses this harm, and can thinking become useful beyond math and logic?

Why does latent chain-of-thought fail so easily in training?

Explores why latent reasoning is fragile compared to textual chain-of-thought, focusing on how outcome-only supervision creates gradient starvation and representational drift in learned reasoning trajectories.

Can latent thought vectors scale language models beyond parameters?

Explores whether explicit latent thought vectors with dual-rate learning create new scaling dimensions independent of model size. This matters because it suggests alternatives to simply building larger models.

Can we decode what LLM activations really represent in language?

Can a trained decoder translate internal LLM activations into natural language descriptions, revealing what hidden representations actually encode? This matters because it could unlock both interpretability and controllability through the same mechanism.

Can language models learn to model human decision making?

Explores whether LLMs finetuned on psychological experiments can capture how people actually make decisions better than theories designed specifically for that purpose.

Do LLMs compress concepts more aggressively than humans do?

Do language models prioritize statistical compression over semantic nuance when forming conceptual representations, and how does this differ from human category formation? This matters because it may explain why LLMs fail at tasks requiring fine-grained distinctions.

Do language models segment events like human consensus does?

Does conditioning LLMs on personal profiles improve prediction?

Can explicit stack tracking improve how transformers learn recursive syntax?

Can we measure what a model truly learned?

Can we explore multiple reasoning paths without committing to one token?

Can agents share thoughts directly without using language?

Do transformers hide reasoning before producing filler tokens?