SYNTHESIS NOTE
Psychology, Society, and Alignment

Can LLMs actually conduct Socratic questioning in therapy?

While LLMs can generate individual therapy skills like assessment and psychoeducation, it remains unclear whether they can execute the adaptive, turn-based Socratic questioning needed to produce real cognitive change in patients.

Synthesis note · 2026-02-23 · sourced from Psychology Therapy Practice
What makes therapeutic chatbots actually work in clinical practice?

For certain use cases, LLMs show promising ability to conduct individual tasks needed for psychotherapy: assessment, psychoeducation, and demonstrating interventions. But clinical products and prototypes have come nowhere near the sophistication required to replace psychotherapy. The gap is specific: an LLM can generate an alternative belief in the style of CBT, but it remains unproven whether it can engage in the turn-based Socratic questioning that would be expected to produce cognitive change.

This distinction, between exhibiting a skill and implementing it therapeutically, is the core challenge. Generating an alternative belief is a single-turn text generation task. Socratic questioning requires tracking the patient's cognitive state across turns, calibrating the timing and intensity of challenges, adapting when the patient resists or deflects, and knowing when to push versus when to support. This is a multi-turn planning problem with a moving target: the patient's evolving understanding.

As the related note "Can language models understand without actually executing correctly?" suggests, the therapy skill gap may be an instance of a broader pattern. LLMs can comprehend what Socratic questioning looks like (they can describe it and generate examples) but may be unable to execute it competently in live interaction. Psychotherapy transcripts are likely poorly represented in training data, and privacy and ethical concerns make such representation challenging. Prompt engineering may be the most feasible approach, but it cannot substitute for the adaptive multi-turn reasoning that therapy demands.

The five major challenges for LLM mental health deployment — hallucination, interpretability, bias, privacy, and clinical methodology — compound this gap. Intrinsic hallucination (contradicting dialogue history) directly undermines the internal consistency essential to therapeutic trust. The inability to process nonverbal cues removes a critical information channel. And the tendency to be overly prescriptive clashes with current evidence-based practice, which favors exploratory over directive approaches.

Original note title

the gap between simulating therapy skills and implementing them therapeutically remains unresolved — LLMs can generate CBT-style beliefs but cannot conduct Socratic questioning