Using Topic Models to Identify Clients’ Functioning Levels and Alliance Ruptures in Psychotherapy


Computerized Natural Language Processing techniques can analyze psychotherapy sessions as texts, thus generating information about the therapy process and outcome and supporting the scaling-up of psychotherapy research. We used topic modeling to identify topics discussed in psychotherapy sessions and explored (1) which topics best identified clients’ functioning and alliance ruptures and (2) whether changes in these topics were associated with changes in outcome. Transcripts of 873 sessions from 58 clients treated by 52 therapists were analyzed. Prior to each session, clients self-reported their functioning and symptom level. After each session, therapists reported the extent of alliance rupture. Latent Dirichlet Allocation was used to extract latent topics from the psychotherapy transcripts. A Sparse Multinomial Logistic Regression model was then used to determine which topics best identified clients’ functioning levels and the occurrence of alliance ruptures in psychotherapy sessions. Finally, we used multi-level growth models to explore the associations between changes in topics and changes in outcome. Session-based processing yielded a list of semantic topics. The model identified the labels above chance (65%-75% accuracy).
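To make the topic-extraction step concrete, here is a minimal sketch of Latent Dirichlet Allocation fitted by collapsed Gibbs sampling, written in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which inference method, library, number of topics, or hyperparameters were used, so the function name `lda_gibbs`, the values of `alpha` and `beta`, and the toy "sessions" are all hypothetical.

```python
import random


def lda_gibbs(docs, n_topics, n_iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA on tokenized documents.

    Returns (theta, phi, vocab): per-document topic proportions,
    per-topic word distributions, and the sorted vocabulary.
    alpha and beta are illustrative Dirichlet priors, not values from the paper.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    doc_topic = [[0] * n_topics for _ in docs]             # tokens of doc d in topic k
    topic_word = [[0] * n_words for _ in range(n_topics)]  # tokens of word v in topic k
    topic_total = [0] * n_topics                           # all tokens in topic k
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove this token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Conditional probability of each topic, up to a constant.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    # Smoothed estimates from the final sample.
    theta = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in counts]
        for doc, counts in zip(docs, doc_topic)
    ]
    phi = [
        [(c + beta) / (topic_total[k] + n_words * beta) for c in topic_word[k]]
        for k in range(n_topics)
    ]
    return theta, phi, vocab


# Invented toy "sessions"; real input would be preprocessed transcripts.
sessions = [
    "work boss stress deadline work".split(),
    "boss deadline stress work tired".split(),
    "mother family anger childhood family".split(),
    "family mother childhood anger sad".split(),
]

theta, phi, vocab = lda_gibbs(sessions, n_topics=2)
for row in theta:
    print([round(p, 2) for p in row])
```

Each row of `theta` is one session's topic proportions. In a pipeline like the one described, such rows could serve as the feature vector passed to a sparse classifier such as the Sparse Multinomial Logistic Regression model.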

Introduction. Psychotherapy is based to a great extent on the content of exchanges between clients and therapists, which conveys important information about the participants’ modes of communication, mental states, thoughts, and feelings. Until recently, most psychotherapy research has relied on self-report measures or on human coders to quantify the information in psychotherapy sessions. These standardized subjective measures are the building blocks of psychotherapy research, and the process and outcome of treatment cannot be studied without them. However, these methods also have critical shortcomings: self-report depends on the extent of participants’ self-insight, on their willingness to complete questionnaires, and on a restricted choice of responses (for a review of the limitations of current research methods, see Kazdin, 2016). Furthermore, observational human coding is very labor-intensive, which limits the amount of data that can be analyzed and thus curtails the generalizability of results (Hill & Lambert, 2004).

Discussion / Conclusion. Advanced machine learning techniques are relatively novel in psychotherapy research, but emerging evidence suggests the value of integrating them into traditional measures commonly applied to therapy (Dwyer, Falkai, & Koutsouleris, 2018). We used topic modeling, a data-driven machine learning technique that extracts latent topics from textual data, to examine which topics best identify clients’ functioning and alliance ruptures in psychotherapy sessions, and whether changes in these topics were associated with changes in treatment outcome. Topic modeling yielded semantically meaningful topics that were then used to identify session-level client functioning and alliance ruptures. Consistent with our first hypothesis, the SMLR models with topic-model features identified labels above chance, at 65% (alliance ruptures) to 75% (clients’ functioning) test accuracy.

Lines of inquiry this paper opens

Research framings built by reading the notes related to this paper — the questions it feeds into.

How do transformer attention mechanisms implement memory and algorithmic functions?

What does it mean to truly attend to someone in conversation?

How can real-time alliance measurement improve therapy outcomes?

How do evaluation biases undermine LLM quality assessment systems?

How does automated transcript analysis compare to patient self-report on engagement?

How does AI assistance affect human cognitive development and reasoning autonomy?

What role does cognitive reappraisal play in disclosure benefits?

Can AI systems balance emotional competence with factual reliability?

How does action-based validation differ from verbal empathy in preventing unhealthy attachment?

Is embodied interaction necessary for language meaning and genuine agency?

Why does shared practice matter for meaning to take hold?

Why do LLM chatbots fail as independent therapeutic agents?

What factors beyond surface content determine how readers extract meaning differently?

What makes a positive reframing feel authentic rather than dismissive?

What makes dialogue-based explanation more successful than monologue?

What role do first-person pronouns play in sustaining collaborative conversation tone?
