Using Large Language Models to Generate, Validate, and Apply User Intent Taxonomies

Paper · arXiv 2309.13063 · Published September 14, 2023

Log data can reveal valuable information about how users interact with Web search services, what they want, and how satisfied they are. However, analyzing user intents in log data is not easy, especially for emerging forms of Web search such as AI-driven chat. To understand user intents from log data, we need a way to label them with meaningful categories that capture their diversity and dynamics. Existing methods rely on manual or machine-learned labeling, which is either expensive or inflexible for large and dynamic datasets. We propose a novel solution using large language models (LLMs), which can generate rich and relevant concepts, descriptions, and examples for user intents. However, using LLMs to generate a user intent taxonomy and apply it to log analysis can be problematic for two main reasons: (1) such a taxonomy is not externally validated; and (2) there may be an undesirable feedback loop. To address this, we propose a new methodology in which human experts and assessors verify the quality of the LLM-generated taxonomy. We also present an end-to-end pipeline that uses an LLM with a human in the loop to produce, refine, and apply labels for user intent analysis in log data.

Introduction. Understanding the purpose or task behind a user’s request in an information access context is highly desirable if a search or recommender system is to provide the most relevant and meaningful results [59]. However, extracting user intents from log data is extremely difficult for two main reasons: (1) what counts as a user intent is fluid; and (2) log data may not include sufficient context to identify these intents. Additionally, in emerging modalities such as AI-driven chat, users’ understanding, usage, and behaviors are evolving rapidly, which calls for on-demand, task-focused labels and taxonomies. We need new methods to identify, extract, and apply user intents in IR systems, especially those with emerging modalities. Traditional qualitative methods such as coding and thematic analysis are time-consuming and require human expertise [10]. Conversely, existing quantitative methods may not capture the nuances and diversity of user intents and experiences [38].

Discussion / Conclusion. Identifying user intents in online information access is crucial for most search and recommender systems, but doing so is often very challenging. Even with a pre-defined taxonomy of user intents, training an ML model, or using such a model to annotate rapidly changing behavioral traits in new modalities such as AI chat, can be expensive or infeasible. LLMs have been shown to be effective at extracting concepts, descriptions or summaries, and examples from a given set of texts. This capability could be used for building and using taxonomies of user intents, but there is a danger of creating a feedback loop without a clear evaluation. In this paper we presented a novel methodology for using LLMs to generate, validate, and use taxonomies for identifying user intents in various applications. The methodology was demonstrated on the application of understanding user intents in AI chat logs. A case study was then presented that contrasts user intents between search and chat.
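
To make the "apply, then validate against humans" idea concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's implementation: the names `IntentCategory`, `apply_taxonomy`, `agreement_rate`, and `stub_llm_labeler` are hypothetical, the two category names are invented for the example, the LLM call is replaced by a keyword stub, and plain percent agreement stands in for whatever agreement statistic an actual study would use.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class IntentCategory:
    """One taxonomy entry: a short name plus a human-readable description."""
    name: str
    description: str


# Hypothetical interface: anything that maps (log entry, taxonomy) to a category name.
Labeler = Callable[[str, Sequence[IntentCategory]], str]


def apply_taxonomy(logs: Sequence[str], taxonomy: Sequence[IntentCategory],
                   labeler: Labeler, fallback: str = "Other") -> List[str]:
    """Label each log entry, replacing labels outside the taxonomy with a fallback."""
    valid = {category.name for category in taxonomy}
    labels = []
    for entry in logs:
        label = labeler(entry, taxonomy)
        labels.append(label if label in valid else fallback)
    return labels


def agreement_rate(labels_a: Sequence[str], labels_b: Sequence[str]) -> float:
    """Fraction of items on which two annotators gave the same label."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have the same length")
    if not labels_a:
        return 0.0
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


def stub_llm_labeler(entry: str, taxonomy: Sequence[IntentCategory]) -> str:
    """Stand-in for an LLM call (assumption): match category names as keywords."""
    text = entry.lower()
    for category in taxonomy:
        if category.name.lower() in text:
            return category.name
    return "unknown"  # not in the taxonomy, so apply_taxonomy maps it to the fallback


# Invented example data, for illustration only.
taxonomy = [
    IntentCategory("Learn", "The user wants to understand a topic."),
    IntentCategory("Create", "The user wants new content produced."),
]
logs = ["I want to learn about taxonomies", "create a poem", "hello"]

llm_labels = apply_taxonomy(logs, taxonomy, stub_llm_labeler)
human_labels = ["Learn", "Create", "Learn"]  # hypothetical assessor labels
agreement = agreement_rate(llm_labels, human_labels)
```

Keeping the labeler behind a plain callable means the same code path can be run with a human assessor's labels or a model's labels, so the evaluation does not depend on the model that produced the taxonomy.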

