HiTKG: Towards Goal-Oriented Conversations via Multi-Hierarchy Learning

Human conversations are guided by short-term and long-term goals. We study how to plan short-term goal sequences as coherently as humans do and naturally direct them to an assigned long-term goal in open-domain conversations. Goal sequences are a series of knowledge graph (KG) entity-relation connections generated by KG walkers that traverse the KG. Existing recurrent and graph-attention-based KG walkers either make insufficient use of the conversation state or lack global guidance. In our work, a hierarchical model learns goal planning within a hierarchical learning framework. We present HiTKG, a hierarchical transformer-based graph walker that leverages multiscale inputs to make precise and flexible predictions on KG paths. Furthermore, we propose a two-hierarchy learning framework that employs two stages to learn both turn-level (short-term) and global-level (long-term) conversation goals. Specifically, in the first stage, HiTKG is trained in a supervised fashion to learn how to plan turn-level goal sequences; in the second stage, HiTKG learns to approach the assigned global goal naturally via reinforcement learning. In addition, we propose MetaPath as the backbone method for KG path representation to exploit entity and relation information concurrently.

Introduction. Building a human-like dialogue system has been a long-standing goal in the conversational AI community (Ni et al. 2021; Ma et al. 2020). In pursuit of this goal, multiple research topics have emerged: context awareness (Qiu et al. 2020), response coherence (Liu et al. 2019a) and diversity (Su et al. 2020), speaker consistency (Madotto et al. 2019), empathetic response (Song et al. 2019), conversation topic (Wu et al. 2019), knowledge-grounded systems (Chen et al. 2020), etc. The conversation goal is one of the most representative elements that reflect human intelligence. Human conversations are usually guided by several small goals or a global goal. As shown in Fig. 1, Grilled Fish, Chinese Dish, China Town, and Cinema are turn-level goals, while Cinema is also the global goal. During the conversation, the agent intends to approach the global goal by naturally transitioning between turn-level goals. However, most dialogue systems respond passively to the user without explicit goals, which leads to incoherent or illogical responses.
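To make the idea of a goal sequence as a KG path concrete, the following is a minimal sketch, not the HiTKG model. It assumes a hypothetical toy graph whose entities are the four goals named above plus one invented distractor (Rice); all relation names are invented for illustration. A plain breadth-first search stands in for the learned walker, so it shows only the shape of the output (alternating entities and relations ending at the global goal), not how HiTKG chooses transitions.

```python
from collections import deque

# Hypothetical toy KG as (head entity, relation, tail entity) triples.
# Four entity names follow the example in the text; "Rice" and every
# relation name are assumptions made up for this illustration.
TRIPLES = [
    ("Grilled Fish", "is_a", "Chinese Dish"),
    ("Chinese Dish", "served_in", "China Town"),
    ("China Town", "has_venue", "Cinema"),
    ("Grilled Fish", "goes_with", "Rice"),
    ("Rice", "is_a", "Chinese Dish"),
]


def build_adjacency(triples):
    """Map each entity to its outgoing (relation, tail entity) edges."""
    adjacency = {}
    for head, relation, tail in triples:
        adjacency.setdefault(head, []).append((relation, tail))
        adjacency.setdefault(tail, [])
    return adjacency


def plan_goal_sequence(triples, start, global_goal):
    """Return a shortest entity-relation path from start to global_goal.

    Breadth-first search is a stand-in for a learned KG walker. The
    result alternates entities and relations, or is None when the
    global goal cannot be reached along directed edges.
    """
    adjacency = build_adjacency(triples)
    if start not in adjacency or global_goal not in adjacency:
        return None
    queue = deque([(start, [start])])
    visited = {start}
    while queue:
        entity, path = queue.popleft()
        if entity == global_goal:
            return path
        for relation, neighbor in adjacency[entity]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, path + [relation, neighbor]))
    return None


path = plan_goal_sequence(TRIPLES, "Grilled Fish", "Cinema")
print(" -> ".join(path))
```

In this sketch the intermediate entities on the returned path play the role of turn-level goals and the final entity is the global goal. A learned walker would replace the exhaustive search with a policy that scores candidate edges given the dialogue context.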

Discussion / Conclusion. We propose HiTKG, a hierarchical transformer-based KG walker that leverages multiscale inputs for graph reasoning in dialogues. HiTKG first learns to plan natural turn-level goals and then learns to approach a global goal. Both automatic and human evaluations demonstrate the effectiveness of our method. In the future, we will investigate how to improve the embedding, learning framework, and evaluation criteria of stage 2 to further extend this topic.

Lines of inquiry this paper opens

Research framings built by reading the notes related to this paper — the questions it feeds into.

Why do agents confidently report success despite actually failing tasks?

Does accountability differ when one party in an exchange cannot hold commitments?

How should conversational agents balance goal-driven initiative with user control?

What dialogue dynamics distinguish negotiation from standard information-provision tasks?

How should dialogue recommender systems manage conversation history and state?

How should dialogue state tracking change when user preferences shift mid-conversation?

Why do language models reinforce false assumptions instead of correcting them?

How should dialogue systems represent uncertainty from noisy speech input?

How can language models sustain linguistic synchrony and intersubjectivity during dialogue?

Can AI ever lead conversations without the anticipatory presence sustained attention provides?

How does AI-generated content transformation affect public discourse quality?

How does AI lose correct information under conversational persuasive pressure?

Why do multi-turn conversations degrade AI intent and coherence?

How do training priors constrain what context information can override?

Can next-token prediction alone produce genuine language understanding?

How does the silent token approach compare to modeling intrinsic motivation for speaking?

Why can't humans reliably detect AI-generated text despite measurable linguistic signatures?

Can AI detect sense-of-nonsense the way human readers do?

How should models express uncertainty rather than forced confident answers?

Does uncertainty quantification in model responses reduce persuasive impact on audiences?

Can model confidence signals reliably improve reasoning quality and calibration?

Do verbal uncertainty estimates calibrate better than confidence scores for personalization?

How can persona representations reduce language model variance and improve task accuracy?

Why does model uncertainty dominate persona-specific knowledge in annotation tasks?

How do we evaluate AI systems when user perception misleads actual performance?

Can systems recognize and abstain on judgments rather than hallucinating preferences?

What properties determine whether reward signals teach genuine reasoning?

Why does combining natural language with numerical scores improve prediction accuracy?

How can models identify insufficient information and respond appropriately without guessing?

How do models signal knowledge gaps through token probability?

Why does self-revision increase model confidence while degrading accuracy?

Can single models correct their own beliefs without amplifying confidence in wrong answers?
