Intrinsically Motivated Graph Exploration Using Network Theories of Human Curiosity
Intrinsically motivated exploration has proven useful for reinforcement learning, even without additional extrinsic rewards. When the environment is naturally represented as a graph, how best to guide exploration remains an open question. In this work, we propose a novel approach for exploring graph-structured data motivated by two theories of human curiosity: the information gap theory and the compression progress theory. The theories view curiosity as an intrinsic motivation to optimize for topological features of subgraphs induced by the visited nodes in the environment. We use these proposed features as rewards for graph neural network-based reinforcement learning. On multiple classes of synthetically generated graphs, we find that trained agents generalize to larger environments and to longer exploratory walks than are seen during training. Our method is also more computationally efficient than greedy evaluation of the relevant topological properties. The proposed intrinsic motivations bear particular relevance for recommender systems. We demonstrate that curiosity-based recommendations are more predictive of human behavior than PageRank centrality for several real-world graph datasets, including MovieLens, Amazon Books, and Wikispeedia.
Introduction. Providing a task-agnostic incentive for exploration as an intrinsic reward has proven useful in a variety of reinforcement learning settings, even in the absence of any task-specific (extrinsic) rewards [1, 2]. Termed curiosity in reference to the analogous drive in humans, this incentive has previously been formulated through different means of quantifying the novelty or surprisal of states encountered by an agent [3]. If states are represented as graphs, the task-agnostic motivation to explore can additionally be content-agnostic, depending only on the topological properties of the visited state subgraph. Leading theories of curiosity in humans are similarly content-agnostic, based on structural properties of a relational graph that connects atoms of knowledge without regard to their actual content [4]. Theories of curiosity attempt to describe the intrinsic motivations that underlie human decision-making when acquiring information through exploration. The information gap theory (IGT) argues that curiosity collects knowledge that regulates gaps in our understanding of the world [5].
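To make the idea of a content-agnostic reward concrete, the following minimal sketch scores an exploratory walk using only the topology of the subgraph induced by the visited nodes. The text above does not specify which topological features are used, so the cyclomatic number (edges minus nodes plus connected components, i.e. the number of independent cycles) is an assumed stand-in chosen purely for illustration. The function names `induced_edges`, `count_components`, `cycle_count`, and `walk_rewards` are hypothetical, and the graph is assumed to be simple and undirected.

```python
def induced_edges(edges, visited):
    """Edges of the full graph whose endpoints have both been visited."""
    return [(u, v) for u, v in edges if u in visited and v in visited]


def count_components(nodes, edges):
    """Number of connected components, computed with a small union-find."""
    parent = {n: n for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v
    return len({find(n) for n in nodes})


def cycle_count(edges, visited):
    """Cyclomatic number E - V + C of the subgraph induced by `visited`.

    Assumed illustrative feature, not necessarily the one used in the paper.
    """
    visited = set(visited)
    sub = induced_edges(edges, visited)
    return len(sub) - len(visited) + count_components(visited, sub)


def walk_rewards(edges, walk):
    """Feature value after each step of the walk (no node content is used)."""
    return [cycle_count(edges, walk[: t + 1]) for t in range(len(walk))]


# A triangle 0-1-2 with a pendant node 3 attached to node 2.
example_edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
example_walk = [0, 1, 2, 3]
print(walk_rewards(example_edges, example_walk))  # [0, 0, 1, 1]
```

In this toy walk the feature rises from 0 to 1 when node 2 is visited, because that step closes the triangle in the induced subgraph, and it is unchanged by the pendant node 3. A reward of this kind depends only on which nodes have been visited and how they are connected, which is the sense in which the motivation is content-agnostic.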
Discussion / Conclusion. We can use intrinsic motivations that underpin human curiosity to train neural networks to explore graph-structured environments with diverse topological structures. Our approach generalizes to longer exploratory walks and larger environments than are seen during training. Importantly, relying only on the structure of the visited subgraph and without any domain-specific node features, we find that our method is more predictive of human behavior than PageRank centrality for several real-world graph datasets.