Quantifying Controversy on Social Media

Paper · arXiv 1507.05224 · Published July 18, 2015

Which topics spark the most heated debates on social media? Identifying those topics is not only interesting from a societal point of view, but also allows the filtering and aggregation of social media content for disseminating news stories. In this paper, we perform a systematic methodological study of controversy detection by using the content and the network structure of social media. Unlike previous work, rather than studying controversy in a single hand-picked topic and relying on domain-specific knowledge, we take a general approach to study topics in any domain. Our approach to quantifying controversy is based on a graph-based three-stage pipeline, which involves (i) building a conversation graph about a topic; (ii) partitioning the conversation graph to identify potential sides of the controversy; and (iii) measuring the amount of controversy from characteristics of the graph. We perform an extensive comparison of controversy measures, different graph-building approaches, and data sources. We use both controversial and non-controversial topics on Twitter, as well as other external datasets. We find that our new random-walk-based measure outperforms existing ones in capturing the intuitive notion of controversy, and show that content features are vastly less helpful in this task.
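
The three-stage pipeline can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`build_graph`, `bisect`, `random_walk_score`) and parameters are hypothetical, the two-seed breadth-first split is a crude stand-in for a proper graph partitioner, and the score is only one plausible random-walk formulation, since this excerpt does not spell out the measure. The sketch assumes a non-empty graph whose node labels are sortable (for example, user-name strings).

```python
import random
from collections import defaultdict, deque


def build_graph(interactions):
    """Stage (i): undirected conversation graph from (user, user) pairs."""
    adj = defaultdict(set)
    for u, v in interactions:
        if u != v:  # ignore self-interactions
            adj[u].add(v)
            adj[v].add(u)
    return dict(adj)


def _bfs_dist(adj, source):
    """Hop distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in adj[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist


def bisect(adj):
    """Stage (ii): split nodes by the nearer of two far-apart seeds.

    Assumption: a simple stand-in for a real graph partitioner.
    """
    far = float("inf")
    d0 = _bfs_dist(adj, min(adj))
    seed_x = max(sorted(d0), key=d0.get)  # farthest from an arbitrary start
    dx = _bfs_dist(adj, seed_x)
    seed_y = max(sorted(dx), key=dx.get)  # farthest from seed_x
    dy = _bfs_dist(adj, seed_y)
    side_x = {n for n in adj if dx.get(n, far) <= dy.get(n, far)}
    return side_x, set(adj) - side_x


def random_walk_score(adj, side_x, side_y, k=2, walks=2000,
                      max_steps=1000, seed=0):
    """Stage (iii): illustrative random-walk score in [-1, 1].

    Walks start on a side and stop at the first high-degree "hub" they
    reach. With P(A|B) = Pr[walk started on A | it ended on B], the score
    is P(X|X) * P(Y|Y) - P(X|Y) * P(Y|X). Assumption: hubs are the k
    highest-degree nodes per side, with equal walk counts per side.
    """
    if not side_x or not side_y:
        return 0.0
    rng = random.Random(seed)
    nbrs = {n: sorted(vs) for n, vs in adj.items()}

    def hubs(side):
        return sorted(side, key=lambda n: (-len(adj[n]), n))[:k]

    hub_side = {h: "X" for h in hubs(side_x)}
    hub_side.update({h: "Y" for h in hubs(side_y)})

    counts = defaultdict(int)  # (start side, end side) -> number of walks
    for label, side in (("X", sorted(side_x)), ("Y", sorted(side_y))):
        for _ in range(walks):
            node = rng.choice(side)
            for _ in range(max_steps):
                node = rng.choice(nbrs[node])
                if node in hub_side:
                    counts[(label, hub_side[node])] += 1
                    break

    def p(start, end):
        total = counts[("X", end)] + counts[("Y", end)]
        return counts[(start, end)] / total if total else 0.0

    return p("X", "X") * p("Y", "Y") - p("X", "Y") * p("Y", "X")


if __name__ == "__main__":
    # Toy data: two tight groups joined by a single bridging interaction.
    def clique(names):
        return [(a, b) for i, a in enumerate(names) for b in names[i + 1:]]

    group_a = ["a1", "a2", "a3", "a4"]
    group_b = ["b1", "b2", "b3", "b4"]
    graph = build_graph(clique(group_a) + clique(group_b) + [("a1", "b1")])
    x, y = bisect(graph)
    print(sorted(x), sorted(y), random_walk_score(graph, x, y))
```

On toy inputs such as two tightly knit groups joined by one bridging edge, this score should come out well above the score for a fully mixed graph, which is the qualitative behaviour a controversy measure is meant to capture. The fixed `seed` keeps the Monte Carlo estimate reproducible.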

Introduction. Given their widespread diffusion, online social media have become increasingly important in the study of social phenomena such as peer influence, framing, bias, and controversy. Ultimately, we would like to understand how users perceive the world through the lens of their social media feed. However, before addressing these advanced application scenarios, we first need to focus on the fundamental yet challenging task of distinguishing whether a topic of discussion is controversial. Our work is motivated by interest in observing controversies at the societal level, monitoring their evolution, and possibly understanding which issues become controversial and why. The study of controversy in social media is not new; there are many previous studies aimed at identifying and characterizing controversial issues, mostly around political debates [1, 10, 39, 40] but also for other topics [27]. And while most recent papers have focused on Twitter [10, 27, 39, 40], controversy on other platforms, such as blogs [1] and opinion fora [2], has also been analyzed.

Discussion / Conclusion. The task we tackle in this work is certainly not an easy one, and this study has some limitations, which we discuss in this section. We also report a set of negative results that we produced while coming up with the measures presented. We believe these results will be very useful in steering this research topic towards a fruitful direction. Table 4 provides a summary of the various graph building strategies and controversy measures we tried for quantifying controversy.

10.3 Conclusions. In this paper, we performed the first large-scale systematic study for quantifying controversy in social media. We have shown that previously used measures are not reliable and demonstrated that controversy can be identified in both the retweet graph and the topic-induced follow graph.