HonestBait: Forward References for Attractive but Faithful Headline Generation
Current methods for generating attractive headlines often learn directly from data, basing attractiveness on the number of user clicks and views. Although clicks and views do reflect user interest, they can fail to reveal how much of that interest is raised by the writing style and how much is due to the event or topic itself. Such approaches can also lead to harmful inventions by exaggerating the content, aggravating the spread of false information. In this work, we propose HonestBait, a novel framework that addresses these issues from another angle: generating headlines using forward references (FRs), a writing technique often used for clickbait. A self-verification process is included during training to avoid spurious inventions. We begin with a preliminary user study to understand how FRs affect user interest, after which we present PANCO, an innovative dataset containing pairs of fake news with verified news for attractive but faithful news headline generation. Automatic metrics and human evaluations show that our framework yields more attractive results (+11.25% compared to human-written verified news headlines) while maintaining high veracity, which helps promote real information to fight against fake news.
Introduction. Fake news has become a medium by which to spread misinformation (Oshikawa et al., 2020; Vicario et al., 2019). One common way to fight against fake news is to release verified news. However, as the goal of news verification is to correct misinformation, verified news headlines are often bland, making it difficult to gain the attention of users, which works against the need to alleviate the harmful impact of fake news. Therefore, headlines for verified news articles should be rewritten to be more intriguing but still faithful, which is expected to pique reader interest in verified news. Many studies have been conducted on generating attractive headlines (Jin et al., 2020; Xu et al., 2019), among which clickbait represents the style that generates the most reads or clicks. Despite their success in attracting readers, current models face several challenges.
Discussion / Conclusion. We present HonestBait, a novel framework for generating faithful but interesting headlines from a new angle: forward references. Moreover, we construct PANCO, a novel dataset that includes the title and content of pairs of fake and verified news, along with their forward reference types for further research. Our user study shows that verified news headlines are relatively boring, and forward references are used in most headlines liked by readers. Experimental results show that HonestBait outperforms all baselines in both automatic and human evaluations, which demonstrates its effectiveness in generating attractive but faithful headlines. We expect HonestBait to help rewrite monotonous real news headlines to increase their exposure rate and thereby help combat fake news. Although HonestBait shows promising results for generating attractive but faithful headlines, it still has some limitations. HonestBait is a monolingual model that only supports Chinese, and it requires three pre-trained scorers. Also, as the FR labels are particularly difficult to obtain, it is not easy to implement in other languages.