The emergence of AI companion applications has created novel forms of intimate human-AI relationships, yet empirical research on these communities remains limited. We present the first large-scale com…
Recent improvements in large language models (LLMs) have led many researchers to focus on building fully autonomous AI agents. This position paper questions whether this approach is the right path for…
Different people have different perceptions about artificial intelligence (AI). It is extremely important to bring together all the alternative frames of thinking, from the various communities…
Self-improvement is a goal currently exciting the field of AI, but is fraught with danger, and may take time to fully achieve. We advocate that a more achievable and better goal for humanity is to max…
PAOLO MONTI Università degli Studi di Milano Bicocca Dipartimento di Scienze Umane per la Formazione “Riccardo Massa” paolo.monti@unimib.it ABSTRACT Large Language Models (LLMs) are generative AI syst…
By its nature, intelligence is high-dimensional and relational, not a single quantity that must be unambiguously less or greater than human scale. In fact, it is unclear what we even mean by “human sc…
By exploring past incarnations of agents, we can understand what has been done previously, what worked, and more importantly, what did not pan out and why. This understanding lets us examine what d…
Dishonesty is far from a new phenomenon. But as chatbots, online forms, and other digital interfaces grow more and more common across a wide range of customer service applications, bending the truth t…
We present an approach for automatically generating and testing, in silico, social scientific hypotheses. This automation is made possible by recent advances in large language models (LLMs), but the ke…
Benchmarks like Massive Multitask Language Understanding (MMLU) have played a pivotal role in evaluating AI’s knowledge and abilities across diverse domains. However, existing benchmarks predominantly…
The dominant practice of AI alignment assumes (1) that preferences are an adequate representation of human values, (2) that human rationality can be understood in terms of maximizing the satisfaction …
Large language models (LLMs) are capable of successfully performing many language processing tasks zero-shot (without training data). If zero-shot LLMs can also reliably classify and explain social ph…
Large language models such as ChatGPT enable users to automatically produce text but also raise ethical concerns, for example about authorship and deception. This paper analyses and discusses…
By and large, current scholarship examining ChatGPT and generative AI shows a strong anthropocentric motivation or a human–institutional focus. Many studies look at the structural impact of the techno…
We argue that the language modeling task, because it only uses form as training data, cannot in principle lead to learning of meaning. We take the term language model to refer to any system trained on…
Large language models (LLMs) have significantly advanced the field of artificial intelligence. Yet, evaluating them comprehensively remains challenging. We argue that this is partly due to the predomi…
Chain-of-Thought (CoT) prompting helps models think step by step. But what happens when they must see, understand, and judge—all at once? In visual tasks grounded in social context, where bridging per…
Large language models (LLMs) provide a compelling foundation for building generally-capable AI agents. These agents may soon be deployed at scale in the real world, representing the interests of indiv…
In today’s world of fast-growing technology and an inexhaustible amount of data, there is a great need to control and verify data validity due to the possibility of fraud. Therefore, the need for a re…
Understanding how users perceive content from generative AI tools is crucial because it can help reduce unwarranted trust in inaccurate information and mitigate the spread of misinformation. A focus g…
Large language models (LLMs) exhibit compelling linguistic behaviour, and sometimes offer self-reports, that is to say statements about their own nature, inner workings, or behaviour. In humans, such …
Addressing collective issues in social development requires a high level of social cohesion, characterized by cooperation and close social connections. However, social cohesion is challenged by selfis…
What do real conversations with Claude tell us about the effects of AI on labor productivity? Using our privacy-preserving analysis method, we sample one hundred thousand real conversations from Claud…
As AI-powered systems increasingly mediate consequential decision-making, their explainability is critical for end-user…
The responsibility gap, commonly described as a core challenge for the effective governance of, and trust in, AI and autonomous systems (AI/AS), is traditionally associated with a failure of …
method leverages the inherent vulnerabilities of LLMs in handling world knowledge, which can be exploited by attackers to unconsciously spread fabricated information. Through extensive experiments, we…
Autonomous agents are moving from tools into a layer of social infrastructure: they browse, purchase, deploy software, manage systems, and increasingly interact with one another. As these systems scal…
In recent years, Large Language Models (LLMs) have become sophisticated and capable enough to be applied in many situations and tasks. These tasks are not limited to information extract…
Our framework features an audio-enhanced mini-interview to capture nuanced worker desires and introduces the HumanAgency Scale (HAS) as a shared language to quantify the preferred level of human invol…
In many cases, people will not interact directly with AI systems but instead read conversations between AI systems and other people. We measured how well people and large language models can discrimin…
This paper examines the systemic risks posed by incremental advancements in artificial intelligence, developing the concept of ‘gradual disempowerment’, in contrast to the abrupt takeover scenarios co…
There is much discussion of the false outputs that generative AI systems such as ChatGPT, Claude, Gemini, DeepSeek, and Grok create. In popular terminology, these have been dubbed AI halluci…
AI assistance produces significant productivity gains across professional domains, particularly for novice workers. Yet how this assistance affects the development of skills required to effectively su…
But how compelling are these AI-generated ideas, and how can we improve their quality? Here, we introduce SciMuse, which uses 58 million research papers and a large-language model to generate research…
Recent advances in large language models (LLMs) have enabled richer social simulations, allowing for the study of various social phenomena. However, most recent work has used a more omniscient perspect…
[No public URL; single-author preprint by Valerio Capraro] LLMorphism is the biased belief that human cog…
This paper examines some limitations of large language models (LLMs) through the framework of Peircean semiotics. We argue that basic LLMs exist within a "hall of mirrors," manipulating symbols withou…
In this paper, we uncover notable diversity in the ideological stance exhibited across different LLMs and languages in which they are accessed. We do this by prompting a diverse panel of popular LLMs …
Large language models sometimes produce structured, first-person descriptions that explicitly reference awareness or subjective experience. To better understand this behavior, we investigate one theor…
Newly-developed large language models (LLM)—because of how they are trained and designed—are implicit computational models of humans—a homo silicus. LLMs can be used like economists use homo economicu…
Understanding Theory of Mind is essential for building socially intelligent multimodal agents capable of perceiving and interpreting human behavior. We introduce MOMENTS (Multimodal Mental States), a …
Digital platforms increasingly use online behavioral targeting (OBT) to enhance consumers’ engagement, which involves using algorithms to “gaze” at consumers—tracking their online activities and infer…
This study focused on three main research objectives: analyzing the methods used to identify deceptive online consumer reviews, evaluating insights provided by multi-method automated approaches based …
Artificial intelligence (AI) is the name popularly given to a broad spectrum of computer tools designed to perform increasingly complex cognitive tasks, including many that used to solely be…
RLHF assumes that annotation responses reflect genuine human preferences. We argue this assumption warrants systematic examination, and that behavioral science offers frameworks that bring clarity to …
We address this gap by analyzing data from the AI Search Arena, a head-to-head evaluation platform for AI search systems. The dataset comprises over 24,000 conversations and 65,000 responses from mode…
This report outlines several case studies on how actors have misused our models, as well as the steps we have taken to detect and counter such misuse. By sharing these insights, we hope to protect the…
Synthesizing unstructured research materials into manuscripts is an essential yet under-explored challenge in AI-driven scientific discovery. Existing autonomous writers are rigidly coupled to specifi…
We evaluated 3 systems (ELIZA, GPT-3.5 and GPT-4) in a randomized, controlled, and preregistered Turing test. Human participants had a 5 minute conversation with either a human or an AI, and judged wh…
In our environment, agents role-play and interact under a wide variety of scenarios; they coordinate, collaborate, exchange, and comp…
AI systems are increasingly designed in ways that lead users to perceive them as conscious. This paper provides a unified framework connecting empirical hallmarks of consciousness attribution to a str…
We consider situations where a user feeds her attributes to a machine learning method that tries to predict her best option based on a random sample of other users. The predictor is incentive-compatib…
Simulating society with large language models (LLMs), we argue, requires more than generating plausible behavior; it demands cognitively grounded reasoning that is structured, revisable, and traceable…
People rely on social skills like conflict resolution to communicate effectively and to thrive in both work and personal life. However, practice environments for social skills are typically out of rea…
Large language models (LLMs) encapsulate vast amounts of knowledge but still remain vulnerable to external misinformation. Existing research mainly studied this susceptibility behavior in a single-tur…
The proliferation of AI-generated and AI-assisted text on the internet is feared to contribute to a degradation in semantic and stylistic diversity, factual accuracy, and other negative developments (…
As Bainbridge [7] noted, a key irony of automation is that by mechanising routine tasks and leaving exception-handling to the human user, you deprive the user of the routine opportunities to practice …
The rapid integration of large language models (LLMs) into everyday workflows has transformed how individuals perform cognitive tasks such as writing, programming, analysis, and multilingual communica…
We outline some common methodological issues in the field of critical AI studies, including a tendency to overestimate the explanatory power of individual samples (the benchmark casuistry), a dependen…
In this paper, we contend that the designers and final users of these ML methods have forgotten a fundamental lesson from statistics: correlation does not imply causation. Not only do most state-of-th…
Generative artificial intelligence (AI) has the potential to both exacerbate and ameliorate existing socioeconomic inequalities. In this article, we provide a state-of-the-art interdisciplinary overvi…
Conversational Swarm Intelligence (CSI) is a new technology that enables human groups of potentially any size to hold real-time deliberative conversations online. Modeled on the dynamics of …
Consumers of services and products exhibit a wide range of behaviors on social networks when they are dissatisfied. In this paper, we consider three types of cynical expressions – negative feelings, s…
Lying appears in everyday oral and written communication. As a consequence, detecting it on the basis of linguistic analysis is particularly important. Our study aimed to verify whether the difference…
We introduce a new type of test, called a Turing Experiment (TE), for evaluating to what extent a given language model, such as GPT models, can simulate different aspects of human behavior. A TE can a…
When producing deceptive narratives, liars employ verbal strategies to create false beliefs in the interacting partners and are thus involved in a specific and temporary psychological and emotional st…
This paper argues that generative AI should be understood not as a mimicry of human cognition, but as a form of alternative intelligence and alternative creativity, operating through distinct mechanis…
This chapter explores theoretically the long-run implications of Artificial General Intelligence (AGI) for economic growth and labor markets. AGI makes it feasible to perform all economically valuable…
I’ll begin by defining intelligence and AGI. There are a number of positions [6, 2, 7–12]. Some peg AGI to human-level performance across a broad range of tasks [13, 1]. This is intuitive, but anth…
Introduction. Who’s Afraid of (Left) Hyperstitions? By Armen Avanessian and Anke Hennig Introduction The word “hyperstition” is a conflation of hype and superstition. Hyperstitions are fictions that c…
In this work, we take a step toward that goal by analyzing the work activities people do with AI, how successfully and broadly those activities are done, and combine that with data on what occupations…
limitations. This study focuses on finding out the cognitive cost of using an LLM in the educational context of writing an essay. We assigned participants to three groups: LLM group, Search Engine gr…
An increasing number of researchers and designers are envisioning a wide range of novel proactive conversational services for smart speakers such as context-aware reminders and restocking household items…