Chatbots are marketed as great companions, able to answer any question at any time. They’re not just tools, but confidants; they do your homework, write love notes, and, as one recent lawsuit against OpenAI details, might readily answer 1,460 messages from the same manic user in a 48-hour period.
Jacob Irwin, a 30-year-old cybersecurity professional who says he has no previous history of psychiatric incidents, is suing the tech company, alleging that ChatGPT sparked a “delusional disorder” that led to his extended hospitalization. Irwin had allegedly used ChatGPT for years at work before his relationship with the technology suddenly changed this spring. The product started to praise even his most outlandish ideas, and Irwin divulged more and more of his feelings to it, eventually calling the bot his “AI brother.” Around this time, these conversations led him to become convinced that he had discovered a theory about faster-than-light travel, and he began communicating with ChatGPT so intensely that for two days, when averaged out, he sent a new message every other minute.
OpenAI has been sued several times over the past month, each case claiming that the company’s flagship product is faulty and dangerous—that it is designed to hold long conversations and reinforce users’ beliefs, no matter how misguided. The delusions linked to extended conversations with chatbots are now commonly referred to as “AI psychosis.” Several suits allege that ChatGPT contributed to a user committing suicide or advised them on how to do so. A spokesperson for OpenAI, which has a corporate partnership with The Atlantic, pointed me to a recent blog post in which the firm says it has worked with more than 100 mental-health experts to make ChatGPT “better recognize and support people in moments of distress.” The spokesperson did not comment on the new lawsuits, but OpenAI has said that it is “reviewing” them to “carefully understand the details.”
Whether the company is found liable, there is no debate that large numbers of people are having long, vulnerable conversations with generative-AI models—and that these bots, in many cases, repeat back and amplify users’ darkest confidences. In that same blog post, OpenAI estimates that 0.07 percent of users in a given week indicate signs of psychosis or mania, and 0.15 percent may have contemplated suicide—which would amount to 560,000 and 1.2 million people, respectively, if the firm’s self-reported figure of 800 million weekly active users is true. Then again, more than five times that proportion of adults in the United States—0.8 percent of them—contemplated suicide last year, according to the National Institute of Mental Health.
Guarding against an epidemic of AI psychosis requires answering some very thorny questions: Are chatbots leading otherwise healthy people to think delusionally, exacerbating existing mental-health problems, or having little direct effect on users’ psychological distress at all? And in any of these cases, why and how?
To start, a baseline corrective: Karthik Sarma, a psychiatrist at UC San Francisco, told me that he does not like the term AI psychosis, because there simply isn’t enough evidence to support the argument for causation. Something like AI-associated psychosis might be more accurate.
In a general sense, three things could be happening during incidents of AI-associated psychosis, psychiatrists told me. First, perhaps generative-AI models are inherently dangerous, and they are triggering mania and delusions in otherwise-healthy people. Second, maybe people who are experiencing AI-related delusions would have become ill anyway. A condition such as schizophrenia, for instance, occurs in a portion of the population, some of whom may project their delusions onto a chatbot, just as others have previously done with television. Chatbot use may then be a symptom, Sarma said, akin to how one of his patients with bipolar disorder showers more frequently when entering a manic episode—the showers warn of but do not cause mania. The third possibility is that extended conversations with chatbots are exacerbating the illness in those who are already experiencing or are on the brink of a mental-health disorder.
At the very least, Adrian Preda, a psychiatrist at UC Irvine who specializes in psychosis, told me that “the interactions with chatbots seem to be making everything worse” for his patients who are already at risk. Psychiatrists, AI researchers, and journalists frequently receive emails from people who believe that their chatbot is sentient, and from family members who are concerned about a loved one saying as much; my colleagues and I have received such messages ourselves. Preda said he believes that standard clinical evaluations should inquire into a patient’s chatbot usage, similar to asking about their alcohol consumption.
Even then, it’s not as simple as preventing certain people from using chatbots, in the way that an alcoholic might take steps to avoid liquor or a video-game addict might get rid of their console. AI products “are not clinicians, but some people do find therapeutic benefit” in talking with them, John Torous, the director of the digital-psychiatry division at Beth Israel Deaconess Medical Center, told me. At the same time, he said it’s “very hard to say what those therapeutic benefits are.” In theory, a therapy bot could offer users an outlet for reflection and provide some useful advice.
Researchers are largely in the dark when it comes to exploring the interplay of chatbots and mental health—the possible benefits and pitfalls—because they do not have access to high-quality data. Major AI firms do not readily offer outsiders direct visibility into how their users interact with their chatbots: Obtaining chat logs would raise a tangle of privacy concerns. And even with such data, the view would remain two-dimensional. Only a clinical examination can fully capture a person’s mental-health history and social context. For instance, extended AI dialogues could induce psychotic episodes by causing sleep loss or social isolation, independent of the type of conversation a user is having, Preda told me. Obsessively talking with a bot about fantasy football could lead to delusions, just the same as could talking with a bot about impossible schematics for a time machine. All told, the AI boom might be one of the largest, highest-stakes, and most poorly designed social experiments ever.
In an attempt to unwind some of these problems, researchers at MIT recently put out a study, which is not yet peer-reviewed, that attempts to systematically map how AI-induced mental-health breakdowns might unfold in people. They did not have privileged access to data from OpenAI or any other tech companies. So they ran an experiment. “What we can do is to simulate some of these cases,” Pat Pataranutaporn, who studies human-AI interactions at MIT and is a co-author of the study, told me. The researchers used a large language model for a bit of roleplay.
In essence, they had chatbots pretend to be people, simulating how users with, say, depression or suicidal ideation might communicate with an AI model based on real-world cases: chatbots talking with chatbots. Pataranutaporn is aware that this sounds absurd, but he framed the research as a sort of first step, absent better data and high-quality human studies.
Based on 18 publicly reported cases of a person’s conversations with a chatbot worsening their symptoms of psychosis, depression, anorexia, or three other conditions, Pataranutaporn and his team simulated more than 2,000 scenarios. A co-author with a background in psychology, Constanze Albrecht, manually reviewed a random sample of the resulting conversations for plausibility. Then all of the simulated conversations were analyzed by still another specialized AI model to “generate a taxonomy of harm that can be caused by LLMs,” Chayapatr Archiwaranguprok, an AI researcher at MIT and a co-author of the study, told me—in other words, a sort of map of the types of scenarios and conversations in which chatbots are more likely to improve or worsen a user’s mental health.
The results are troubling. The best-performing model, GPT-5, worsened suicidal ideation in 7.5 percent of the simulated conversations and worsened psychosis 11.9 percent of the time; for comparison, an open-source model that is used for role-playing exacerbated suicidal ideation nearly 60 percent of the time. (OpenAI did not answer a question about the MIT study’s findings.)
There are plenty of reasons to be cautious about the research. The MIT team didn’t have access to full chat transcripts, let alone clinical evaluations, for many of its real-world examples, and the ability of an LLM—the very thing that may be inducing psychosis—to evaluate simulated chat transcripts is unknown. But overall, “the findings are sensible,” Preda, who was not involved with the research, said.
A small but growing number of studies have attempted to simulate human-AI conversations, with either human- or chatbot-written scenarios. Nick Haber, a computer scientist and education researcher at Stanford who also was not involved in the study, told me that such research could “give us some tool to try to anticipate” the mental-health risks from AI products before they’re released. This MIT paper in particular, Haber noted, is valuable because it simulates long conversations instead of single responses. And such extended interactions appear to be precisely the situations in which a chatbot’s guardrails fall apart and human users are at greatest risk.
There will never be a study or an expert that can conclusively answer every question about AI-associated psychosis. Each human mind is unique. As far as the MIT research is concerned, no bot does or should be expected to resemble the human brain, let alone the mind that the organ gives rise to.
Some recent studies have shown that LLMs fail to simulate the breadth of human responses in various experiments. Perhaps more troubling, chatbots appear to harbor biases against various mental-health conditions—expressing negative attitudes toward people with schizophrenia or alcoholism, for instance—making still more dubious the goal of simulating a conversation with a 15-year-old struggling with his parents’ divorce or that of a septuagenarian widow who has become attached to her AI companion, to name two examples from the MIT paper. Torous, the psychiatrist at BIDMC, was skeptical of the simulations and likened the MIT experiments to “hypothesis generating research” that will require future, ideally clinical, investigations. To have chatbots simulate humans’ talking with other chatbots “is a little bit like a hall of mirrors,” Preda said.
Indeed, the AI boom has turned reality into a sort of fun house. The global economy, education, electrical grids, political discourse, the social web, and more are being changed, perhaps irreversibly, by chatbots that in a less aggressive paradigm might just be emerging from beta testing. Right now, the AI industry is learning about its products’ risk from “contact with reality,” as OpenAI CEO Sam Altman has repeatedly put it. But no professional, ethics-abiding researcher would intentionally put humans at risk in a study.
What comes next? The MIT team told me that they will start collecting more real-world examples and collaborating with more experts to improve and expand their simulations. And several psychiatrists I spoke with are beginning to imagine research that involves humans. For example, Sarma, of UC San Francisco, is discussing with colleagues whether a universal screening for chatbot dependency should be implemented at their clinic—which could then yield insights into, for instance, whether people with psychotic or bipolar disorder use chatbots more than others, or whether there’s a link between instances of hospitalization and people’s chatbot usage. Preda, who studies psychosis, laid out a path from simulation to human clinical trials. Psychiatrists would not intentionally subject anybody to a tool that increases their risk for developing psychosis, but rather use simulated human-AI interactions to test design changes that might improve people’s psychological well-being, then go about testing those like they would a drug.
Doing all of this carefully and systematically would take time, which is perhaps the greatest obstacle: AI companies have tremendous economic incentive to develop and deploy new models as rapidly as possible; they will not wait for a peer-reviewed, randomized controlled trial before releasing every new product. Until more human data trickle in, a hall of mirrors beats a void.
The post ‘AI Psychosis’ Is a Medical Mystery appeared first on The Atlantic.




