AI models such as ChatGPT routinely misrepresent news events, providing faulty responses to questions almost half the time, a study has found.
The study published on Wednesday by the European Broadcasting Union (EBU) and the BBC assessed the accuracy of more than 2,700 responses given by OpenAI’s ChatGPT, Google’s Gemini, Microsoft’s Copilot, and Perplexity.
Twenty-two public media outlets, representing 18 countries and 14 languages, posed a common set of questions to the AI assistants between late May and early June for the study.
Overall, 45 percent of responses had at least one “significant” issue, according to the research.
Sourcing was the most common problem: 31 percent of responses contained information not supported by the cited source, incorrect attribution or unverifiable attribution, among other issues.
A lack of accuracy was the next biggest contributor to faulty answers, affecting 20 percent of responses, followed by the absence of appropriate context, with 14 percent.
Gemini performed worst, with significant issues in 76 percent of its responses, mainly related to sourcing, according to the study.
All the AI models studied made basic factual errors, according to the research.
The cited errors include Perplexity claiming that surrogacy is illegal in Czechia and ChatGPT naming Pope Francis as the sitting pontiff months after his death.
OpenAI, Google, Microsoft and Perplexity did not immediately respond to requests for comment.
In a foreword to the report, Jean Philip De Tender, the EBU’s deputy director general, and Pete Archer, the head of AI at the BBC, called on tech firms to do more to reduce errors in their products.
“They have not prioritised this issue and must do so now,” De Tender and Archer said.
“They also need to be transparent by regularly publishing their results by language and market.”