Want the News Summarized Accurately? Don't Ask an "AI".
The Beeb decided to test some LLMs to see how well they could summarize the news: https://www.bbc.com/news/articles/c0m17d8827ko Turns out the answer is "not very well".
In the study, the BBC asked ChatGPT, Copilot, Gemini and Perplexity to summarise 100 news stories, and had journalists with relevant subject expertise rate the quality of each answer from the AI assistants. It found that 51% of all AI answers to questions about the news were judged to have significant issues of some form. Additionally, 19% of AI answers which cited BBC content introduced factual errors, such as incorrect factual statements, numbers and dates.
[...] In her blog, Ms Turness said the BBC was seeking to "open up a new conversation with AI tech providers" so we can "work together in partnership to find solutions".
She called on the tech companies to "pull back" their AI news summaries, as Apple did after complaints from the BBC that Apple Intelligence was misrepresenting news stories.
Some examples of inaccuracies found by the BBC included:
- Gemini incorrectly said the NHS did not recommend vaping as an aid to quit smoking
- ChatGPT and Copilot said Rishi Sunak and Nicola Sturgeon were still in office even after they had left
- Perplexity misquoted BBC News in a story about the Middle East, saying Iran initially showed "restraint" and described Israel's actions as "aggressive"
In general, Microsoft's Copilot and Google's Gemini had more significant issues than OpenAI's ChatGPT and Perplexity, which counts Jeff Bezos as one of its investors. Normally, the BBC blocks its content from AI chatbots, but it opened its website up for the duration of the tests in December 2024. The report said that as well as containing factual inaccuracies, the chatbots "struggled to differentiate between opinion and fact, editorialised, and often failed to include essential context."
Normally I'd add a snide remark, but I don't think I need to this time...
Read more of this story at SoylentNews.