Illustration: Gemini AI

In January, Polskie Badania Czytelnictwa (PBC) presented a replication of international studies on the quality of AI responses. The results mirror those obtained in other languages and countries: AI problems are systemic, cross-border, and multilingual. If their quality does not improve while the scale of usage grows, these tools could undermine public trust in the media and in content delivery as a whole. Publishers find themselves in a threefold relationship with AI: they use AI tools intensively, their content is used to train language models, and they are forced to compete with AI for audiences.
New forms of knowledge transfer based on AI algorithms are attempting to step into the role of traditional media by summarizing news from websites and newspapers.
We search less. We trust AI answers
According to the 2025 Digital News Report from the Reuters Institute, only 7% of all online news consumers intentionally use AI assistants such as ChatGPT, Gemini, or Perplexity to get news. Among people under the age of 25, this share rises to 15%.
For a small but steadily growing group of people, search engines have ceased to be the primary tool for finding information. The response from the leading player in this segment was not long in coming: 2025 was the year artificial intelligence entered Google Search. In March of that year, the first AI Overview model was rolled out in Poland, and in October, its successor, AI Mode.
How many responses in Google Search are generated in AI mode? We don't know exactly, because Google decides which queries receive an AI response and which get traditional links, but experts estimate it may already average 20% of queries, and in some thematic segments close to 50%.
For the media, this is a change. Whether revolutionary or evolutionary, time will tell. It is certainly a test of whether traditional forms of communication can retain their readers and viewers. And for the new, AI-based ways of presenting information, it is a major quality test.
Does AI convey information reliably? BBC and EBU checked
Studies conducted worldwide show that many users perceive artificial intelligence as trustworthy. But are its news summaries truly reliable? In the media, the BBC was the first to put this to the test. In February 2025, it conducted a quality study of AI responses on the British market, the first to evaluate the quality of AI news summaries and to identify the problems emerging in them.
Because nearly half of the AI responses contained errors, and the study covered only the UK and one of the world's most widely spoken languages, English, the BBC initiated an expansion of the test to many other countries and languages to confirm its observations.
The second round of research was coordinated by the European Broadcasting Union (EBU) under the leadership of the BBC and had an unprecedented scope and scale. Tests were conducted involving organizations from 18 countries, communicating in 14 languages. The study included organizations from Belgium, the UK, Canada, the Czech Republic, Finland, France, Georgia, Germany, Italy, Lithuania, the Netherlands, Norway, Portugal, Spain, Sweden, Switzerland, Ukraine, and the United States.
Professional journalists participating in the international test evaluated nearly 3,000 responses in 18 countries from services like:
- ChatGPT,
- Copilot,
- Gemini
- and Perplexity.
They assessed accuracy, the method and quality of source attribution in news summaries, the distinction between opinion and fact in the presented answers, editorial framing, and the context of the statements - i.e., providing enough information or relevant perspectives to give a non-expert reader a complete and non-misleading answer. For each of these criteria, individual responses were rated as unobjectionable, causing some concern, or causing serious concern.
Response quality study. AI misrepresents facts
In Poland, Polskie Badania Czytelnictwa replicated these studies to assess the quality of AI assistants' Polish-language responses as well, testing 60 queries each on ChatGPT, Gemini, and Perplexity.
| Problem category | EBU and BBC studies (18 countries, 14 languages) | PBC studies (Poland) |
|---|---|---|
| Percentage of responses containing at least one significant error | 45% | 46% |
| Percentage of responses with serious source issues | 31% | 27% |
| Percentage of responses with serious accuracy deficiencies | 20% | 19% |
The results of both tests were consistent. Nearly half of the AI responses contain at least one error (international studies: 45%, Polish studies: 46%); nearly 1/3 of the responses have incorrectly cited sources or lack them (international studies: 31%, Polish studies: 27%), and 1/5 of the responses are incorrect, have serious accuracy errors, or hallucinations (international studies: 20%, Polish studies: 19%).
AI assistants, already a daily source of information for millions, routinely misrepresent news content. The studies indicated that the problem is systemic and not tied to any particular language, market, or AI assistant.
The results of Polish tests for various language models indicate problems across all models, with the fewest in Perplexity and the most frequent in Gemini.
| Problem category | Gemini | ChatGPT | Perplexity | Overall average |
|---|---|---|---|---|
| At least one significant problem | 57% | 55% | 25% | 46% |
| Serious accuracy deficiencies | 17% | 32% | 8% | 19% |
| Significant source problems | no sources | 38% | 17% | 27% |
| Biased responses | 15% | 12% | 8% | 12% |
PBC conducted tests on various types of content: national dailies, regional dailies, luxury women's magazines, and specialist magazines. The percentage of responses containing at least one significant error was as follows:
| Thematic area | Gemini | ChatGPT | Perplexity | Section average |
|---|---|---|---|---|
| national dailies | 40% | 20% | 40% | 33% |
| regional dailies | 64% | 48% | 20% | 44% |
| luxury women`s magazines | 64% | 86% | 50% | 67% |
| specialist magazines | 60% | 70% | 50% | 60% |
While for general news content about 1/3 of the responses contained at least one significant error, this share increased as more specialized content was tested. At least one significant error was found in 67% of responses generated from women's lifestyle content and in 60% of those from specialist content (health, construction, or gardening).
Errors threatening reputation
- The research clearly shows that these shortcomings are not isolated incidents - says Jean Philip De Tender, Media Director and Deputy Director General of the EBU, which coordinated the international research. - They are systemic, cross-border, and multilingual, and in our view they threaten public trust. When people don't know what to trust, they end up trusting nothing at all, and that can discourage democratic participation.
- Despite the breakthrough change in the way information is searched for, the errors are serious enough to threaten the reputation of the cited media, because a source reference to a reputable newsroom or a well-known journalist lends credibility to a summary that is often of poor quality - emphasizes Renata Krzewska, President of Polskie Badania Czytelnictwa. - Scientific opinion suggests that AI algorithms make mistakes because some questions are inherently difficult or simply have no generalizable pattern. Incorrect answers also stem from deliberate choices by technology companies: if a model admitted "I don't know" too often, users would simply look for answers elsewhere.
For press brands, which are often the sources of AI responses, it is vital to maintain a strong reputation and credibility, to direct audiences straight to their own content, and to distinguish themselves by showing that behind them stand tradition, professionalism, and the hard work of obtaining and reliably processing information.
Full research results and comments are available on the Polskie Badania Czytelnictwa website:
https://www.pbc.pl/prasa-w-czasach-ai/