When Google launched AI Overviews in 2024, the company positioned it as a revolution in search. Now the first independent data on what that revolution costs the average user have arrived.
What the research showed
The startup Oumi, commissioned by The New York Times, tested 4,326 Google search queries using SimpleQA, an industry-standard benchmark for measuring the factual accuracy of AI systems. In October 2024, when AI Overviews were powered by Gemini 2, accuracy was 85%. After the upgrade to Gemini 3 in February 2025, the figure rose to 91%.
The number looks convincing — until it's scaled up. Google processes over 5 trillion search queries per year. Even a 9% error rate means tens of millions of false answers per hour. This is not a hypothetical risk — it is the current state of a product used by hundreds of millions of people.
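The headline arithmetic can be checked directly. A minimal sketch, assuming for illustration that the 9% error rate applies to all 5 trillion annual queries (Google does not disclose what share of queries actually triggers an AI Overview):

```python
# Back-of-envelope estimate behind "tens of millions of false answers per hour".
# Assumption: every query receives an AI Overview; the real share is unknown.

QUERIES_PER_YEAR = 5e12      # "over 5 trillion" searches per year
ERROR_RATE = 0.09            # 9% wrong (Gemini 3 figure: 91% accurate)
HOURS_PER_YEAR = 365 * 24    # 8,760 hours

errors_per_hour = QUERIES_PER_YEAR * ERROR_RATE / HOURS_PER_YEAR
print(f"~{errors_per_hour / 1e6:.0f} million false answers per hour")
```

At these inputs the estimate lands around 51 million errors per hour, comfortably in the "tens of millions" range the article cites; even if only a fraction of queries trigger an AI Overview, the volume stays in the millions.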
More accurate, but less verifiable
Alongside the improvement in accuracy, the research documented an opposite trend in the verifiability of answers. With Gemini 2, the cited sources in 37% of correct answers either failed to confirm the claim or were unrelated to it. With Gemini 3, that figure rose to 56%: more than half of even the correct answers cannot be verified through the links Google itself provides.
Examples from the research illustrate the mechanics of errors. When asked when Bob Marley's former home became a museum, AI Overviews confidently stated 1987 — although the correct year is 1986, and two of the three cited sources did not contain this date at all. The third source, Wikipedia, cited two contradictory figures, and the model chose the wrong one.
"AI responses may include mistakes"
— the standard Google disclaimer under each AI response, which, as the research showed, largely went unnoticed by users
Google's response: methodology in question
Google spokesperson Ned Adrians said the research had "serious gaps" and argued that SimpleQA itself contains incorrect questions and does not reflect actual user search patterns. The company noted that for internal evaluations it uses SimpleQA Verified, a smaller but more carefully curated set of questions.
However, Google's position does not refute the documented gap between the accuracy and verifiability metrics. The disclaimer "AI responses may include mistakes" existed before, but the scale at which that "may" occurs had never been publicly measured until this research.
Broader effect: who pays for the mistakes
Alongside the accuracy question, a separate economic problem is unfolding. Research by Pew Research Center showed that users who see an AI Overview are half as likely to click through to external sites. According to SimilarWeb, global human search traffic declined by roughly 15% in the year to June 2025, and some publishers report click-through rate drops of up to 89%.
- When AI Overviews are present in results, CTR for the top organic link drops to 8% versus 15% without the AI block
- Users follow links within AI Overview in only 1% of cases
- Publishers expect search traffic to decline by an average of 43% over three years
In other words, AI Overviews simultaneously generate errors and cut off traffic to sources that could correct those errors.
If Google does not disclose its own data on the actual share of search queries that receive AI Overview, and does not provide an independently verified methodology for assessing accuracy — any discussion of an "acceptable error rate" will remain a conversation with unknown variables. The question is not whether 91% is good enough. The question is whether Google is willing to show how many millions of false answers per hour it considers an acceptable price for convenience.