When OpenAI announced its latest AI model, GPT-5, the bold promise was clear: a “PhD-level” artificial intelligence that would blow past limitations and redefine what machines can do. But the rollout hasn’t gone exactly to plan. Just days after its debut, GPT-5 is under fire — not for minor bugs or quirks, but for failing basic tasks that wouldn’t stump a child. Turns out, having a PhD on paper doesn’t mean much if you can’t tell where Paris is.
Image generation gone embarrassingly wrong
The biggest red flag? Visual hallucinations — a term researchers use when AI confidently generates things that are simply wrong. One widely shared example involved GPT-5 being asked to produce a map of North America. The result? A completely inaccurate blob that had geography teachers shaking their heads.
In another case, the model was tasked with creating portraits of U.S. presidents. Instead of Washington, Lincoln or Obama, it churned out a parade of fictional faces, mismatched timelines, and invented names. It was the kind of output you'd expect from an undercooked student project — not a billion-dollar AI hailed as revolutionary.
A kindergarten-level challenge GPT-5 couldn’t handle
One of the more brutal takedowns came from AI expert Gary Marcus, who decided to lower the bar entirely. Rather than complex prompts, he gave GPT-5 a challenge fit for a five-year-old. It failed. Posting the results on social media, Marcus wrote bluntly: “GPT-5 failed a kindergarten-level task. Speechless.”
In a blog post that followed, he criticised not just the AI's performance but the hype machine behind it. “It’s baffling,” he wrote, “that a company would risk its reputation on something so sloppy.”
Real-world testing reveals more gaps
Tech publications also joined in on the testing. One outlet asked GPT-5 to generate a map of France divided into its regions and label the 12 most populous cities. Instead, the model offered a strangely sliced version of the country with just six cities — and no Paris. Orléans made the list, despite ranking around 35th in population. A geography student would’ve been marked down.
For GPT-5, the missing Paris quickly became a meme. Interestingly, when the same prompts were given in text-only form, GPT-5 performed reasonably well. It was the image generation and visual reasoning tasks where things consistently fell apart.
Experts warn of inflated expectations
So what’s going wrong? Most researchers agree that GPT-5, while an upgrade in some areas, still suffers from AI hallucinations, particularly when tasked with creating visuals or interpreting spatial data. That’s not a minor flaw. In domains like education, healthcare or national security, bad information dressed up in confident tones can cause real harm.
And while some errors might seem funny (like moving French regions around like puzzle pieces), the broader concern is that consumers and businesses are being sold an illusion of reliability.
In short, GPT-5 may write a convincing essay, but give it a map or a basic list and it’s likely to fumble. The gap between marketing and reality is glaring — and the AI community isn’t holding back. For a tool promised to rival PhD holders, many are wondering if it’s time to go back to school. Or at least, start with preschool.