5 Sep 2025

GPT-5 flunks kindergarten test despite PhD-level promise

Just days after its debut, GPT-5 has been mocked for failing simple tasks, such as correctly mapping North America or identifying US presidents.

Critics quickly derided OpenAI’s newly released GPT-5 for failing tasks that a five-year-old could ace, raising questions about the disparity between hype and performance.

Despite being promoted as ‘PhD-level’, the model produced a distorted, blob-like map of North America and generated mismatched portraits of US presidents labelled with fictional names.

AI researcher Gary Marcus lowered the bar further by giving GPT-5 a kindergarten-level challenge, which the model clearly failed. He posted: ‘GPT-5 failed a kindergarten-level task. Speechless.’ He criticised the rushed rollout and the hype that may have obscured the model’s weaknesses in visual reasoning.

Further tests exposed inconsistencies: when asked to map France and label its 12 most populous cities, GPT-5 returned inaccurate or incomplete results, omitting Paris entirely and naming Orléans despite its lower ranking.

Oddly, when the same queries were posed in text-only form, the model performed better, highlighting the weakness in its image generation and visual logic.
