His name is Liam Price, he is 23 years old and he has no advanced mathematical training. Even so, a few days ago he opened the Erdős problems website, picked one at random and pasted it into ChatGPT. He didn’t know the history of the problem or who had tried it before. What he got back looked like a good solution, and after consulting a friend who was studying mathematics, the two realized they might be on to something special. A few hours later Terence Tao, one of the most renowned mathematicians in the world, confirmed that Erdős problem #1196, a conjecture about primitive sets of integers that had remained unsolved since 1966, had a solution. GPT-5.4 Pro had found it in just 80 minutes.
Not that way. The problem concerns the behavior of a particular sum over primitive sets, that is, sets of integers in which no element divides another, as those numbers become very large. Jared Lichtman, a Stanford mathematician, had spent years on the problem and had made partial progress, but he and those who had tried before him had all set out from the same starting point, the one that seemed like the right path.
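To make the definition concrete, here is a minimal Python sketch of a primitivity check. It assumes the input contains only positive integers, and the name `is_primitive` is purely illustrative; it only tests the definition given above and says nothing about the sum studied in problem #1196.

```python
from itertools import combinations

def is_primitive(numbers):
    """Return True if no element of `numbers` divides another distinct element.

    Assumes every element is a positive integer.
    """
    distinct = sorted(set(numbers))
    # After sorting, a < b in every pair, so only "a divides b" needs checking.
    return all(b % a != 0 for a, b in combinations(distinct, 2))
```

For example, the primes form a primitive set, as does {6, 10, 15}, while {2, 3, 6} does not because 2 divides 6.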
A novel idea. GPT-5.4 used a different starting point. It stayed on arithmetic ground and used a special function called the von Mangoldt function, a classic tool of number theory known for its connections to prime numbers and the Riemann zeta function. No one had thought of applying that approach to the problem, and as Lichtman explained when talking about the OpenAI model’s solution, “The LLM took a completely different route.”
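For readers who have not met it, the von Mangoldt function Λ(n) equals log p when n is a power of a single prime p, and 0 otherwise. The Python sketch below implements that textbook definition by trial division and checks the classical identity that the values of Λ over the divisors of n add up to log n. The function names are illustrative, and this only shows what the function is, not how the proof uses it.

```python
import math

def von_mangoldt(n):
    """Λ(n) = log p if n = p**k for a prime p and k >= 1, else 0."""
    if n < 2:
        return 0.0
    # Smallest prime factor of n by trial division (n itself if n is prime).
    p = next((d for d in range(2, math.isqrt(n) + 1) if n % d == 0), n)
    # Strip every factor of p; n is a prime power only if nothing else remains.
    while n % p == 0:
        n //= p
    return math.log(p) if n == 1 else 0.0

def divisor_sum(n):
    """Sum of Λ(d) over all divisors d of n; classically this equals log n."""
    return sum(von_mangoldt(d) for d in range(1, n + 1) if n % d == 0)
```

With n = 12 the divisors 2, 3 and 4 contribute log 2, log 3 and log 2, and the rest contribute nothing, which adds up to log 12.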
The achievement is real, but with nuances. Lichtman praised the solution proposed by GPT-5.4, but one detail has been left out of much of the commentary on this event: the raw output of ChatGPT was, in the mathematician’s words, “pretty poor.” Several experts had to interpret it, flesh it out and extract the underlying idea that allowed the conjecture to be solved. Price didn’t know he had the solution until his friend read it, and he wasn’t sure until Tao confirmed it. The official repository of AI contributions to Erdős problems, maintained by Tao himself on GitHub, classifies the result as a solution produced through human-AI collaboration, not as one developed solely by AI. The distinction is important.
A previous scandal. A few weeks ago Sebastien Bubeck, a researcher at OpenAI, posted on X that GPT-5 had “solved” several Erdős problems. The post passed 100,000 views, but both the mathematical community and the AI industry criticized the claim. Demis Hassabis, CEO of DeepMind, called it “embarrassing.” What had actually happened was that the model had found existing solutions to already-solved problems on the web. Bubeck ended up deleting the original tweet and tried to walk the claim back, but the episode raised doubts about the validity of applying AI to mathematical problems.
AI and the mathematical success rate. Terence Tao and Nat Sothanaphan maintain the aforementioned record of all AI contributions to Erdős problems on GitHub. Each entry in that table is classified with a traffic light: green for a complete solution, yellow for partial progress, and red for failure. In the category of completely AI-generated solutions with no known prior literature there are three green, fourteen yellow, and eight red traffic lights. However, the repository itself adds a notable caveat: those who try to use AI to solve these problems and fail do not usually report it, so it is likely that AI has been applied “silently” to a large number of these problems without success, and those attempts do not appear in any table. There is a clear bias here, because only successes generate headlines.
Trying to measure what matters. In February 2026, eleven mathematicians launched the “First Proof” initiative. It comprises ten mathematical problems that arose naturally in their own research projects. For each one they uploaded an encrypted answer to a verification site, and they gave AI systems a week to try to solve problems that had never appeared in any training data set. Preliminary results indicate that today’s AI models cannot clear that bar autonomously, which suggests there are still limits to what AI can really contribute to mathematics.
But then, is there a revolution or not? Terence Tao offered a clear explanation of why GPT-5.4 had succeeded where others had failed for 60 years. He described a collective block in the mathematical community: everyone started from the same point because it was “the natural one”, the one marked by tradition. The AI didn’t know that was the “correct” way to start, and that ignorance turned out to be an advantage. It’s not that the AI was smarter; it’s that it had no preconceptions about how to approach the problem. It remains to be seen whether this unorthodox way of attacking problems works more generally. That will show whether what happened with Erdős problem #1196 was an isolated case or whether a 23-year-old has managed to change how we think about tackling mathematical problems.
Image | Universal Pictures
