- OpenAI's o3 beat Elon Musk's Grok 4 at chess
- Magnus Carlsen delivered deadpan commentary on the quality of Grok's logic
- Grok 4 made repeated blunders, while o3 played steadily
The AI chess tournament between OpenAI's o3 model and xAI's Grok 4 invited plenty of speculation as a kind of proxy battle between the two companies and their respective CEOs. Any comparison with the days of Deep Blue and Bobby Fischer quickly faded, however, as OpenAI's o3 swept Grok 4, winning four consecutive games, accompanied by derisive commentary from former world chess champion Magnus Carlsen and grandmaster David Howell.
The showdown took place in Kaggle's Game Arena, a digital venue where AI models compete at chess and other games. The tournament featured eight of the most prominent LLMs around: OpenAI's o3 and o4-mini, Google's Gemini 2.5 Pro and Flash, Anthropic's Claude Opus, DeepSeek, Moonshot's Kimi, and xAI's Grok 4.
Carlsen and Howell alternated between serious commentary and a roast as Grok's performance proved erratic. In the first game, it quickly sacrificed its bishop, then started trading pieces as if it were in a hurry to go home. Things did not improve for Grok in the next game.
“[Grok] is like that guy in a club tournament who has learned theory and literally knows nothing else,” said Carlsen during the second game. “Makes the worst blunders after that.”
Grok’s performance was so far off the rails that Carlsen rated it at around 800 Elo, or slightly above a beginner. He gave o3 a modest but respectable 1200, in the range of most casual players. Although o3 did not play brilliantly, it did not have to. It played solid chess. It did not blunder. It converted its advantages and made conventional chess moves.
“o3 is quite ruthless in conversions; it looks like a chess player. Grok seems to have learned some opening moves and knows the rules, but not much more,” said Carlsen. “Grok’s moves are chess-related moves. They just came at the wrong time and in strange sequences.”
AI chess
Chess itself was not the main point of the tournament, despite its prominence. The real question was how general-purpose AI models handle settings with strict rules, such as chess games. It turns out that they are not great at it, but o3 is the best of the limited sample. As AI becomes integrated into everything, the ability to follow specific rules and patterns becomes essential. Chess is a uniquely transparent way to observe this. Either you made the right move, or you did not. When a model plays well, you can see the logic; otherwise, the queens fall like dominoes, and the game becomes as confused as this metaphor.
Chess is a window into how an AI can plan, weigh options, avoid catastrophic errors, and remain logically consistent. If Grok throws away a queen because it does not grasp the long-term consequences, what might it do in a legal document or while booking travel?
The fact that the final was between OpenAI and xAI added drama, with Sam Altman and Elon Musk publicly at loggerheads. The chess final did not settle the battle between them, but it gave OpenAI a public relations victory and a limited but very real compliment from Magnus Carlsen.