Anthropic has just released a new model called Claude 3.7 Sonnet, and while I'm always interested in AI's latest capabilities, it's the new "extended thinking" mode that really caught my eye. It reminded me of how OpenAI first debuted its o1 model for ChatGPT. OpenAI offered a way to access o1 without leaving a window that was using the GPT-4o model: you could type "/reason" and the chatbot would use o1 instead. It's redundant now, although it still works in the app. Either way, the deeper, more structured reasoning promised by both made me want to see how they'd stack up against each other.
Claude 3.7's extended thinking mode is designed as a hybrid reasoning tool, letting users switch between quick, conversational answers and step-by-step problem solving. The model takes time to analyze your prompt before delivering its response, which makes it well suited to math, coding, and logic. You can even fine-tune the balance between speed and depth by giving it a time limit to think through its answer. Anthropic positions this as a way to make AI more useful for real-world applications that require methodical, layered problem solving, as opposed to surface-level answers.
Access to Claude 3.7's extended thinking requires a Claude Pro subscription, so I decided to use the demonstration in the video below as my test instead. To put extended thinking through its paces, Anthropic asked the AI to analyze and explain the classic probability puzzle known as the Monty Hall problem. It's a deceptively tricky question that trips up many people, even those who consider themselves good at math.
The setup is simple: you're on a game show and asked to choose one of three doors. Behind one is a car; behind the others, goats. On a whim, Anthropic decided to go with crabs instead of goats, but the principle is the same. After you make your choice, the host, who knows what's behind each door, opens one of the other two to reveal a goat (or crab). You now have a choice: stick with your original pick or switch to the remaining unopened door. Most people assume it doesn't matter, but in fact switching gives you a 2/3 chance of winning, while staying with your first choice leaves you with only a 1/3 probability.
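If that 2/3 figure feels suspicious, a quick brute-force check makes it concrete. Here's a minimal Python sketch (my own, not something either chatbot produced) that enumerates every arrangement of car and first pick and counts how often each strategy wins:

```python
from itertools import product

# Every combination of where the car is and which door you pick first.
doors = [0, 1, 2]
stay_wins = switch_wins = 0

for car, pick in product(doors, doors):
    # The host opens a door that is neither your pick nor the car.
    opened = next(d for d in doors if d != pick and d != car)
    # Switching means taking the one remaining closed door.
    switched = next(d for d in doors if d != pick and d != opened)

    stay_wins += (pick == car)
    switch_wins += (switched == car)

total = len(doors) ** 2  # 9 equally likely cases
print(f"Stay wins:   {stay_wins}/{total}")    # 3/9 = 1/3
print(f"Switch wins: {switch_wins}/{total}")  # 6/9 = 2/3
```

Staying only wins when your first guess was already right, which happens in a third of the cases; switching wins in all the rest.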
A Crabby Choice
With extended thinking enabled, Claude 3.7 took a measured, almost academic approach to explaining the problem. Instead of simply stating the correct answer, it carefully laid out the underlying logic in several stages, emphasizing why the probabilities change after the host reveals a crab. It also wasn't content to explain things in dry mathematical terms. Claude walked through hypothetical scenarios, demonstrating how the probabilities play out over repeated trials, which makes it much easier to understand why switching is always the best move. The answer wasn't rushed; it was like a teacher working slowly and deliberately, making sure I really understood why the common intuition is wrong.
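Claude's repeated-trials framing is easy to reproduce yourself. Here's a short Monte Carlo sketch (again my own Python, not code from either model) that plays the game many times and tallies how each strategy fares:

```python
import random

def play(switch: bool) -> bool:
    """Play one round of Monty Hall; return True if the player wins the car."""
    doors = [0, 1, 2]
    car = random.choice(doors)
    pick = random.choice(doors)

    # The host opens a door that hides a crab and isn't the player's pick.
    opened = random.choice([d for d in doors if d != pick and d != car])

    if switch:
        # Switch to the one remaining closed door.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car

trials = 100_000
stay = sum(play(switch=False) for _ in range(trials)) / trials
swap = sum(play(switch=True) for _ in range(trials)) / trials
print(f"Stay:   {stay:.3f}")   # ~0.333
print(f"Switch: {swap:.3f}")   # ~0.667
```

Run enough rounds and the switching strategy settles right around that 2/3 win rate, which is exactly the pattern Claude walked through in its explanation.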
ChatGPT o1 offered much the same kind of breakdown when asked to explain the problem. In fact, it explained it in several forms and styles. Beyond the basic probability, it also ran through game theory, narrative framing, psychological perspective, and even an economics angle. If anything, it was a bit overwhelming.
Gameplay
That's not all Claude's extended thinking can do, however. As you can see in the video, Claude could even turn the Monty Hall problem into a game you could play right in the chat window. Trying the same prompt with ChatGPT o1 didn't produce the same result. Instead, ChatGPT wrote an HTML script for a simulation of the problem that I could save and open in my browser. It worked, as you can see below, but it took a few extra steps.

Although there are almost certainly small quality differences depending on the kind of code or math you're working on, both Claude's extended thinking and ChatGPT's o1 model offer solid analytical approaches to logical problems. I can see the advantage of the adjustable thinking time and depth that Claude offers. That said, unless you're really in a hurry or demanding unusually heavy analysis, ChatGPT doesn't take too long and produces plenty of content from its reasoning.
The ability to turn the problem into a playable simulation right in the chat is much more notable. That makes Claude feel more flexible and powerful, even if the actual simulation probably uses code very similar to the HTML ChatGPT wrote.




