- ChatGPT now passes the “strawberry” test but fails when it moves on to “cranberry”
- AI still struggles to count single letters despite broader improvements
- Reasoning tests like the “car wash” problem still reveal gaps in AI logic
There are a number of viral posts from people amazed that chatbots like ChatGPT and Claude can solve complex equations but struggle with something as simple as counting the number of “r”s in the word “strawberry.” Well, those days might finally be over.
With the word “Finally,” the official ChatGPTapp account announced that ChatGPT can now correctly count the number of “r”s in “strawberry.”
However, users quickly discovered that it was still possible to trip the model up by replacing “strawberry” with “cranberry.”
“Not so fast,” said one user.
To corroborate the result, I quickly tried the same thing with my own ChatGPT running GPT-5.5. It passed the “strawberry” test perfectly, saying there were three “r”s, but then claimed there were only two in “cranberry” – a different count from the user’s, but still wrong. To its credit, ChatGPT admitted its error when I questioned it, attributing it to a simple “counting error.”
Why the strawberry problem exists
There are some very simple questions that chatbots are notoriously incapable of answering, one of which is “How many ‘r’s are there in the word ‘strawberry’?”
This is a simple counting task for humans, but surprisingly difficult for AI systems. The reason lies in the way they process language. Large language models (LLMs) are built on transformers, which don’t read text letter by letter: a tokenizer first splits a word like “strawberry” into a few subword chunks, or tokens, which are then converted into numerical representations. Those representations capture meaning and context, but they do not inherently preserve a clear picture of the individual letters that make up the word.
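You can see that mismatch for yourself with OpenAI’s open-source tiktoken tokenizer. Here’s a minimal Python sketch (the exact token split shown is illustrative and varies by model and encoding):

```python
# pip install tiktoken -- OpenAI's open-source tokenizer library
import tiktoken

# The encoding used by GPT-4-era models; newer models use different ones.
enc = tiktoken.get_encoding("cl100k_base")

word = "strawberry"
token_ids = enc.encode(word)

# This is roughly how the model "sees" the word: a few subword chunks,
# not a sequence of individual letters.
pieces = [enc.decode([t]) for t in token_ids]
print(pieces)  # e.g. ['str', 'aw', 'berry'] -- chunks, not characters

# Counting letters is trivial at the character level...
print(word.count("r"))  # 3

# ...but the model never receives characters, only token IDs, so the
# count has to be inferred rather than simply read off.
print(token_ids)
```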
The fact that ChatGPT still stumbles on “cranberry” suggests that the fix may have been hard-coded for specific cases, rather than reflecting a broader improvement in how the LLM handles this kind of question.
The car wash problem
The second boast in ChatGPTapp’s post is that ChatGPT can now solve the car wash problem. This one exploits a contextual gap in the way LLMs reason: it asks whether it would be quicker to walk to a car wash or drive if it were “only 50 meters away.” Most models will tell you it’s faster to walk, ignoring the obvious problem that you need your car with you to wash it.
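If you want to reproduce the test yourself, here’s a minimal sketch using OpenAI’s official Python client (the model name below is just a placeholder – swap in whichever model you want to probe):

```python
# pip install openai -- assumes an API key in the OPENAI_API_KEY env var.
from openai import OpenAI

client = OpenAI()

# The classic phrasing of the car wash problem.
prompt = ("The car wash is only 50 meters from my house. "
          "Is it quicker to walk there or to drive?")

# "gpt-4o" is a placeholder; substitute the model under test.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": prompt}],
)

# A model that "gets it" should note you need the car there to wash it.
print(response.choices[0].message.content)
```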
ChatGPTapp claims that ChatGPT will now detect this error and point it out. But when I tried it with the latest GPT-5.5 model, it still recommended walking, just like Claude with Sonnet 4.6. When I tested it with Gemini, however, it pointed out that while walking would be quicker, you would have to bring the car with you if the goal was to wash it.
Grok did even better. Not only did it point out the problem of not bringing the car, but it added that “this question has become a popular test of whether someone (or an AI) grasps the real goal rather than giving generic ‘walking is healthier/shorter/greener’ advice that ignores context.”
So, for now at least, it’s a win for Gemini and Grok. But if fixing “strawberry” doesn’t fix “cranberry,” it raises a bigger question: are these models actually getting smarter, or are they just getting better at passing the tests we keep throwing at them?