I have tried Grok 3 and it is really intelligent and fast. The problem is that this is no longer enough

I’ve spent a few hours trying Grok 3, the new version of xAI’s AI. I wanted to see its real abilities and, above all, how it behaves and what kind of results it gives compared with ChatGPT, Claude, Le Chat and DeepSeek.

Reasoning and problem solving

  • It stands out in mathematical problems. I had it take the AIME ’24 challenge, in which it solved 6 of the 15 problems, against 9 for OpenAI’s o3-mini-high. In addition, Grok 3 took a little under five minutes, while o3-mini-high took almost six. It is very striking to watch it check its own work until it finds the correct answer (although sometimes the answer was not correct).

A fragment of the steps that Grok 3 took to evaluate its own conclusions before presenting them as a final result. Image: Xataka with Grok 3.

  • In basic reasoning tests, such as determining the number of repeated letters in somewhat complex words (the classic “lollapalooza”) or comparing decimals (9.11 vs 9.9), Grok 3 responds correctly after a few seconds of visible “thought”.

o3-mini-high got it right after 6 seconds. Image: Xataka with ChatGPT.

Grok 3 also got it right, but took more than four times as long. Image: Xataka with Grok 3.

  • In a Greek mythology question about Jason’s maternal great-grandfather, Grok 3 found the correct answer in 18 seconds … while o3-mini-high needed 22 seconds to get it wrong. Well played, Grok.

o3-mini-high got it wrong. Image: Xataka with ChatGPT.

Grok 3, by contrast, gave a better-constructed answer that was also correct. And it took less time. Image: Xataka with Grok 3.
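
The two basic tests mentioned above (repeated letters in “lollapalooza” and 9.11 vs 9.9) have deterministic answers, so a model’s reply can be checked with a few lines of code. What follows is only a minimal sketch in Python; the helper names `repeated_letters` and `larger_decimal` are made up for illustration and are not part of any tool used in this article.

```python
from collections import Counter
from decimal import Decimal


def repeated_letters(word):
    """Return the letters that appear more than once, with their counts."""
    counts = Counter(word.lower())
    return {letter: n for letter, n in counts.items() if n > 1}


def larger_decimal(a, b):
    """Return whichever of two decimal strings is numerically larger.

    Decimal avoids the trap of comparing "9.11" and "9.9" as text or as
    version numbers, which is where chatbots have tended to slip.
    """
    return a if Decimal(a) > Decimal(b) else b


print(repeated_letters("lollapalooza"))  # {'l': 4, 'o': 3, 'a': 3}
print(larger_decimal("9.11", "9.9"))     # 9.9
```

The point is not that the models should run code for this, only that these are questions where a wrong answer is unambiguous.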

Search and synthesis

  • Its DeepSearch function is fast, but sometimes it is not entirely accurate and omits important details. I asked it to analyze the impact of AI on chip design and, although it generated a 1,504-word text with several citations in just over a minute, it failed to mention important advances such as Google’s AlphaChip framework. In later, more insistent attempts it did.
  • I also asked for a full report on Xataka covering its financial, media and reputational standing, among other things. It was fairly accurate, although it showed an inherent limitation of any Deep Research system: it knows a lot about what is public, but it does not have many insights. It lacks the judgment of an expert who knows not only what is public but also what lies beneath. This applies to Grok and to any other tool with Deep Research. When you ask for information about something you do not know well, it is easy to assume that Deep Research (or in this case, DeepSearch) gives you everything. When you are in the know, the shortcomings are easy to spot. As in this example.

Image: Xataka with Grok 3.

  • The speed is impressive: it is noticeably faster than OpenAI’s Deep Research … but at the cost of depth. That said, its selection of sources and citations is usually really good.
  • Unlike Gemini, it does not allow exporting reports directly to documents or customizing the research approach. Again: Grok is very intelligent and capable in its own way, but it lacks a product. An LLM is of little use if it forces you to start from scratch and process all the information yourself.

Creativity and tone

  • To test its creative writing I asked it for a story about a time traveler facing a paradox. The result was quite solid in character building, detail, description and atmosphere, surpassing even the model I consider the best in that respect, Claude 3.5 Sonnet. That said, some plot twists feel rather forced.

Image: Xataka with Grok 3.

  • Its humor is basic and predictable, limited almost all the time to fairly obvious wordplay. Teenage humor. If the concept of the uncanny valley can be applied to a chatbot, Grok 3 sits at that 99%: too polished to seem like a clumsy robot, too predictable to be fully convincing.
  • It maintains political neutrality even on issues such as immigration or trans rights. Musk says it can be politically incorrect, but that seems to have more to do with what the user asks for than with a trait of its personality. That is to say: it can be pushed out of political correctness, but only when the user pushes it there.

Some limitations

  • It does not allow customizing the model’s behavior, unlike ChatGPT, or the response style, as Claude does.
  • It is limited to being a text box. It is accompanied only by the buttons to attach a file, activate DeepSearch or activate its reasoning mode. That, and a few basic instructions. No projects like Claude’s, GPTs like ChatGPT’s, or agents like Le Chat’s. In short: nothing to retain pre-established contexts and guidelines, or documentation to make work easier. You always have to start from a blank canvas.

The interface is good, intuitive, simple … but it lacks tools that would make it more versatile and appealing to integrate into our day-to-day. It is powerful and capable for specific uses, but the product built around ChatGPT, Claude or Le Chat (projects, agents, custom instructions, etc.) makes those alternatives much more interesting for serious, recurring use. Image: Xataka with Grok 3.

  • The safety guardrails are stricter than those of Grok 2. With that version we were astonished by its lack of scruples, but Grok 3 seems to have recovered them: it refused to generate a template for a mass email fraud campaign in which I posed as a Valencian prince in search of an heiress.
  • Image generation does seem, again, more lax. Midjourney does not allow creating anything containing the words “Donald Trump” or “President of the United States”. Nothing. Grok 3 does not raise as many objections. Not even with its owner.

Image: Xataka with Grok 3.

You can try Grok 3 from its official website or from its integration into X (which is why you have seen two somewhat different interfaces in this article). It is temporarily free, but we already know that it will be one of the reasons to pay for an X subscription, and not the cheap one.

Its ability is undeniable, but there are so many similar alternatives on offer that being a bit more intelligent or faster is not a differentiator. The difference is the product, and that is where Grok 3 has the most room for improvement.

Featured image | Xataka with Mockuuuups Studio

In Xataka | Deep Research is not just a new AI function. It is the beginning of the end of intellectual work as we know it
