GPT-4.5 is barely better than its rivals at anything. It is proof that traditional AI models have almost stopped advancing

Sam Altman had already warned that OpenAI planned to launch GPT-4.5 very soon. We had been waiting for GPT-4's successor for months, but expectations had been falling over time: there was growing talk that scaling — throwing more data and more GPUs at model training — no longer worked so well. GPT-4.5 was supposed to be proof that this wasn't true. Guess what? It probably is true, because GPT-4.5 is a model with plenty of teething problems.

GPT-4.5 is here. Yesterday OpenAI finally presented GPT-4.5, the theoretical successor to GPT-4. Sam Altman explained that this was "the first model that makes me feel like I am talking to a thoughtful person."

Gigantic and expensive. But Altman also acknowledged something else: "Bad news: it is a giant, expensive model." The head of OpenAI claimed not to have enough GPUs for a mass launch, and the availability of GPT-4.5 is very limited: for now only ChatGPT Pro users can access it.

Expensive? No: very expensive. Using GPT-4.5 through the OpenAI API is extraordinarily costly: $75 per million input tokens and $150 per million output tokens. GPT-4o costs $2.50 and $10 respectively (30 and 15 times less), and o1, until now the most expensive, costs $15 and $60.

And it is not a "frontier" model either. OpenAI's technical report indicates that this is not a "frontier" model, as GPT-4 was in its day. That matters because, despite this being its largest LLM, frontier models are more capable, larger in scale, and pose greater risks of generating misinformation or being pushed outside their guardrails. With GPT-4.5 they seem to have focused heavily on avoiding errors (that is one of its advantages: it appears to slip up less on some benchmarks).

It does not seem better at much of anything.
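To put those per-million-token prices in perspective, here is a minimal sketch that computes what a single request would cost at each published rate. The prices are the launch figures quoted above; the token counts in the example are hypothetical.

```python
# Launch prices in USD per million tokens: (input, output).
PRICES = {
    "gpt-4.5": (75.00, 150.00),
    "gpt-4o": (2.50, 10.00),
    "o1": (15.00, 60.00),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one API request with the given token counts."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# Example: a 10,000-token prompt that produces a 1,000-token answer.
for model in PRICES:
    print(f"{model}: ${request_cost(model, 10_000, 1_000):.4f}")
```

On those assumed token counts, that one request costs $0.90 on GPT-4.5 versus $0.035 on GPT-4o, which is the roughly 25x gap the article is complaining about.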
The tests and benchmarks it has been put through seem to make it clear that the leap in performance is especially disappointing, particularly compared with its rivals' newest models. It is worse in factual accuracy than Perplexity Deep Research, worse than Claude 3.7 Sonnet at programming according to TechCrunch and several experts, and it is also worse at reasoning (although, admittedly, it is not oriented toward it) than DeepSeek R1, o3-mini or Claude 3.7 Sonnet (which is a "hybrid" model).

A bittersweet feeling. Experts like Simon Willison and Andrej Karpathy have shared their first impressions, and in both cases the sensation is that GPT-4.5 is slow, has a knowledge cutoff of October 2023, and does not represent a truly remarkable advance. Willison went on to analyze the debate dozens of users were having about GPT-4.5, and in an AI-generated summary the conclusions were also clear: the version number itself was inappropriate, the model is too expensive, the price/performance ratio is highly debatable, and the performance was not what was expected after so much time. Karpathy's conclusion is that it is "a little bit better and that's awesome, but not exactly in ways that are trivial to point to."

More human? Altman's remark about how GPT-4.5's conversational ability had surprised him may point to the area where this model stands out. Karpathy also highlighted that aspect, saying the improvement might show up in "creativity, analogy-making, general understanding and humor," which perhaps makes conversations with GPT-4.5 feel even closer to those we would have with a human being.

Scaling no longer works; the slowdown is here. GPT-4.5 is a clear example of how we have reached the limits of scaling.
Having a gigantic LLM no longer seems to provide advantages over its predecessors, and dedicating more data and more GPUs to training these models does not seem to make much sense. Altman himself made it clear that GPT-4.5 will be the company's last non-reasoning model. That is another sign that the slowdown of generative AI, at least as far as traditional models are concerned, is a reality.

Why launch it, then? On its blog, OpenAI states: "we are sharing GPT-4.5 as a research preview to better understand its strengths and limitations. We are still exploring what it is capable of and we are eager to see how people use it in ways we might not have expected." That seems to reveal the doubts its own creators have about the model, and raises the question of why they released it.

They need to keep generating hype. Especially considering how strong the rivals have been lately. Claude 3.7, Grok 3 and of course DeepSeek R1 have managed to turn the tables and pose a challenge for OpenAI, which until recently seemed a step ahead of its rivals. Now that is no longer clear, and in many areas its competitors already outperform its models. OpenAI needs to puff out its chest and say "here I am," but perhaps with GPT-4.5 that move backfires, because at least a priori the results are disappointing.

And investors are applying pressure. Some point to another plausible explanation for this launch: OpenAI may have been forced to release GPT-4.5 to reassure investors who have poured billions of dollars into the company and need to feel calm about their investment. Once again OpenAI has a problem, because GPT-4.5 does not seem likely to reassure them. It will be difficult to convince new investors with this launch.

In Xataka | OpenAI has a golden opportunity to sweep away all its rivals: launch an unlimited ChatGPT, funded by advertising

All its rivals offer free models that "reason," and Gemini 2.0 is the latest example

All the AI companies and startups in the United States were calmly going about their business. And suddenly DeepSeek R1 arrived and became a genuine existential threat to Silicon Valley. The Chinese startup offered a reasoning model as good as its competitors', but it also offered it for free (and open source!). What has Silicon Valley done? Followed suit, of course.

Gemini 2.0 reasons for free, for everyone. Just visit the official Gemini website and open the "Gemini" menu at the top left to check it. You can already use 2.0 Flash Thinking Experimental (its reasoning model) both in normal mode and in "collaborative" mode with services such as YouTube or Maps. And it is totally free.

Microsoft Copilot and Think Deeper. Microsoft Copilot's "Think Deeper" mode is also available for free in the company's service. As we explained, Think Deeper is actually OpenAI's o1, but previously you had to pay for a Copilot Pro subscription ($20 per month) to get access to that option. The arrival of DeepSeek R1 prompted Microsoft to offer it for free as well (although with a more limited number of queries).

OpenAI o3-mini. The company led by Sam Altman did not want to be left behind, and less than a week ago it presented o3-mini, a reasoning model that, besides being especially powerful, is available in the free version of ChatGPT. We can activate the "Reason" button so that when we ask something, o3-mini's reasoning capabilities kick in.

DeepSeek R1 and Perplexity. Perplexity's search engine is gradually offering new options. In fact, a few days ago those responsible announced that on the Perplexity website we could activate the Reasoning-R1 model, based on DeepSeek R1 but hosted in the US (to avoid suspicions of possible data theft). They even offer the option of choosing the Reasoning-o3-mini model, the same one offered in ChatGPT.
Again for free (although limited), but it stands out as a convenient way to try DeepSeek R1 in its most powerful version.

And the rest? This first batch of reasoning models seems to have caught the rest of the big contenders in the AI segment off guard. Anthropic, still a reference with Claude, has not launched a reasoning model for now. Neither has Apple, which goes at its own pace. Meta has not launched anything in this area either, despite Llama being a clear reference among open-source AI models. And Elon Musk seems very busy, because xAI is still working on Grok, and for the moment there is no news about a potential reasoning variant. The only remarkable alternative for now is Doubao-1.5-Pro, the brand-new reasoning model from ByteDance, although it is not as easy to access as its competitors.

Competition benefits users. The impact of DeepSeek R1 on the AI segment has been spectacular, as we can see. When OpenAI launched o1 in September 2024, it positioned it as a very advanced option, but also an expensive one: only subscribers to its services could access it, and in a limited way. Four months later we are using models that rival o1 but are totally free, with more and more options. This is great news for users, who at least for now are benefiting from all this rivalry between companies.

The AI that reasons keeps getting better and cheaper (or free). A graph created by Shawn Wang (@swyx) and published in his newsletter, Latent Space, shows a clear evolution of AI models. The graph plots their capability (measured in LMSYS points, a well-known ranking of AI models) against their cost per million tokens (at a 3:1 input:output ratio). The better positioned a model is on that chart — more capability at lower cost — the better, and Gemini 2.0 Flash Thinking seems especially well positioned, although this kind of graph changes very quickly. Again, more good news for us users.
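The "3:1 input:output ratio" in that chart is just a weighted average of the two per-million-token prices. A minimal sketch of the calculation, using as sample inputs the launch prices quoted in the previous article (the function and its name are illustrative, not taken from the chart's source):

```python
def blended_cost(input_price: float, output_price: float,
                 ratio: tuple[int, int] = (3, 1)) -> float:
    """Blended USD cost per million tokens, weighting input:output at `ratio`."""
    in_w, out_w = ratio
    return (input_price * in_w + output_price * out_w) / (in_w + out_w)

# Launch prices per million tokens (USD input, USD output):
print(blended_cost(75.00, 150.00))  # GPT-4.5
print(blended_cost(2.50, 10.00))    # GPT-4o
```

At the 3:1 weighting, GPT-4.5 comes out at $93.75 per blended million tokens versus $4.375 for GPT-4o, which is the kind of single-number cost axis the chart uses.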
In Xataka | Mistral AI is the French startup that bet on efficiency before DeepSeek. Its future is uncertain
