Some researchers claim to have created an AI model as good as those of OpenAI and DeepSeek for $50. And the data checks out

The training cost of the most advanced artificial intelligence (AI) models is in the spotlight, and understandably so. The arrival of Chinese company DeepSeek's model, which presumably had a moderate training cost, has called into question the strategy and investments deployed so far by OpenAI, Google and Microsoft, among other companies.

A brief recap before moving on: DeepSeek's team claims that the infrastructure used to train its model comprises 2,048 Nvidia H800 chips, and that training the model, which has 671 billion parameters, cost 5.6 million dollars. However, some analysts argue that these figures do not reflect reality. A report prepared by SemiAnalysis maintains that the infrastructure DeepSeek used to train its AI model actually comprises approximately 50,000 Nvidia GPUs with the Hopper microarchitecture. According to Dylan Patel, AJ Kourabi, Doug O'Laughlin and Reyk Knuhtsen, at least 10,000 of these chips are Nvidia H100 GPUs, and at least another 10,000 are H800 GPUs. The remaining chips, according to these analysts, are cut-down H20s.

The 'S1' model adds fuel to the fire

On January 31, a group of researchers from Stanford University and the University of Washington, both in the US, published a paper in arXiv, the open-access repository of scientific articles, in which they claim to have trained an AI model with reasoning capabilities comparable to those of OpenAI's o1 or DeepSeek's R1 models for an investment of just under 50 dollars. At first glance it seems impossible: with that money it should be utterly unfeasible to train an artificial intelligence model, let alone an advanced one capable of competing head-to-head with those of OpenAI or DeepSeek. And yet it is true. To understand how they pulled it off, we need to look at the strategy they devised.
On the one hand, those 50 dollars represent the cost of renting the cloud computing infrastructure used to carry out the training, which makes sense if the time invested is very short. Their reasoning model, which they have called S1, was built from Qwen2.5-32B, the free artificial intelligence model developed by Qwen, Alibaba's Chinese laboratory. And its reasoning process is inspired by Google's Gemini 2.0 Flash Thinking model. They did not start from scratch at all. An interesting note: the S1 model is available on GitHub together with the data and code these scientists used to train it.

On the other hand, the training process lasted less than 30 minutes and used only 16 Nvidia H100 chips from the cloud computing network these researchers rented. That is where the cost of just under 50 dollars comes from. However, there is another piece of data worth not overlooking: the S1 reasoning model was generated by distillation of the Gemini 2.0 Flash Thinking Experimental model. Distillation is, broadly speaking, a machine learning technique that transfers the knowledge base of a large, advanced model to a much smaller and more efficient one. This strategy saves a great deal of resources, although it cannot create models from scratch.

Beyond the much-touted 50 dollars of cost, what really matters is that, as we have just seen, it is possible to fine-tune very competitive models with a far more restrained investment than those made by the big technology companies so far.

Image | Luis Gomes

More information | arXiv | GitHub

In Xataka | Samsung is preparing to hit TSMC where it hurts most: manufacturing chips for AI
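The arithmetic behind the sub-$50 figure is easy to check. Here is a back-of-the-envelope sketch; the article only gives the GPU count, the duration, and the total, so the rental rate of about 6 dollars per H100 per hour is our assumption:

```python
# Back-of-the-envelope check of the reported S1 training cost.
# ASSUMPTION: ~$6.00 per H100 GPU-hour is a plausible cloud rental
# rate; the article only states 16 GPUs, <30 minutes, and <$50.
GPU_COUNT = 16
TRAINING_HOURS = 0.5             # "less than 30 minutes"
PRICE_PER_GPU_HOUR = 6.00        # assumed rental rate, USD

total_cost = GPU_COUNT * TRAINING_HOURS * PRICE_PER_GPU_HOUR
print(f"Estimated cost: ${total_cost:.2f}")  # Estimated cost: $48.00
```

At any plausible rental rate in that range, the total lands comfortably under 50 dollars, which is what makes the reported figure credible.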
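The distillation idea described above can be illustrated with classic logit-based distillation: a small "student" model is trained to match the softened output distribution of a large "teacher". This is a minimal, generic sketch of the concept, not the S1 researchers' exact pipeline (their actual recipe is in the paper and GitHub repository):

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw logits into a probability distribution.
    A temperature > 1 'softens' it, exposing the teacher's
    relative preferences among wrong answers too."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence between the softened teacher and student
    distributions: the quantity the student minimizes in training."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# Toy example: the student is close to the teacher but not identical,
# so the loss is small and positive.
teacher = [3.0, 1.0, 0.2]
student = [2.5, 1.2, 0.3]
loss = distillation_loss(teacher, student)
```

Minimizing this loss over many examples is what transfers the large model's "knowledge" into the smaller one, which is why the approach is cheap but cannot produce a model from scratch.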

"They are brilliant researchers under the control of an authoritarian government": Anthropic's CEO has spoken about DeepSeek

Amid the stir caused by DeepSeek's latest models, Anthropic's CEO, Dario Amodei, has published an analysis on his personal website in which he questions the narrative of the "Chinese miracle" in artificial intelligence.

Why it matters. The debate over China's capacity to develop advanced AI has dominated the agenda in recent days after DeepSeek's releases, which went as far as triggering a 17% drop in Nvidia shares.

The facts. DeepSeek claims to have developed its V3 model for just under 6 million dollars, while Amodei explained that Claude 3.5 Sonnet, Anthropic's latest and most advanced model, required "a few tens of millions" in training. Far from the "billions" that had been speculated. "DeepSeek produced a model close to the performance of US models from 7-10 months earlier, at a rather lower cost, but not in the proportions that have been suggested," said the CEO. DeepSeek operates with about 50,000 Hopper-generation chips, a capacity Amodei considers similar to that of the main American tech companies. According to his analysis, DeepSeek's advances reflect the sector's natural cost reduction, estimated at 75% per year.

The context. DeepSeek has presented two models: V3, which uses traditional training, and R1, which incorporates reinforcement learning. For Amodei, the real innovation is in V3, not in R1, which, according to him, follows paths already explored by other tech companies.

Turning point. Developing an AI superior to human intelligence will require millions of chips and tens of billions of dollars in the coming years. "Between 2026 and 2027 we will see models that will be smarter than almost all humans at almost all tasks," he said. In that scenario, he has defended export controls as a strategic tool. Amodei has also recognized the talent of DeepSeek's engineers... although he has warned about the implications of a company operating under the control of the Chinese government.
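Amodei's figure of a roughly 75% annual cost reduction can be made concrete: under that assumption, matching a model from the midpoint of his "7-10 months earlier" window should cost about a third as much. A quick sketch of that compounding (applying the annual rate fractionally is our simplification):

```python
def relative_cost(years_behind, annual_reduction=0.75):
    """Fraction of the original training cost needed to reproduce
    a model from `years_behind` years ago, assuming a steady
    annual cost reduction (Amodei's ~75% estimate)."""
    return (1 - annual_reduction) ** years_behind

# Reproducing a model from ~9 months ago (midpoint of "7-10 months"):
fraction = relative_cost(9 / 12)
print(f"~{fraction:.0%} of the original cost")  # ~35% of the original cost
```

That is the sense in which, for Amodei, DeepSeek's savings are "lower cost, but not in the proportions that have been suggested": a large part of the discount is simply the sector's normal cost curve.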
For him, the growing efficiency in AI development justifies reinforcing, not relaxing, trade restrictions. In fact, he had some words of praise for DeepSeek's team, though not for its nation: "They are brilliant and curious researchers who only want to create useful technology, but they are subject to an authoritarian government that has committed human rights violations and has behaved aggressively on the world stage."

In Xataka | "Google gives you links, Perplexity gives you answers": we talk to the CEO of the startup that wants to kill the father

Featured image | TechCrunch
