From algorithmic trader to AI revolutionary: this is the story of Liang Wenfeng, the founder of DeepSeek

In just two weeks, a previously unknown 40-year-old Chinese engineer has shaken Silicon Valley, monopolized the global conversation about AI and even triggered a worldwide stock market earthquake. Liang Wenfeng has achieved something that seemed impossible: developing an AI model that rivals OpenAI's at a fraction of their cost. Put another way, what he has actually done is throw down a challenge to American dominance in AI.

Liang is not your usual tech entrepreneur. Born on the southern coast of China, in Zhanjiang, he began his career studying electronic engineering at Zhejiang University, with excellent grades. In 2015 he co-founded High-Flyer, a quantitative investment fund that came to manage more than $13 billion using machine learning algorithms to trade on the stock market.

What distinguishes Liang is his unusual way of steering his career. While most Chinese AI companies focused on marketing products, he opted for pure research. "In the last thirty years, [the Chinese technology industry] has only emphasized making money and ignored innovation," he told China Waves, as reported by Reuters. "Innovation is not driven only by business; it also needs curiosity and the desire to create."

This vision materialized in 2021, when he began to accumulate thousands of NVIDIA chips for an unnamed project, just before the United States restricted their sale to China. Two years later he founded DeepSeek with just over one million dollars of initial capital. Today, according to local media, DeepSeek has only 140 employees. That is about 10% of OpenAI's headcount, for example.

DeepSeek's success has raised Liang's profile in his own country. On January 20 he attended a closed-door meeting with Li Qiang, the prime minister; Liang was the youngest person in the room. His meteoric rise, going from limited fame within his field to being the epicenter of the global technology conversation in a matter of days, has also put him in the crosshairs of those who question whether DeepSeek could have developed V3 and R1 with only its declared infrastructure. That is what Alexandr Wang, CEO of Scale AI, suggested in statements to CNBC, when he assumed that DeepSeek's access to chips had been far greater than admitted, something it could not acknowledge because of trade restrictions. Dario Amodei, CEO of Anthropic, was more understanding and even magnanimous, but not condescending.

For Liang, the goal goes beyond competing with Silicon Valley. As he explained to 36Kr, he wants China to "gradually transition" from being a beneficiary to being a contributor in the AI industry. "What we see is that China cannot always be a follower," he said. "We often say that there is a gap of one or two years between Chinese and American AI, but the real gap is between originality and imitation."

With a discreet profile and a low-key image that has nothing to do with Altman, Zuckerberg and company, some of Liang's colleagues have described him as a pragmatic leader motivated more by curiosity than by wealth or fame. It fits with what we have seen so far: his commitment to research over commercial applications reveals something of his personality, putting long-term curiosity about knowledge ahead of immediate profit. And perhaps that is what has changed China's role in the global AI race.

Featured image | X, Xataka, DeepSeek

In Xataka | I have tried DeepSeek on the web and on my Mac. ChatGPT, Claude and Gemini have a problem

What is High-Flyer, the Chinese fund behind DeepSeek that has been using AI for years to make investment decisions

DeepSeek is the artificial intelligence (AI) company of the moment. Its most recent language models have challenged OpenAI's leadership and caused a real earthquake in the technology industry. These days we have learned that it was founded in May 2023 and that it has developed its products with a fraction of the computing capacity of some of its main Western rivals. But what else is known? Let's see.

DeepSeek's promising present is the result of years of research that began long before its official incorporation. Its origin lies in High-Flyer, a quantitative investment fund created in 2015 by electronic engineering graduate Liang Wenfeng and two classmates. As they recount on their website, the idea was for algorithms to become the heart of the business, enabling real-time trading.

A company focused on the Chinese stock market

High-Flyer completed its first AI-assisted stock trade in October 2016, a move that triggered an unstoppable effort to keep working in that direction. The company formed software and hardware research and development teams, and it apparently proved to be the right decision. By 2017 it was already applying AI in almost all of its quantitative investment strategies, but to keep advancing it needed to break through some barriers.

They discovered that training complex models demanded enormous computing power. This did not discourage them, and in 2019 they launched a dedicated division called High-Flyer AI to address the challenge. The group started with 500 GPUs, then built a supercomputer with 1,100 NVIDIA A100 GPUs, and in 2022 spent $140 million to raise the number to 10,000 GPUs, before the United States' export controls came into force.

High-Flyer was completely focused on developing its algorithmic trading business. It had its own deep learning training platform and an outstanding computing infrastructure. Meanwhile, in the United States there was a company called OpenAI betting on generative AI that had surprised many with the capabilities of its GPT-3 language model. As China Talk reports, Liang wanted to go beyond finance. He had long been convinced that AI would change the world, and he had found the opportunity to take his effort to the next level. In 2023, High-Flyer announced that it would lay the foundations of a new organization to advance the development of artificial general intelligence (AGI). Thus DeepSeek was born, with an injection of capital from High-Flyer.

DeepSeek is a product of High-Flyer's work and has obviously drawn heavily on that company. Both firms share offices in the same building, although they seem to use different computing resources. The AI startup says it has H20 chips, which are selling like hotcakes in China, as well as NVIDIA H800s, and that it used only 2,048 GPUs of this latter model to train its most recent models, a claim that some have questioned.

Images | High-Flyer | DeepSeek

In Xataka | "They are brilliant researchers under the control of an authoritarian government." Anthropic's CEO has spoken about DeepSeek

Spain was going to invest a fortune in data centers. And then DeepSeek arrived

Data centers looked like the new gold rush in Spain. Recent data showed how investments from various Big Tech companies promised to supercharge this market. Artificial intelligence was driving all those efforts, but these days some companies are rethinking what to do. The reason is, of course, DeepSeek.

DeepSeek. The arrival of the DeepSeek V3 model in November and of DeepSeek-R1 just a few days ago has put all these investments in question for a single reason: it may not be necessary to spend so much money. This startup's Chinese models seem to show that the same (or more) can be achieved with much less.

"Unrealistic". As revealed by Expansión, Spain could attract more than €43.7 billion in investment through 2030. However, sources in the sector have told that business newspaper that some multi-billion-euro projects to build large data centers in Spain were "unrealistic", with talk of figures that were not even backed by the funds.

Market adjustment. The search for efficiency may bring a certain adjustment to the market. Both investment funds and venture capital firms may now show more prudence when investing. SpainDC projected the arrival of €58 billion in the data center sector through 2030.

But there will be (a lot of) investment. Although it seems clear that budgets for various projects will be reviewed, the need to build data centers will persist. Demand, driven by the rise of AI and cloud services, will remain remarkable.

Big plans. In Spain, ACS, for example, has plans to invest €60 billion globally (€12 billion in Spain). Merlin, another of the leading companies in this sector, announced the development of two mega-campuses in Extremadura with 1 GW of capacity each. Repsol, also cited by Expansión, plans to invest €4 billion in Spain in this area.

Long-term optimism, short-term caution. The movements of recent days, after the impact caused by DeepSeek, seem to have many companies recalibrating their short-, medium- and long-term plans. However, it is impossible to know what DeepSeek's medium-term impact will be. Some experts quoted in The Independent were not entirely convinced of DeepSeek's real efficiency, but they admit that it will certainly force a rethinking of data center plans.

Image | Amazon

In Xataka | We have calculated how much money Big Tech is spending on data centers. The numbers are dizzying

How to use DeepSeek to search the Internet and see the sources used in its answers

Let's explain how to search for things online using DeepSeek, the popular Chinese artificial intelligence. In this case we will use its official website, since if you opt to install DeepSeek on your computer with Ollama, you will not be able to use this function. What we are going to show you is how to ask this AI for things and have it search for results on the Internet. DeepSeek will generate an answer from them, but you will also be able to review the sources it has used and open the articles.

Search the Internet with DeepSeek

The first thing you have to do is go to the DeepSeek chat website, whose address is chat.deepseek.com. This will take you to the screen where you can start a new conversation with the AI. Once on this screen, look for the Search button that appears under the text field. You can combine this button with DeepSeek R1, depending on whether or not you want the AI to reason about your questions.

Once you have enabled the Search option, ask for whatever you want to find. When you ask DeepSeek, it will take a few seconds to search for sources to extract information from, and then it will compose an answer. When it does, you will see a series of numbers in each paragraph, and above the answer a Found XX Results message indicating how many pages it has used as sources.

The numbers at the end of each paragraph indicate which sources have been used for that fragment of text. If you hover the mouse over one of those numbers, a window will pop up showing the source, and you can click on it to open the article. If you click on the Found XX Results message, a side column will open showing a list of all the online articles used to compose the answer, and you can click on each of them to review or expand the information. If you are a developer, see the API note at the end of this article.

In Xataka Basics | DeepSeek history: how to view or erase everything you have asked the artificial intelligence
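A note for developers: DeepSeek also offers an API that mimics OpenAI's, although the Search mode described in this article is, for now, a web chat feature and is not exposed there. Below is a minimal sketch of a plain chat request, assuming you have created a key on DeepSeek's API platform and exported it as DEEPSEEK_API_KEY.

```python
# Minimal sketch of a chat request against DeepSeek's OpenAI-compatible
# API. Assumes the `openai` package is installed and DEEPSEEK_API_KEY is
# set; the web UI's Search (web browsing) mode is not available here.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["DEEPSEEK_API_KEY"],
    base_url="https://api.deepseek.com",  # DeepSeek's documented endpoint
)

response = client.chat.completions.create(
    model="deepseek-chat",  # "deepseek-reasoner" selects the R1 model instead
    messages=[{"role": "user", "content": "Explain what a KV cache is in two sentences."}],
)
print(response.choices[0].message.content)
```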

There will be a before and after DeepSeek. We now know why it is so efficient

The publication of DeepSeek's V3 artificial intelligence (AI) model as open source is a blessing, because little by little we are learning in detail the strategy this Chinese company's engineers followed to produce such an efficient model. Before going any further, it is important to keep in mind that DeepSeek says it trained its model using only 2,048 NVIDIA H800 chips. Some analysts maintain that its infrastructure actually brings together 50,000 H100 GPUs bought through intermediaries, but for the moment that is just conjecture. The H100 is more powerful than the H800, but it is perfectly credible that DeepSeek was forced to settle for the latter, because US government sanctions have prevented Chinese companies from accessing H100 GPUs. In fact, since November 2023 NVIDIA has not been allowed to deliver even its H800 chip to its Chinese clients.

One of DeepSeek's keys is called PTX

In the recipe for the thrilling growth NVIDIA has experienced over the last five years, its GPUs are not the only ingredient; CUDA (Compute Unified Device Architecture) also plays an essential role in its business. Most AI projects currently in development are implemented on CUDA. This technology bundles the compiler and development tools programmers use to write software for NVIDIA GPUs, and replacing it with another option in projects already underway is a problem. Huawei, which aspires to a significant share of this market in China, has CANN (Compute Architecture for Neural Networks), its alternative to CUDA, but for the moment CUDA dominates the market. Moreover, this NVIDIA toolchain puts a high-level language in programmers' hands that lets them access the GPU hardware in an approachable way.

Even so, and here we reach the heart of this article, DeepSeek's engineers did not settle for CUDA to develop their AI: they used PTX (Parallel Thread Execution).

DeepSeek's engineers decided to use PTX to squeeze as much as possible out of the H800 GPUs

This language is similar to assembly. In fact, it is essentially the assembly-like layer NVIDIA offers to developers who use its GPUs and need to implement low-level optimizations in their code. Programming with PTX is more difficult and laborious than doing it with CUDA, but it carries the advantage of letting developers write more efficient code, and therefore code capable of making better use of the resources the GPU hardware offers. Presumably DeepSeek's engineers decided to use PTX to get the most out of the H800 GPUs they had at their disposal.

One of the stratagems they devised consisted of assigning only 20 SMs (streaming multiprocessors) of each GPU to communication between servers, which allowed them to dedicate the remaining 112 of each chip to computation. In essence, DeepSeek has been built from scratch by resorting to this kind of optimization, which largely explains why this AI model is so efficient. The programmers of this Chinese company have pulled off an objective engineering achievement that will in all likelihood have a deep impact on the way AI model developers approach their projects in the future. It is palpable evidence that China is successfully adapting to the GPU shortage that US sanctions have triggered for its companies.
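A back-of-the-envelope look at what that split means. The 20/112 figures are the ones reported above; the totals below are simply derived from them and do not come from DeepSeek's code:

```python
# Illustrative arithmetic only: the 20 (communication) / 112 (compute)
# SM split is the one reported for DeepSeek's H800s; everything else
# here is derived from those two numbers.
COMM_SMS = 20       # streaming multiprocessors reserved for cross-server traffic
COMPUTE_SMS = 112   # streaming multiprocessors left for matrix math

total_sms = COMM_SMS + COMPUTE_SMS
print(f"SMs per GPU: {total_sms}")                           # 132
print(f"share kept for compute: {COMPUTE_SMS/total_sms:.1%}")  # ~84.8%
```

In other words, the trick caps communication overhead at roughly 15% of each chip while keeping about 85% of it doing useful math.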
Image | NVIDIA

More information | Mirae Asset Securities Korea

In Xataka | We can forget about an AI without hallucinations for now. NVIDIA's CEO explains why

DeepSeek does the same as OpenAI's most advanced models with far fewer resources. The key: reinforcement learning

The whole world is wondering how it is possible that DeepSeek's AI models have become, overnight, the great protagonists of the moment in the field of artificial intelligence. The answer is relatively simple: these models have managed to demonstrate that you can do more with much less. Both DeepSeek V3 and DeepSeek-R1 are comparable to OpenAI's GPT-4 and o1 respectively, but their training is estimated to have been far less expensive, and their inference certainly is: the prices of the DeepSeek API are up to 35 times lower than OpenAI's. That makes one wonder how this is possible. The answer is within reach, because the technical reports of these AI models are publicly available, and studying them has allowed us to clarify which techniques this Chinese R&D lab used to develop models this efficient and capable.

Many techniques, a single objective: efficiency

There are several differences that make DeepSeek's new model especially efficient. Its creators explain them at length in the detailed technical report that is publicly available. Here are the most relevant:

DeepSeekMoE ("mixture of experts"): in models such as GPT-3.5, the entire model was activated both in training and in inference (when we use it). However, not all of a model's components are necessary for every request. The MoE technique, already introduced with DeepSeek V2, divides the model into multiple "experts" and activates only those needed for each request (a toy sketch of this routing appears at the end of this article). GPT-4 itself is said to be a MoE model. But as we said, DeepSeekMoE went even further and differentiated between even more specialized experts, in addition to using some more generalist experts that can contribute value on certain requests. Managing all those specialized or generalist experts benefits not only inference but also the training phase, making it more efficient. The spirit is reminiscent of so-called "test-time scaling", which also adjusts how much of a model's capacity is exercised for a given request.

DeepSeekMLA (Multi-head Latent Attention): this is another substantial improvement, even bigger than the previous one and also introduced with DeepSeek V2, and it affects the way memory is managed in these models. Normally it is necessary to load both the model and the entire context window, the one that lets us write prompts and include long texts, for example. Context windows are especially expensive because each token requires both a key and its corresponding value. What this technique made possible was compressing that store of keys and values (the KV cache), dramatically reducing memory use during inference.

Auxiliary-loss-free load balancing: if we imagine a model as a great orchestra, each musician is an "expert" within the model. To play a complex piece, not all musicians are needed all the time. Traditionally, so-called "auxiliary losses" were used to make sure all musicians played enough, but these losses could interfere with the interpretation of the musical piece (the model's training), which could degrade overall performance. With DeepSeek V3, the model is able to balance the work of each expert dynamically. That makes training simpler, more direct and more efficient by eliminating the auxiliary losses. Moreover, removing that interference lets the model learn better, with fewer resources, and achieve better results.
Multi-token prediction training objective: often, predicting the next word depends on several previous words or on context. With this technique, instead of predicting only the next word, the model learns to predict several words at a time. That produces more natural, understandable and less ambiguous text, but it also accelerates training by reducing the number of steps needed to generate the complete text sequence.

FP8 mixed-precision training: using FP8 numbers significantly reduces memory consumption and accelerates calculations. Some critical parts of the model continue to use FP32 during training to guarantee precision, but there is another additional benefit of FP8: the size of the models shrinks. Other models use techniques such as quantization or parameter pruning. Although OpenAI gives no data on GPT-4 in this respect, the assumption is that it works with BF16, which is more expensive in terms of memory. Although FP8 in theory leads to less precise models, complementary techniques such as fine-grained quantization are used to reduce the negative impact of outlier values, which makes stable training possible.

Cross-node all-to-all communication: during training it is necessary to constantly exchange information between all the nodes (computers) connected in the training data centers. That can become a bottleneck, but these new DeepSeek V3 techniques include efficient communication protocols, reduced data traffic and efficient synchronization to accelerate training and, once again, cut the costs of that process.

Reinforcement learning and "distillation" as keys

But beyond all these techniques, those responsible for DeepSeek V3 explain how they pre-trained it on 14.8 trillion tokens, a process followed by a supervised adjustment (Supervised Fine-Tuning, SFT) and several stages of reinforcement learning (RL). The SFT phase, which is mentioned in the DeepSeek V3 report, was completely omitted in the case of DeepSeek-R1-Zero, the pure-RL variant of R1. Reinforcement learning, in any case, is the absolute protagonist in the development of both models, especially R1.

The technique is well known in the field of artificial intelligence, and it is as if we trained a dog with treats and scoldings: the model learns to respond better by receiving rewards when it does well. Over time, the model learns to take the actions that maximize the long-term reward. At DeepSeek, reinforcement learning is used, among other things, to break complex problems down into smaller steps. The DeepSeek R1 technical report also indicates how this model applies RL techniques directly on the base model, without the need for supervised training, which saves computing resources.

The chain of thought (chain-of-thought), also mentioned in the technical report, comes into play here too. This refers to the ability of a language model to show the intermediate steps of its reasoning. The model not only … Read more
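To make the "mixture of experts" routing referenced in the list above tangible, here is a deliberately toy sketch. The expert count, dimensions and gating scheme are illustrative assumptions of ours, not DeepSeek's actual architecture:

```python
# Toy top-k "mixture of experts" routing: only TOP_K of N_EXPERTS run
# per token, which is where MoE models save compute. All sizes are
# illustrative; DeepSeek V3's real router and experts are far larger.
import numpy as np

rng = np.random.default_rng(0)

N_EXPERTS, TOP_K, D = 8, 2, 16
experts = [rng.standard_normal((D, D)) for _ in range(N_EXPERTS)]  # toy expert FFNs
router = rng.standard_normal((D, N_EXPERTS))                       # gating weights

def moe_forward(x: np.ndarray) -> np.ndarray:
    """Route one token vector through only the TOP_K best-scoring experts."""
    logits = x @ router                 # score each expert for this token
    top = np.argsort(logits)[-TOP_K:]   # indices of the k highest scores
    gates = np.exp(logits[top])
    gates /= gates.sum()                # softmax over the chosen experts only
    # The other N_EXPERTS - TOP_K experts never run for this token.
    return sum(g * (x @ experts[i]) for g, i in zip(gates, top))

token = rng.standard_normal(D)
print(moe_forward(token).shape)  # (16,): same output size, a fraction of the compute
```

Only the selected experts' weights participate in the forward pass; scaled up, that is why a model with hundreds of billions of total parameters can activate only a fraction of them per request.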

DeepSeek has had to rely on pure ingenuity, breaking the "more = better" paradigm

Satya Nadella, Microsoft's CEO, is very clear about it: "DeepSeek's new model is really impressive, both for how effectively they have developed an open-source artificial intelligence (AI) model that performs inference-time computation and for its incredible computational efficiency. We must take the developments coming out of China very, very seriously (…) As AI becomes more efficient and accessible we will see its use skyrocket, becoming a commodity we cannot do without."

In this statement to Fortune, Nadella gives credit to the technological triumph the Chinese company DeepSeek has achieved. And it speaks well of him that he acknowledges it unambiguously, especially if we bear in mind that Microsoft is one of the competitors in the AI industry that, just a few hours earlier, watched its stock market value fall abruptly after the emergence of DeepSeek R1.

In any case, we can be sure that this AI model is, to a large extent, the result of the pressure US sanctions are exerting on Chinese companies. Jensen Huang, NVIDIA's founder and CEO, anticipated it in one of the statements he made at the end of May 2023 at Computex: "China is dedicating massive resources to launching emerging companies specialized in GPU development. Do not underestimate them." This warning was aimed at the US government, in a clear attempt to alert it to the consequences of the sanctions that seek to stop China's technological development. Huang was talking about Chinese GPU designers, but his statement can be extrapolated to the Chinese companies developing AI models. After all, in this field GPUs and large language models go hand in hand.

The US will continue to lead in AI

A good part of the sanctions approved by the Biden administration from October 7, 2022 onwards seeks to slow down the development of the Chinese semiconductor industry, and with it China's AI technology. As we have just seen, integrated circuits and AI go hand in hand. These prohibitions prevent NVIDIA, AMD and Intel, among other manufacturers of chips for AI applications, from selling their most advanced GPUs to their Chinese clients. This is presumably the germ of DeepSeek's greatest achievement.

According to DeepSeek, the infrastructure used to train its AI model comprises 2,048 NVIDIA H800 chips

If we stick to the information this Chinese company has made public, the infrastructure used to train DeepSeek R1 brings together 2,048 NVIDIA H800 chips, and training the model, which has 671 billion parameters, cost $5.6 million. This is precisely what Satya Nadella was talking about in the statements we reviewed a few lines above. These figures are extremely modest. Some analysts maintain that its infrastructure actually brings together 50,000 H100 GPUs bought through intermediaries, but for the moment that is just conjecture.

If we take at face value the statements DeepSeek's spokespeople made to the Financial Times, and for the moment it is reasonable to do so, the reason its engineers built their training infrastructure on NVIDIA H800 GPUs is that US sanctions prevented them from accessing the more powerful H100 chips. The prohibitions of November 16, 2023 prevent NVIDIA from delivering H800 GPUs to its Chinese clients, but presumably by then DeepSeek already had its infrastructure assembled. In any case, what is meritorious in this situation is that with a relatively modest chip this Chinese company has pulled off a remarkable achievement.
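As a sanity check, the headline figure is easy to reproduce. The inputs below are DeepSeek's own published assumptions in the V3 technical report (roughly 2.788 million H800 GPU-hours at an assumed rental price of $2 per GPU-hour), not independent measurements:

```python
# Back-of-the-envelope reproduction of the ~$5.6M training cost.
# Both inputs are DeepSeek's own stated assumptions, not ours:
# ~2.788M H800 GPU-hours, priced at an assumed $2 per GPU-hour.
gpu_hours = 2_788_000
usd_per_gpu_hour = 2.0

cost = gpu_hours * usd_per_gpu_hour
print(f"estimated training cost: ${cost / 1e6:.2f}M")  # ≈ $5.58M
```

Note that this figure covers the rental cost of the final training run only, not research, staff or prior experiments.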
DeepSeek's undisputed success is a victory for China, but a partial one. For now, the US is winning this technological war. Its advantage rests on an indisputable reality: the country led by Donald Trump controls both most of the GPU manufacturers and many of the companies dedicated to developing AI models. And the latter have unrestricted access to the most advanced GPUs produced by NVIDIA and other companies. China has Huawei's GPUs, which seem to be very competitive in inference, and also those of companies such as Moore Threads, MetaX, Biren Technology, Innosilicon, Zhaoxin, Iluvatar CoreX, DenglinAI and Vast AI Tech, among others. But, for the moment, it is at a clear disadvantage. Even so, this confrontation will go on for a long time, so any conclusion about which country will ultimately dominate AI, if either does, would be premature.

Image | NVIDIA

More information | Fortune | Financial Times

In Xataka | China is closely monitoring the United States' moves with Stargate. And its answer is already prepared

DeepSeek is the model of the moment. The problem is that nobody knows very well what it is doing with our data

If you are a ChatGPT user, it is likely that in the last week you have tried DeepSeek. The Chinese AI chatbot is so promising that it has become a serious threat to many of the American technology giants. Not only is it an open-source proposal that can work very well at home (on certain hardware), but it also has a free online version that, at least for now, allows unlimited queries. There is also a paid API whose rates are very competitive.

When we use chatbots like DeepSeek we usually share a lot of information. These tools have become allies for planning vacations, summarizing documents, drawing up budgets or analyzing images, among other things. Their language models swallow every word we type in order to give us the answers we are looking for, or an approach we can keep refining with further prompts.

What do they do with our data?

Now, after the initial infatuation of discovering a new AI application, let's ask ourselves what happens to our data. Does it disappear after being processed by the model? Are we handing it over in perpetuity to a company we barely know? Is it stored as a treasure for training future iterations of the model? There are certainly many questions, but they are not entirely new. We already asked many of them when ChatGPT became popular. Or perhaps we did not ask them ourselves, but several European regulators did, which forced the company led by Sam Altman to make some changes in order to keep operating in certain countries of the bloc. DeepSeek is the new star, and sooner or later these questions were bound to appear on the scene.

Amid so many questions there are some certainties: DeepSeek collects a huge amount of data. This will probably not surprise some people, though it may surprise those who have just started using the chatbot. To get more clarity about the Chinese company's privacy practices, we can consult its privacy policy page.

Let's start at the beginning: Hangzhou DeepSeek Artificial Intelligence Co., Ltd. and Beijing DeepSeek Artificial Intelligence Co., Ltd. collect profile information about users, such as username, date of birth (if applicable), email address and/or telephone number, and password. Our chats are also collected, that is, texts, audio, uploaded files, feedback and chat history. Everything goes to these companies. Suppose you have a question about DeepSeek and use the contact channels to talk to them: the aforementioned organizations will also collect all the information you send, from identity or age verification documents to comments or queries about the service. Everything mentioned so far falls within a category called "Information you provide".

Within the range of information collected by the companies behind DeepSeek we find another category called "Automatically collected information". Here they gather our device model, operating system (and its language), IP address, cookies, and diagnostic and performance data. They also capture keystroke patterns, and everything is associated with a device ID and a user ID.

It is not possible to put a price on our data, but there is no doubt that it is valuable. One way to gauge its value is to look at everything it is used for. First, DeepSeek uses the collected data to train its AI models. The companies also talk about "monitoring interactions", and on this point we are not sure whether there are humans analyzing conversations.
In the companies' documents we find other interesting details, such as that they "review user inputs and outputs and other information to protect the safety and well-being" of the community. They also collect data to comply with legal obligations, to "carry out tasks in the public interest" and to notify users of changes to the services. Later we will see where the data of DeepSeek's millions of users is stored.

The data DeepSeek collects does not stay at DeepSeek

DeepSeek says in its privacy policy that it can share the information collected in all the categories listed above. Let's take it in parts. First, we can mention corporate group entities, that is, actors under the umbrella of the organizations that control DeepSeek. But there is more: the data can also be sent to "advertising or analytics partners". You may remember that a few paragraphs above we pointed out that the information collected is tagged with a device ID and a user ID. Well, these identifiers are usually very useful for tracking user activity and cross-referencing it with activity on other platforms. On this point DeepSeek mentions that it may use activity from other sites and services, but only in some jurisdictions. It is not clear how this will be applied in the European Union.

DeepSeek also explains that it can share the information collected with "law enforcement agencies, public authorities, copyright holders or other third parties if we believe in good faith that it is necessary". In other words, it can hand this data over to the government. While this happens in almost any jurisdiction, we must pay special attention to China, which has been involved in several controversies in this regard. There are numerous investigations pointing to the Chinese Communist Party (CCP) and to laws of the People's Republic of China that force the country's technology companies to hand over relevant data. A document from the US Department of Homeland Security notes that the government urges companies to install backdoors to assist in operations to preserve national security.

One of TikTok's points of conflict was precisely the one we have just mentioned: Americans' data was apparently exposed to foreign actors. To address this concern, ByteDance, the social network's parent company, reached an agreement with Oracle to store US users' data on US territory, also subjecting it to that country's legislation. The data collected by DeepSeek, by contrast, is stored on servers in China. Besides, we have talked a lot about how strict the regulation of tech companies is in the European Union. The General Data … Read more

What is DeepSeek, the new Chinese artificial intelligence that challenges ChatGPT

Since it burst onto the scene in November 2022, ChatGPT has marked a before and after in the field of generative artificial intelligence. Its ability to create content, hold natural conversations and solve complex tasks inspired giants such as Google, Meta and X (formerly Twitter) to develop their own versions. However, a new competitor from China, called DeepSeek, has begun to capture global attention. This development not only rivals the US proposals, but could redefine the rules of the game in the AI sector.

What is DeepSeek and why is it relevant?

DeepSeek is a generative artificial intelligence model with capabilities similar to those of counterparts such as ChatGPT and Gemini. Designed to create content, reason in a structured way and offer advanced solutions, this AI was recently released in its R1 version. According to a report in The Wall Street Journal, DeepSeek has already positioned itself as a competitive model thanks to its technical performance and efficiency. In performance tests (benchmarks), the Chinese model managed to match, and even surpass in certain respects, OpenAI's o1 model, released in September. This strong performance, combined with a considerably lower cost, makes DeepSeek an attractive option for companies and developers seeking advanced solutions without a large outlay.

DeepSeek has several advantages that make it extremely attractive to developers. (Photo: Shutterstock)

5 DeepSeek keys that make a difference

1) Advanced and efficient performance. DeepSeek is not only positioned as a competitive alternative; it demonstrates its ability to match the sector's reference models. Its design has been optimized to perform complex tasks with high precision, placing it among the most promising options on the market.

2) Open source: transparency and adaptability. Unlike the models developed by OpenAI and Google, DeepSeek is open source. This means its infrastructure is available for experts, developers and users to analyze, customize and improve the model. This characteristic encourages collaboration and innovation, offering a flexibility that closed models do not allow.

3) Competitive costs. One of DeepSeek's most outstanding points is its low cost compared to its US rivals. This economical approach seeks to democratize access to advanced AI, allowing more companies, especially emerging ones, to take advantage of its potential.

4) DeepThink: a star feature. DeepSeek includes a tool called DeepThink, which uses reasoning chains to emulate human cognitive processes. This allows it to tackle complex problems in a more structured way and with more precise results. This feature has been one of the most praised in its initial tests.

5) National and international collaboration. The model has been developed by a team that combines talent from China's technical universities and a fund led by Liang Wenfeng, co-founder of High-Flyer. This collaboration, together with the commitment to a low-cost approach, strengthens the model's competitiveness on the global stage.

Technological rivalry between China and the US

The appearance of DeepSeek revives the technological rivalry between the US and China, especially in the AI sector. Historically, the US has led this market, with companies such as OpenAI, Google, Microsoft and Anthropic at the forefront. However, China has intensified its efforts in recent years, investing significantly in AI research and development.
The Stargate project, an initiative announced during Donald Trump's second term, has earmarked more than $500 billion for the development of advanced artificial intelligence technologies in the US. In this context, DeepSeek emerges as a Chinese response to that supremacy, demonstrating that the Asian giant has the capabilities to compete at the highest level.

The rivalry between ChatGPT and DeepSeek reflects the broader rivalry currently playing out between China and the US. (Photo: Shutterstock)

The arrival of DeepSeek not only represents a new challenge for the big American companies, but also opens up new opportunities for developers and users interested in exploring more accessible and customizable tools. In terms of business adoption, its open-source nature and lower costs could allow DeepSeek to consolidate itself in markets where competition has been dominated by closed models. However, its long-term success will depend on factors such as the quality of its technical support, constant updates and its ability to maintain high standards of safety and privacy, especially in a global context marked by geopolitical tensions.

DeepSeek not only represents a technological advance, but also a declaration of intent by China in the field of artificial intelligence. With a strategy based on collaboration, transparency and accessibility, this model could transform the current landscape of generative AI. However, the real test for DeepSeek will be its performance in practical applications and its ability to compete in markets dominated by giants such as OpenAI and Google. In addition, the international perception of the model will also play a crucial role, especially in countries that view technological developments from China with caution. In short, DeepSeek not only expands the range of options in generative artificial intelligence, but also revives the debate on the role of technology in international relations and global innovation. Are we facing the beginning of a new era in the technological rivalry between the US and China? Only time will tell.

We knew that US Big Tech had a problem with the costs of their AI. DeepSeek has just shown to what extent

DeepSeek is the new darling of AI. This family of models, developed by a Chinese R&D laboratory of the same name, has achieved what seemed impossible: competing with the OpenAI or Meta models and doing so, according to them, at a much lower cost. Is that true?

A development 18 times cheaper than GPT-4. The Chinese startup released DeepSeek V3 671B at the end of December 2024. Its gigantic model was trained in just two months with a budget of $5.58 million, according to SCMP and analysts cited in the Financial Times. Its performance is comparable to OpenAI's GPT-4, but the latter cost about $100 million to develop, according to Sam Altman. That's almost 18 times more if we take into account both the data revealed by SCMP and Altman's estimates.

Comparative cost of the main chat and reasoning models today. DeepSeek's price is far lower than its competitors'. Data: DeepSeek, OpenAI, Anthropic, Meta.

Amazingly cheap. The cost of DeepSeek's API is incredibly low compared to its competitors'. If we take the data from DeepSeek, Meta, OpenAI, Google and Anthropic, it seems clear that the cost of using DeepSeek through its API is much lower than what its rivals propose. We have included the cost of GPT-4o mini, which seems to be the only comparable one, but its performance is far below DeepSeek V3. DeepSeek V3 is superior to most of its competitors, although it is true that Meta has released Llama 3.3 in recent days, for example, and that comparison shifts frequently.

And it is (theoretically) superior to all of them. As they point out on Reddit, DeepSeek V3's prices are promotional: starting February 8 they will be $0.27 per million input tokens (almost double) and $1.10 per million output tokens (almost four times more); a worked example of what these prices mean appears at the end of this article. This makes the comparison somewhat kinder to the competitors, especially Llama, the only one that can compete on cost, although the Chinese model is superior to Meta's (and almost to all the rest on many metrics) according to the benchmarks run by DeepSeek.

DeepSeek also "thinks" cheaper. The cost comparison favors DeepSeek not only in the area of traditional chatbots, but also in reasoning models. According to its internal benchmarks, the spectacular DeepSeek R1 is significantly superior to OpenAI's o1, yet using the o1 API costs 27 times more than DeepSeek R1's. Mind-blowing.

Price drop in sight. As the expert Ethan Mollick points out, the market will adjust to these DeepSeek-driven price drops fairly quickly. According to his estimates, the cost of GPT-4-level AI fell 1,000-fold in 18 months, and there has been a 95% drop in the price of reasoning models, which right now are clearly superior to the AI models behind ChatGPT, for example.

A Chinese tsunami. The launch of the DeepSeek models is a great little revolution for all kinds of developers of AI-based solutions: they now have access to much cheaper models that are comparatively equal or superior to the competition's. This puts their rivals in real trouble, and we will see how they react.

Good news for users. The truth is that for us users, as well as for developers, this is great news, especially because these prices make access to these functions dramatically cheaper. The market had already been following this trend, but DeepSeek has made the jump in cost reduction suddenly drastic.

Image | Xataka with Freepik Pikasso

In Xataka | OpenAI prepares a PhD-level AI. It is so promising that it will first show it to the US government
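The worked example promised above: since the post-promotional prices are quoted per million tokens, it is easy to translate them into a concrete bill. The $0.27/$1.10 figures come from the article; the monthly token volumes are invented for illustration only:

```python
# Translate DeepSeek V3's quoted post-promotion API prices into a bill.
# Prices per million tokens come from the article; the workload below
# is a hypothetical example, not real usage data.
INPUT_USD_PER_M = 0.27    # $ per 1M input tokens (from February 8)
OUTPUT_USD_PER_M = 1.10   # $ per 1M output tokens (from February 8)

input_tokens = 5_000_000   # hypothetical monthly prompt volume
output_tokens = 1_000_000  # hypothetical monthly completion volume

bill = (input_tokens / 1e6) * INPUT_USD_PER_M + (output_tokens / 1e6) * OUTPUT_USD_PER_M
print(f"monthly API bill: ${bill:.2f}")  # $2.45 for this toy workload
```

Even after the promotional period ends, a workload of this size costs a couple of dollars a month, which is the scale of difference the article is describing.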
