There are too many AI models, and that could be a real death sentence for Anthropic and Claude

We have AI models to spare. The problem is that when you look closely, deciding which one is better is not simple. Every company and startup is striving to become the reference in an absolutely unbridled market, one that, as in other technology wars, will probably end with a few winners and plenty of losers. And some are competing at a clear disadvantage.
Another colossal investment round. The Wall Street Journal reports that Anthropic is about to close a new funding round that would allow it to raise 3.5 billion dollars. That would bring the company’s valuation to 61.5 billion dollars, and the question is whether the company really stands a chance in such a competitive market.
“This is not a real company”. According to analyst Ed Zitron, Claude had two million monthly active users in January 2025. He also points out that, according to the WSJ, projected revenue for 2025 (based on current contracts) is 1.2 billion dollars, a very modest figure. “They also lost 5.6 billion dollars last year,” he notes. In his opinion, Anthropic “is not a real company, they could not survive without the beneficence of venture capital.”
Fierce competition. The truth is that Anthropic faces exceptional competition that includes the heavyweights of the tech industry in both the US and China. DeepSeek surprised them all with the launch of DeepSeek V3 and then DeepSeek R1, and that seems to have encouraged investors to bet even more money on all these companies.
OpenAI is still the reference. At least it is in number of users. According to CNBC, it already has 400 million weekly active users, an exceptional figure that clearly puts it at the top of the popularity ranking in this segment. As with Claude, OpenAI is burning money that it does not have and that it obtains from extraordinary funding rounds but, unlike Claude, we insist, ChatGPT’s popularity is evident.
And the big players have what matters now: money.
For many users AI is ChatGPT, and giants such as Google with Gemini, Microsoft with Copilot or Meta with Llama are still far from achieving that acceptance. But they have something that Anthropic (or Perplexity) does not: lots and lots of funds (Grok 3, from xAI, is another example), so they can stay in this race even if it is costing them a great deal of money. The prize is too big not to chase.
There are too many models, and some may fall by the wayside. In every technology war there have been winners and losers. The battle for AI points the same way: there are too many competitors, and that will probably mean some of these efforts do not survive. Here Anthropic is one of those at a disadvantage.
The AI winner may be a company that is still unknown. OpenAI, Google, Apple or Microsoft may be especially well positioned to win that race, but it does not have to be that way. As Axios recently pointed out, a new, still unknown company could emerge and end up doing something different that none of the giants had thought of. It is not easy, but it is certainly not impossible.
Remembering Netscape. In the second half of the 1990s the internet began to show its potential, but it was a small company called Netscape, rather than one of the giants, that managed to become the reference in the world of browsers. It would later end up being the great loser of that war, but it demonstrated that having more money and resources does not guarantee victory.
And that is why there is so much investment in startups. The possibility that the race will be won by an unknown company is precisely what leads venture capital firms to invest a lot of money in projects that may come to nothing at all. It happened recently with Thinking Machines Lab, Mira Murati’s startup, and with Safe Superintelligence, Ilya Sutskever’s. Neither has a product to show, yet both have already received spectacular investments.
And beware, there is also China.
Of course there are formidable rivals outside the US. Mistral is the reference in Europe, while in China a war of its own is being fought, one that has made the AI models from Chinese companies as good as (or sometimes better than) those from the US. The winner of this battle could also come from that country. Or from any other, of course.
Image | Saradash Pradhan
In Xataka | China has an ambitious plan to overtake the West in technology. And it has already chosen the 18 companies to achieve it

If you have never bought a 3D printer because they are very expensive, keep an eye on these five offers across a good variety of models

3D printers are not usually especially cheap, and there are many things to take into account when buying our first one. But some brands make it a little easier by offering a good range of models aimed at both casual and more expert users. Anycubic has now launched a new campaign that includes many of these printers, so we have chosen five models with a good discount.
Anycubic Photon Mono 4 for 144 euros: an affordable printer that drops in price with the DIYNEW15 coupon.
Anycubic Photon Mono M7 for 279 euros: a slightly larger printer that drops in price with the DIYNEW20 coupon.
Anycubic Kobra 3 Combo for 354 euros: a printer for printing in several colors that drops in price with the DIYNEW25 coupon.
Anycubic Kobra 2 Max for 374 euros: a printer that offers good speed and drops in price with the DIYNEW25 coupon.
Anycubic Kobra S1 Combo for 569 euros: a fast, good-sized printer with the DIYNEW30 coupon.
Anycubic Photon Mono 4
If all we are looking for is to get started in the world of 3D printers with an affordable model, the Anycubic Photon Mono 4 has dropped from 229 euros to 144 euros with the coupon DIYNEW15. It includes a seven-inch screen, offers a maximum print volume of 153.4 x 87 x 165 mm and has a system to resume printing after a power cut. In addition, as with the rest of the printers on this list, the results are quite surprising. You can see some samples on the printer’s product page.
* Prices may have changed since the last review
Anycubic Photon Mono M7
If we are just getting started with 3D printers but are looking for something a little larger, the Anycubic Photon Mono M7 costs 279 euros with the coupon DIYNEW20. This printer is faster than the previous one and also includes a screen, in this case a 10.1-inch one, with a maximum print volume of 223 x 126 x 230 mm. The brand also provides some samples in the printer description.
Anycubic Kobra 3 Combo
On the other hand, if we want a printer that can print in color without the price climbing too high, the Anycubic Kobra 3 Combo is an interesting option, especially now that it has dropped from 599 euros to 354 euros with the coupon DIYNEW25. It can print with four to eight colors, with a maximum print volume of 250 x 250 x 260 mm, and the product description includes samples from both the brand and users.
Anycubic Kobra 2 Max
If we want a good printer because we have already used one, or simply because we want to make large prints, the Anycubic Kobra 2 Max has dropped from 659 euros to 374 euros with the coupon DIYNEW25. It prints with a maximum volume of 420 x 420 x 500 mm, has automatic leveling and is much faster than the previous models. The printer description also includes samples from the brand and from users.
Anycubic Kobra S1 Combo
Finally, if we are looking for a full-featured printer that can print in color, the Anycubic Kobra S1 Combo has also dropped in price, from 749 euros to 569 euros with the coupon DIYNEW30. It offers good speed, is quiet, prints with a maximum volume of 250 x 250 x 250 mm in up to eight colors and has automatic leveling. In this case, the printer description only includes samples from the brand.
Some of the links in this article are affiliate links and may earn Xataka a commission. Offers may vary if products become unavailable.
Images | Anycubic
In Xataka | Best home 3D printers. Which one to buy, resources for 3D printing and eight recommended models
In Xataka | This is what I would have liked to know before I got started in the world of 3D printing

These will be the keys to their next generation of AI models

In a world where competition is intensifying with players such as DeepSeek, OpenAI is preparing to launch major changes. The news was shared by Sam Altman himself, who acknowledged some complications in the company’s current product lineup. The launch of GPT-4.5 and GPT-5 aims to improve the user experience going forward.
As we pointed out last month, the naming of OpenAI’s artificial intelligence (AI) models had become complex and confusing, a situation that got worse with the arrival of reasoning models. To try to bring some order, the startup introduced the model selector in ChatGPT some time ago, but this feature does not seem to have helped much.
Goodbye model selector, hello GPT-4.5 and GPT-5
The changes coming with OpenAI’s new roadmap are significant. The first of them is called GPT-4.5. This product, known as Orion among OpenAI employees, will be the company’s last model “without a chain of thought”. Pay attention to this point, because after this launch, whose date has not been announced, an important unification will arrive.
“From there, we will focus on unifying the o-series models and the GPT-series models, developing systems that integrate all our tools, know when longer reasoning is needed and are useful for a wide variety of tasks,” Altman said. This will translate into the launch of GPT-5, which will integrate much of OpenAI’s technology, including o3.
Although the startup presented the o3 family in December last year, only the o3-mini version, focused on performance, reached the public. The full version of o3, which matches human programmers in certain tests, will ultimately not be available as a standalone model. We will be able to use it in ChatGPT and in the developer API through GPT-5.
This is a developing story.
Images | OpenAI
In Xataka | AI companies have been skirting copyright for years. They have just suffered a disturbing legal defeat

The best free tools to install AI models such as DeepSeek, Llama, Mistral, Gemma and more

We bring you a list of the best free tools to install artificial intelligence models locally, and thus create your own ChatGPT with models such as DeepSeek, Llama and more. These are open source models, which means that you can install and use them for free on your computer.
Installing an AI locally has disadvantages, such as having to use less powerful versions of these models. However, it has important advantages: all the data stays on your computer and is not collected by any company, and you can use it for free.
In this article we have tried to focus only on the eight best programs for this. However, if you think we have left out one that you consider important, we encourage you to tell us in the comments so that other users can benefit from your knowledge.
Ollama
Ollama is an open source application without a graphical interface that you can install on Windows, macOS and GNU/Linux. It lets you install and use AI models from your computer’s terminal, without complications and without having to open extra apps.
This program will let you install a large number of models, from Llama to DeepSeek, Phi, Nomic, Qwen and many more. Each model comes in different versions, both full and distilled, and you can download them in different parameter sizes.
LM Studio
An open source application for downloading LLM artificial intelligence models to your computer. It offers a unified graphical interface: you can find AI models with a search engine inside the program, then download them and run them in it.
This program has versions for Windows, macOS and GNU/Linux, and lets you use the models in its UI or through a local server compatible with OpenAI. You can also use local documents with the AI, you can use the models offline, and you can download them from Hugging Face repositories.
You can use models such as Llama, Mistral, Phi, Gemma, Qwen or DeepSeek.
AnythingLLM
An all-in-one program for using artificial intelligence models on your computer, locally and offline. It is open source, and lets you chat with documents, run AI agents and manage various tasks. In addition, if your computer is not very powerful, it offers subscriptions to use it from the cloud.
It has a very flexible architecture, with three components working together, and in addition to running open source AI models locally it can connect to proprietary services such as OpenAI, Azure and others. It focuses mainly on privacy and customization, with many controls available.
GPT4All
Another open source project for installing LLMs on your computer, able to run on the CPU or the GPU. It can install up to 1,000 open source language models, such as DeepSeek R1, Llama, Mistral, Nous-Hermes and many more.
It is a paid application, although it has a free version with limited tokens, which should be enough for daily use. It has versions for Windows, Windows ARM, macOS and Ubuntu.
Jan
An open source program that lets you install open source models locally, such as Llama, Gemma or Mistral. It also lets you connect to cloud services such as OpenAI or Anthropic when you need to. All data is stored locally.
It has versions for Windows, macOS and GNU/Linux, and is compatible with NVIDIA (CUDA), AMD (Vulkan) and Intel Arc GPUs. It has an extension system that lets you customize and configure it to your liking. The interface is light and attractive.
Llama.cpp
An open source program created to run any model based on Meta’s Llama locally. It can run on both the CPU and the GPU of your computer, which helps it perform better on home machines, although it is a bit more complex to use.
NextChat
NextChat lets you use ChatGPT’s features in an open source package that is under your control.
It is a web and desktop application that connects directly to external AI services, such as Google, OpenAI or Claude, while storing the data locally in the browser.
This program also lets users create “masks”, something similar to GPTs, with which to build AI tools with specific contexts and settings. It works on Windows, macOS and GNU/Linux.
Llamafile
A program that converts AI models into executable files, so that you can use them on their own. It is a Mozilla Builders project that combines llama.cpp with Cosmopolitan Libc. It is compatible with Windows, GNU/Linux, macOS and BSD.
In Xataka Basics | Prompt pages: 16 free websites and communities to find ideas for your prompts and advice to improve them

Meta emails reveal that it downloaded 81.7 TB of copyrighted books via BitTorrent to train its AI models

In the Kadrey v. Meta lawsuit, Mark Zuckerberg’s company is accused of having used copyrighted works to train its artificial intelligence models. A few weeks ago it was revealed that Zuckerberg had approved the use of pirated books, and now new and powerful evidence of this looting has arrived.
Revealed emails. “Appendix A” of the case includes several email messages from Meta employees about doing that data collection in October 2022.
“Downloading torrents from a corporate laptop does not seem like a good idea”. In April 2023 Nikolay Bashlykov, one of the people responsible for carrying out this data collection, joked about it, emojis included, and noted that the company would have to be careful with the IP address from which it downloaded the data.
Meta knew the risks. In September of that year Bashlykov had stopped using emoticons and warned that using torrents would mean acting as “seeders” so that others could download the files too, and “that might not be legal.” These debates prove that Meta knew this type of activity was illegal, according to the authors who have sued the company.
Covering their tracks. In an internal message, Meta researcher Frank Zhang described how the company avoided using its own servers to download this data set in order to “avoid” “the risk that anyone can trace the seeder” and who downloaded that data.
81.7 TB of data. As Ars Technica points out, the evidence shows that Meta downloaded at least 81.7 terabytes of data from various libraries offering those copyrighted books. A new document in the case indicated that at least 35.7 TB had been downloaded from sites such as Z-Library or LibGen (which ended up shutting down last summer).
Meta wants those charges dismissed. Meta has filed a motion to dismiss those accusations, arguing that there is no evidence that any book was downloaded by Meta employees via torrent or later distributed by Meta.
At Xataka we have contacted the company, and we will update this story if we receive comments on the case.
More fuel for the internet looting fire. This data underscores the questionable practices that AI companies are using to train their models. We saw it with Google and, of course, with OpenAI, which used millions of texts to train ChatGPT, many of them copyrighted. Perplexity was in the spotlight after it was discovered that it blatantly ignored the internet’s rules to get around paywalls and feed its AI model.
Internet theft is being normalized. The amazing thing is that, with every company skipping the rules and violating copyright, the looting of the internet seems to be becoming normal. There is barely time to be scandalized, and we accept it almost as a fait accompli so that we can get on with our own affairs.
Is this really “fair use”? All these companies shield themselves behind the concept of “fair use”. This concept, developed in Anglo-Saxon law, allows limited use of protected material without needing to ask for permission. Copyright infringement claims have not stopped arriving in the world of generative AI, but they seem to stay in the background while these giants thrive.
In Xataka | 5,000 “tokens” of my blog are being used to train an AI. I have not given my permission

All their rivals offer free models that “reason” and Gemini 2.0 is the latest example

All the AI companies and startups in the United States were calmly minding their own business. And suddenly DeepSeek R1 arrived and became a true existential threat to Silicon Valley. The Chinese startup offered a reasoning model as good as those of its competitors, but it also offered it for free (and open source!). What has Silicon Valley done? Taken note, of course.
Gemini 2.0 reasons for free, for everyone. Just visit the official Gemini website and open the “Gemini” menu at the top left to check it. You can already use 2.0 Flash Thinking Experimental (its reasoning model) both in normal mode and in a “collaborative” mode with services such as YouTube or Maps. And it is totally free.
Microsoft Copilot and Think Deeper. Microsoft Copilot’s “Think Deeper” mode is also available for free in the company’s service. As we explained, Think Deeper is actually OpenAI o1, but previously you had to pay for the Copilot Pro subscription ($20 per month) to get access to that option. The arrival of DeepSeek R1 led Microsoft to offer it for free as well (although with a more limited number of queries).
OpenAI o1. The company led by Sam Altman did not want to be left behind, and less than a week ago it presented o3-mini, a reasoning model that, in addition to being especially powerful, is available in the free version of ChatGPT. We can activate the “Reason” button so that, when we ask something, o3’s reasoning capabilities kick in.
DeepSeek R1 and Perplexity. Perplexity’s search engine is gradually offering new options. In fact, a few days ago its makers announced that on the Perplexity website we could activate the Reasoning-R1 model, based on DeepSeek R1 but hosted in the US (to avoid suspicions of possible data theft). They even offer the option of choosing the Reasoning-O3-mini model, which is the same one offered in ChatGPT.
Again it is free (although limited), and it stands out as a convenient way to try DeepSeek R1 in its most powerful version.
And the rest? This first batch of reasoning models seems to have caught the rest of the major contenders in the AI segment off guard. Anthropic, which is still a reference with Claude, has not launched a reasoning model so far. Neither has Apple, which moves at its own pace. Meta has not launched anything in this area despite offering Llama as a clear reference among open source AI models. And Elon Musk seems to be very busy, because xAI is still working on Grok and for now there is no news of a potential reasoning variant. The only notable alternative at the moment is Doubao-1.5-Pro, the reasoning model just released by ByteDance, although it is not as easily available as its competitors.
Competition benefits users. The impact of DeepSeek R1 on the AI segment has been spectacular, as we can see. When OpenAI launched o1 in September 2024, it presented it as a very advanced but also expensive option: only subscribers to its services could access it, and in a limited way. Four months later we are using models that rival o1 but are totally free and that we can use in more and more ways. This is great news for users, who at least for now are benefiting from all the rivalry between these companies.
AI that reasons keeps getting better and cheaper (or free). A chart created by Shawn Wang (@swyx) and published in his newsletter, Latent Space, shows a clear evolution of AI models. The chart plots their capability (measured in LMSYS points, a well-known ranking of AI models) against their cost per million tokens (with a 3:1 input:output ratio). Here, the higher and further to the right a model sits, the better, and Gemini 2.0 Flash Thinking seems to be especially well positioned, but this type of chart changes very quickly. Again, more good news for us users.
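To make the “cost per million tokens” axis concrete, here is a minimal sketch of how a blended price could be computed. It assumes that the 3:1 ratio means three input tokens are counted for every output token; the prices used are hypothetical and are not taken from the chart.

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: int = 3, output_weight: int = 1) -> float:
    """Weighted average price per million tokens for a given input:output mix."""
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


# Hypothetical prices in dollars per million tokens (illustrative only).
cheap_model = blended_price(input_price=0.10, output_price=0.40)
pricey_model = blended_price(input_price=15.00, output_price=60.00)

print(f"cheap model:  ${cheap_model:.3f} per million tokens")
print(f"pricey model: ${pricey_model:.2f} per million tokens")
```

Because input tokens weigh three times as much as output tokens, a model with cheap input but expensive output still ends up with a low blended figure.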
In Xataka | Mistral AI is the French startup that bet on efficiency before DeepSeek. Its future is uncertain

What are distilled artificial intelligence models and LLM distillation

We will try to explain in a simple and understandable way what distilled models are when we talk about artificial intelligence. When we told you how to install DeepSeek on your computer, we mentioned that there were distilled versions, and other AIs are also being created as distilled versions of specific models.
We also usually call this LLM distillation, to specify that we mean large language models (LLMs), which are those capable of processing text, understanding what we write and responding in text. In other words, models like ChatGPT, DeepSeek, Copilot, Gemini or Grok.
What is LLM distillation
Distillation of artificial intelligence models is a technique for reducing the size of models while replicating the results and performance you can get with them.
Although we are used to using them through applications and web pages, LLMs take up a lot of space and resources. We do not usually notice because, when you use an AI from a website or app, you connect to the servers of large companies where the model is running. But if you wanted a complete model installed on your computer, you would need a very powerful processor and a lot of space.
The solution to this problem is to create a distilled model, a model trained to take up less space. This model can replicate most of the performance, but it will be smaller and faster, and it will need fewer resources to run.
The way to do it resembles a teacher and a student. The full model is a teacher who shares experience and knowledge with a student, passing on complex concepts. Meanwhile, the student model learns to imitate what it is being taught in a simpler and more efficient way.
This produces lighter models. Their results will never be as good as the teacher’s, but the main features and performance will remain. In short, it amounts to a Lite version: smaller, but light and versatile.
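The teacher and student idea can be sketched numerically. The following is a minimal illustration, not how any particular company distills its models: it assumes a common soft-label formulation in which the student is penalized by the KL divergence between the teacher’s and the student’s temperature-softened output distributions. The logits, the temperature and the function names are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence between the teacher's and the student's softened outputs."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


# Hypothetical scores over three candidate answers.
teacher = [4.0, 1.0, 0.5]
good_student = [3.8, 1.1, 0.4]  # imitates the teacher closely
bad_student = [0.5, 4.0, 1.0]   # prefers a different answer

loss_good = distillation_loss(teacher, good_student)
loss_bad = distillation_loss(teacher, bad_student)
print(f"good student loss: {loss_good:.4f}")
print(f"bad student loss:  {loss_bad:.4f}")
```

Training would adjust the student’s weights to push this loss toward zero, which is the point at which the student reproduces the teacher’s output distribution on the training examples.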
There are different techniques for creating distilled models, such as distilling knowledge from the teacher’s final outputs so that the student model learns its decision-making process, or using the teacher to generate additional training data. There is also intermediate-layer distillation, which transfers not only final results but also intermediate layers, and the option of using several teacher models to train the student.
In general, the private companies that create artificial intelligence models are also responsible for creating the distilled versions. Normally a specific name is added to the distilled version, such as the “Flash” in Google Gemini or the “mini” in OpenAI’s models.
In other cases, especially with open source models, the distilled version can take the name of the teacher model and add, as a surname, the model that was used as the student. For example, you can take a smaller model like Qwen and use it to create a distilled version of DeepSeek called DeepSeek Qwen, or DeepSeek Distill Qwen, to indicate that it is distilled.
Pros and cons of distilled models
A complete artificial intelligence model has billions of parameters, and the amount of space and computing power needed to run it is huge. On a home computer you would need cutting-edge hardware and power, as well as a lot of space, and companies such as OpenAI or Google that offer their AI through the web or an app need many resources on their servers.
Creating distilled models therefore helps reduce size and take up less space. But it also lets them run faster and with lower computational costs.
That is why Google or OpenAI offer you free “small” versions of their main models, leaving the most complete ones for paying users. Keeping the complete model running requires money and investment.
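To give a rough sense of why parameter count matters so much, here is a back-of-the-envelope sketch. It assumes that each parameter is stored in 16 bits, or in 4 bits when the model is quantized, and it counts only the weights: the extra memory needed while the model is running is ignored. The sizes are the DeepSeek versions this article mentions, and the function name is hypothetical.

```python
def weights_size_gb(params_billions: float, bits_per_param: int = 16) -> float:
    """Approximate size of the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    total_bytes = params_billions * 1e9 * bits_per_param / 8
    return total_bytes / 1e9


# Model sizes in billions of parameters.
for size in (8, 14, 671):
    at_16_bits = weights_size_gb(size)
    at_4_bits = weights_size_gb(size, bits_per_param=4)
    print(f"{size}B parameters: ~{at_16_bits:.0f} GB at 16-bit, "
          f"~{at_4_bits:.1f} GB at 4-bit")
```

Under these assumptions, a small distilled model fits on a home computer while the full model needs server-class hardware.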
And if we are talking about an open source model, having distilled versions means that you and I can install and use them on our computers without spending thousands of euros on a new processor, graphics cards or internal storage.
These techniques can also be used to create artificial intelligence models at a lower cost than full training would involve. To do that, you take existing models and train a new one from their data and knowledge, so you do not have to carry out the process from scratch.
However, distilled models do not have the same amount of data and parameters, they often have fewer resources to draw on, and more errors and hallucinations may arise.
Here is an example. If you follow our guide to installing DeepSeek on your computer, you will see that at a certain point you can choose between several versions: 8b, 14b, or the full 671b version. This number refers to the model’s parameters, in billions, and the lower it is, the fewer resources you need, but the more distilled and smaller the model will be.
So in this example, if you install an 8b DeepSeek and a 14b one, you will see that the smaller model hallucinates more and gives you less precise answers. The better the results you want, the larger and less distilled the model will have to be.
The same goes for commercial models. If you are using Gemini 2.0 Flash, the results will be worse than with the full Gemini 2.0, and the same applies to OpenAI’s o3 and o3-mini. However, the Flash or mini version is the one offered to all free users, while the complete one is for paying users, in order to cover the cost of keeping these models running.
In Xataka Basics | Prompt pages: 16 free websites and communities to find ideas for your prompts and advice to improve them

MediaMarkt has models for less than 240 euros in its outlet

Yes, iPhones are expensive, but that is something we now see in virtually every phone brand. Fortunately, outlet stores are perfect for picking up some good, attractive and, above all, cheap models. One of the best is MediaMarkt’s, specifically the one on eBay: the phones carry the store’s warranty and are in good condition. In this article we leave you with three iPhones at a very good price that are updated to the latest version of iOS (in Applesfera you can find out how many updates iPhones get).
iPhone 11
If we are looking for the cheapest option, the best buy is the iPhone 11. It is in the MediaMarkt outlet for 239 euros and is an opened, unused (display) unit with small marks on the screen. It comes with a 6.1-inch IPS LCD screen and Qi wireless charging, as well as the latest update of the operating system.
* Prices may have changed since the last review
iPhone 12
On the other hand, if we want to make a generational leap, we can also find a good price on the iPhone 12. In this case, MediaMarkt has it on eBay for 299 euros, and it is also an opened, unused (display) unit. It has a 6.1-inch Super Retina XDR OLED screen, and here we do get wireless charging via MagSafe. This phone still has a year of updates left, so it is a good purchase option.
iPhone 13 Mini
And if we make another generational leap, but this time bet on the most compact model instead of the standard one, we find a good price on the iPhone 13 Mini. MediaMarkt has it reduced in its outlet to 399 euros, and it is an opened, unused (display) unit with a scratched screen. Its screen is smaller, a 5.4-inch Super Retina XDR OLED panel. It also has MagSafe, and this phone still has a couple of years of updates left.
Some of the links in this article are affiliate links and may earn Xataka a commission. Offers may vary if products become unavailable.
Images | Jose García, Apple
In Xataka | Best iPhone. Which one to buy and recommended models based on budget, tastes and value for money
In Xataka | The best value for money. Their reviews and videos are here

What is Ollama and how to use it to install artificial intelligence models on your computer

Let’s explain what Ollama is and how this application works, with which you can install DeepSeek on your computer, as well as other artificial intelligence models such as Llama, Phi, Mistral, Qwen, Llava or Gemma. These models are installed locally so that you do not have to go to any website to use them, and so that everything you do stays on your computer.
We will start the article by explaining exactly what Ollama is and everything you need to know about this program. Then we will tell you how to install it on your computer, and how to start using it to install and run an artificial intelligence model.
What Ollama is and how it works
Ollama is a program that you can install on any computer, whether it runs Windows, macOS or GNU/Linux. It is a client for artificial intelligence models, so it is the base on which you then install the AI you want to use.
Ollama has two peculiarities. The first is that it lets you use an AI locally. This means that instead of going to a company’s AI chat page, the model is installed on your computer and you use it directly without visiting any website.
This benefits you in three ways. First, the data from everything you do stays on your PC, so no company uses it. Second, you can use the AI without an internet connection. And third, you can bypass the censorship that an artificial intelligence model applies when you use it on a website. What you cannot do, of course, is run online searches.
The second peculiarity is that it works through your computer’s terminal, called Command Prompt on Windows. This means you do not use a separate application. Once you install Ollama, you will use your device’s console to install and run the model you want, and you will write your questions and prompts in the console, where you will also get the answers.
How to install Ollama

To start using Ollama, the first thing you have to do is install the application on your computer. Go to its website, Ollama.com, and click the Download button that appears.

You will then reach the page where you choose the operating system you want to download the program for. By default the site selects the system you are using, but you can download the installer for any other. Once you have chosen, click the Download button.

When the download finishes, launch the installer. Installing Ollama is very simple: click the Next button on the welcome screen, and then click Install on the installation screen.

How to use Ollama

Once you have installed Ollama, launch the application. It will look as if nothing happens, because you have to open your computer's terminal, which on Windows is called the Command Prompt.

Before you start, go to Ollama.com/search, where you will see all the available AI models. Click one of the models in the list to open its page. There you will find all the information about it, along with a selector for the different versions available. When you choose a version, the command needed to install and launch it in Ollama appears at the top right.

Now you just have to run that command in your terminal, for example ollama run deepseek-r1:8b to launch the 8B version of DeepSeek R1. The first time you use the command the model is downloaded and installed first, but from then on it launches directly. Remember that the Ollama application must already be running before you do this in the terminal.

After you enter the command, the AI model starts in the terminal. You can tell because the input line now shows >>>, which means that whatever you type will be sent to the artificial intelligence model.
Now, on your computer's command line, you can type the prompt you want to send to the AI you have chosen, and after a few seconds the answer will start to be generated.
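Besides the interactive >>> prompt, a model served by Ollama can also be queried from a script. The following is a minimal sketch, not an official client. It assumes that Ollama is running locally on its default port (11434), that the model you name is already installed, and that the local /api/generate endpoint is reachable. The helper names build_payload and ask are hypothetical and exist only for this illustration.

```python
import json
import urllib.request

# Assumption: Ollama's local server is listening on its default address.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(model: str, prompt: str) -> bytes:
    """Encode a single, non-streaming generation request as JSON bytes."""
    body = {"model": model, "prompt": prompt, "stream": False}
    return json.dumps(body).encode("utf-8")


def ask(model: str, prompt: str, url: str = OLLAMA_URL) -> str:
    """Send the prompt to the local model and return the generated text."""
    request = urllib.request.Request(
        url,
        data=build_payload(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as reply:
        return json.loads(reply.read().decode("utf-8"))["response"]
```

With Ollama running, a call such as ask("deepseek-r1:8b", "Summarize what a terminal is") would return the model's answer as a string. Nothing leaves your computer, because the request goes to localhost.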

DeepSeek matches OpenAI’s most advanced models with far fewer resources. The key: “Reinforcement Learning”

The whole world is wondering how DeepSeek's AI models have become, overnight, the main story in artificial intelligence. The answer is relatively simple: these models have shown that you can do more with much less.

Both DeepSeek V3 and DeepSeek-R1 are comparable to OpenAI's GPT-4 and o1 respectively, but their training is estimated to have been much cheaper, and their inference certainly is: DeepSeek's API prices are up to 35 times lower than OpenAI's. That makes one wonder how it is possible.

The answer is clear because the technical reports for these models are available to us. Studying them has made it possible to identify the techniques this Chinese R&D laboratory used to develop such efficient and capable models.

Many techniques, a single objective: efficiency

Several differences make DeepSeek's new model especially efficient. Its creators explain them in detail in the publicly available technical report. Here are the most relevant:

DeepSeekMoE (“Mixture of Experts”): In models such as GPT-3.5, the entire model was activated both in training and in inference (when we use it). However, not every component of the model is needed for every request. The MoE technique, already introduced with DeepSeek V2, divides the model into multiple “experts” and activates only those that a request needs. GPT-4 is reportedly already a MoE model, but DeepSeekMoE goes further: it distinguishes between even more specialized experts and adds some more generalist experts that can contribute value to certain requests. Managing all those specialized and generalist experts benefits not only inference but also the training phase, making it more efficient.
This technique is similar to so-called “test-time scaling”, which also adjusts the size or complexity of a model during inference.

DeepSeekMLA (Multi-Head Latent Attention): This is another substantial improvement, even more so than the previous one, and it was also introduced with DeepSeek V2. It affects how memory is managed in these models. Normally both the model and the entire context window (the one that lets us write prompts and include long texts, for example) have to be loaded into memory. Context windows are especially expensive because each token requires both a key and its corresponding value. This technique makes it possible to compress that store of keys and values, dramatically reducing memory use during inference.

Auxiliary-Loss-Free Load Balancing: If we imagine a model as a great orchestra, each musician is an “expert” within the model. Playing a complex piece does not require every musician all the time. Traditionally, so-called “auxiliary losses” were used to make sure all the musicians played enough, but these losses could interfere with the performance of the piece (the model's training) and degrade overall performance. With DeepSeek V3 the model balances the work of each expert dynamically. Removing the “auxiliary losses” makes training simpler, more direct and more efficient, and removing that interference lets the model learn better with fewer resources and get better results.

Multi-Token Prediction Training Objective: Predicting the next word often depends on several previous words or on context. With this technique, instead of predicting only the next word, the model learns to predict several words at once. That yields more natural, understandable and less ambiguous text, and it also speeds up training by reducing the number of steps needed to generate the complete text sequence.
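The expert routing and the loss-free balancing described above can be illustrated with a toy sketch. It is a simplified illustration under stated assumptions, not DeepSeek's implementation. Each token has one affinity score per expert, the top-k experts are chosen using the score plus a per-expert bias, and after each round the bias is nudged down for overloaded experts and up for underloaded ones. The function names (route, update_bias, simulate), the step size, and the softmax weighting are assumptions made for this example, and the always-active generalist experts are left out.

```python
import math


def route(scores, bias, top_k):
    """Pick top_k experts by biased score; weight them with the raw scores."""
    ranked = sorted(range(len(scores)), key=lambda e: scores[e] + bias[e], reverse=True)
    chosen = ranked[:top_k]
    # The bias only affects selection. Output weights use the unbiased scores,
    # normalised over the chosen experts (softmax is an assumption here).
    exp = [math.exp(scores[e]) for e in chosen]
    total = sum(exp)
    return {e: w / total for e, w in zip(chosen, exp)}


def update_bias(bias, load, step=0.125):
    """Lower the bias of overloaded experts and raise it for underloaded ones."""
    mean_load = sum(load) / len(load)
    return [
        b - step if n > mean_load else b + step if n < mean_load else b
        for b, n in zip(bias, load)
    ]


def simulate(batch, num_experts, top_k, rounds, step=0.125):
    """Route the same batch repeatedly, adjusting the biases after each round."""
    bias = [0.0] * num_experts
    load = [0] * num_experts
    for _ in range(rounds):
        load = [0] * num_experts
        for scores in batch:
            for expert in route(scores, bias, top_k):
                load[expert] += 1  # count how many tokens each expert received
        bias = update_bias(bias, load, step)
    # The returned load is the one observed in the final round.
    return bias, load
```

No extra loss term is added anywhere: the balancing acts only on which experts get selected, which is the point of the orchestra analogy. In a small batch where every token initially prefers the same expert, a few rounds of bias updates are enough to spread the tokens across experts.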
FP8 Mixed Precision Training: Using FP8 numbers significantly reduces memory consumption and speeds up calculations. Some critical parts of the model still use FP32 in training to guarantee precision, but FP8 brings another benefit: the models are smaller. Other models use techniques such as quantization or parameter pruning. Although OpenAI gives no data on GPT-4 in this respect, the assumption is that it works with BF16, which is more expensive in terms of memory. FP8 theoretically leads to less precise models, but complementary techniques such as fine-grained quantization reduce the negative impact of outlier values, which makes stable training possible.

Cross-Node All-to-All Communication: During training, information has to be exchanged constantly between all the nodes (computers) connected in the training data centers. That can become a bottleneck, but the new DeepSeek V3 techniques include efficient communication protocols, reduced data traffic and efficient synchronization to speed up training and, once again, cut the cost of the process.

Reinforcement learning and “distillation” as the keys

In addition to all these techniques, those responsible for DeepSeek V3 explain how they pre-trained it with 14.8 trillion tokens, a process followed by supervised fine-tuning (SFT) and several stages of reinforcement learning (RL). The SFT phase, which is mentioned in the DeepSeek V3 report, was completely omitted in the case of DeepSeek-R1.

Reinforcement learning, however, plays a leading role in the development of both models, especially R1. The technique is well known in the field of artificial intelligence, and it is like training a dog with rewards and punishments: the model learns to respond better by receiving rewards when it does well. Over time, the model learns to take actions that maximize long-term reward.
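The reward idea can be shown with a toy example. This is only an illustration of learning from rewards by trial and error, not DeepSeek's training procedure, and it optimizes the average reward of single actions rather than long sequences of them. The function name train_bandit, the reward probabilities and the exploration rate epsilon are assumptions made for this sketch.

```python
import random


def train_bandit(reward_probs, steps=5000, epsilon=0.1, seed=0):
    """Learn the average reward of each action by trial and error."""
    rng = random.Random(seed)
    values = [0.0] * len(reward_probs)  # running reward estimate per action
    counts = [0] * len(reward_probs)    # how often each action was tried
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: occasionally try a random action.
            action = rng.randrange(len(reward_probs))
        else:
            # Exploit: pick the action with the best estimate so far.
            action = max(range(len(values)), key=lambda a: values[a])
        # The environment hands out a reward of 1 with the action's probability.
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0
        counts[action] += 1
        # Incremental mean: move the estimate towards the observed reward.
        values[action] += (reward - values[action]) / counts[action]
    return values, counts
```

Calling train_bandit([0.2, 0.5, 0.8]) returns the learned estimates and the number of times each action was chosen. The learner ends up favouring the action that pays off most often, much like the dog that repeats whatever earns the prize.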
At DeepSeek, reinforcement learning is used to break complex problems down into smaller steps. The DeepSeek R1 technical report also describes how this model applies RL techniques directly to the base model, without the need for supervised training. That saves computing resources.

The so-called chain of thought, also mentioned in the technical report, comes into play here too. It refers to a language model's ability to show the intermediate steps of its reasoning. The model not only …
