DeepSeek's AI models are really good. The comparative benchmarks we published yesterday show it, putting them on par with ChatGPT, Claude, or Gemini. That has unleashed praise, but also suspicion. There are people who do not believe that training DeepSeek cost just $5.6 million, and now OpenAI is also accusing DeepSeek of something else.
DeepSeek, you are using our data without permission. OpenAI spokespeople have told the Financial Times that they have found evidence that DeepSeek used "distillation" techniques on OpenAI's models.
What is "distillation" in AI? Yesterday we talked about how DeepSeek's developers combined a long list of techniques to achieve such an efficient model. Reinforcement learning stands out among them, but it is also known that they used distilled models. In this technique, a smaller "student model" is taught to behave like a larger, more advanced "teacher model." The teacher model's outputs are used so that the small model ends up faster and more efficient, yet just as capable on specific tasks.
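To make the idea concrete, here is a minimal, hypothetical sketch of distillation in PyTorch. The models, data, and hyperparameters are invented for illustration; this is not DeepSeek's or OpenAI's actual code, just the classic recipe of training a student to imitate a teacher's softened output distribution.

```python
# Minimal knowledge-distillation sketch (hypothetical models and data).
import torch
import torch.nn as nn
import torch.nn.functional as F

# Invented tiny "teacher" and "student" classifiers over 10 classes.
teacher = nn.Sequential(nn.Linear(32, 128), nn.ReLU(), nn.Linear(128, 10))
student = nn.Sequential(nn.Linear(32, 16), nn.ReLU(), nn.Linear(16, 10))

optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
temperature = 2.0  # softens the teacher's distribution so the student gets richer signal

for step in range(100):
    x = torch.randn(64, 32)  # stand-in for real training inputs

    with torch.no_grad():
        teacher_logits = teacher(x)  # the teacher is only queried, never updated
    student_logits = student(x)

    # Classic distillation loss: KL divergence between softened distributions.
    loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The key point is that all the "knowledge" flows one way: the teacher is frozen and merely answers queries, while the much smaller student adjusts its weights to reproduce those answers.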
A use that is not allowed. Model distillation is a common practice in the industry, but OpenAI's terms of service prohibit using its models for that purpose. They specify that users may not "copy" any of its services or "use the output [of OpenAI's models] to develop models that compete with OpenAI."
OpenAI and Microsoft have already investigated this. According to Bloomberg, both companies analyzed accounts that were being used to harvest output from their chatbots and that apparently belonged to DeepSeek developers. Those accounts used OpenAI's API, but there were suspicions that they had violated the terms of service by taking advantage of that access to distill OpenAI's models.
Many do it. David Sacks, the AI lead on Donald Trump's team, flagged what was happening and said there was evidence that DeepSeek had used OpenAI data. Spokespeople for the company led by Sam Altman indicated that "we know that companies based in the People's Republic of China – and others – are constantly trying to distill the models of leading US AI companies."
The thief thinks everyone else is a thief too. The irony here is that OpenAI has had no scruples about hoovering up internet data to train its models, violating those platforms' terms of service in the process. Last year, for example, it emerged that it transcribed a million hours of YouTube videos to train GPT-4. Timnit Gebru, famous for her controversial dismissal from Google, commented on LinkedIn that OpenAI "must be the most insufferable company in the world." And she continued: "They can steal from the whole world and swallow up every possible resource. But no one can give them a taste of their own medicine, not even a little bit."
If it is on the internet, it can be used, right? Other companies do exactly the same and shield themselves behind the "fair use" argument. They collect any public content on the internet without asking users or platforms for permission. Not only that: it is suspected that in many cases these models are trained on copyrighted works, something that has led to numerous lawsuits.
Image | Xataka with Grok
In Xataka | The next phase of AI is not to see who invests more but who invests less

