Video AIs are learning to imitate the world, and everything points to an unprecedented "looting" of YouTube

A square, tourists, a waiter moving between tables, a bike passing in the background, a journalist on a set. Video AIs can now generate scenes like these in a flash. The results are striking, but they also raise a question that until recently was barely asked: where did all the images that allowed these models to learn to imitate the world come from? According to The Atlantic, part of the answer points to millions of videos pulled from platforms like YouTube without clear consent.

The euphoria over generative AI has moved so quickly that many questions have been left behind. In just two years we have gone from curious little experiments to models that produce videos almost indistinguishable from the real thing. And while the spotlight was on the demos, another issue was gaining weight: transparency. OpenAI, for example, has explained that Sora is trained on "publicly available" data, but has not detailed which data.

A massive training run that points to YouTube

The Atlantic piece gives a clear clue as to what was happening behind the scenes. We are talking about more than 15 million videos collected to train AI models, with a huge share coming from YouTube without formal authorization. Among the initiatives cited are datasets associated with several companies, designed to improve the performance of video generators. According to the outlet, this process was carried out without notifying the creators who originally published the content.

One of the most striking aspects of the discovery is the profile of the affected material. These were not just anonymous videos or home recordings, but news content and professional productions. The Atlantic found that thousands of pieces came from channels belonging to publications such as The New York Times, the BBC, The Guardian, The Washington Post and Al Jazeera. Taken together, we are talking about a huge volume of journalism that would have ended up feeding AI systems without any prior agreement with its owners.
Runway, one of the companies that has given the most impetus to generative video, features prominently in the reviewed datasets. According to the documents cited, its models would have learned from clips organized by type of scene and context: interviews, explainers, pieces with graphics, cooking shots, stock footage. The idea is clear: if an AI must reproduce human situations and audiovisual narratives, it needs real references that cover everything from gestures to editing rhythms.

Fragments of a video generated with the Runway tool

In addition to Runway, the investigation mentions datasets used in the laboratories of large technology platforms such as Meta and ByteDance in the research and development of their models. The dynamic was similar: huge volumes of videos collected from the internet and shared between research teams to improve audiovisual capabilities.

YouTube's official stance doesn't leave much room for interpretation. Its rules prohibit downloading videos to train models, and its CEO, Neal Mohan, has reiterated this in public. Creators' expectations, he stressed, are that their content will be used within the rules of the service. The appearance of millions of videos in AI databases has brought that legal framework to the fore and has intensified pressure on the platforms involved in developing generative models.

The media sector's reaction has followed two paths. On the one hand, companies like Vox Media and Prisa have closed agreements to license their content to artificial intelligence platforms, looking for a clear framework and economic compensation. On the other hand, some outlets have chosen to stand and fight: The New York Times has taken OpenAI and Microsoft to court over the unauthorized use of its materials, stressing that it will also protect the video content it distributes. The legal terrain remains unclear.
Current legislation was not designed for models that process millions of videos in parallel, and the courts are only beginning to draw the lines. For some experts, publishing openly is not equivalent to transferring training rights, while AI companies argue that indexing and using public material are part of technological progress. This tension, still unresolved, keeps media and developers in a constant balancing act.

What we have before us is the start of a conversation that goes far beyond technology. Training AI models on material available on the internet has been a widespread practice for years, and now comes the time to decide where the limits are. Companies promise agreements and transparency, the media ask for guarantees, and creators demand control. The next stage will be as political as it is technological: how artificial intelligence is fed will define who benefits from it.

Images | Xataka with Gemini 2.5

In Xataka | All the big AIs have ignored copyright laws. The amazing thing is that there are still no consequences

The greatest fear was that AI would take our jobs. The reality is that it is replacing those who are learning to work

In just a few weeks, thousands of Generation Z students will graduate from their degree programs hoping to land the first job that lets them start a professional career. However, the current outlook is increasingly complicated for those seeking their first opportunity. The difficulty of entering the labor market does not come only from the structural side of an economy shaken by trade uncertainty: AI is already beginning to take on tasks that until now were carried out by interns and junior employees, roles that traditionally served as the first step for recent graduates.

The ladder has broken. As Aneesh Raman, head of economic opportunity at LinkedIn, explained in an article for The New York Times, "the first thing that is breaking is the lowest rung of the career ladder," a trend that threatens to leave many young people without the chance to learn and grow in their first years of work. As Fortune predicted in 2024, AI is already doing their job, so companies see no reason to hire them.

Currently, AI acts mainly as a support tool that lightens the workload of senior employees by automating routine tasks. At big tech companies such as Amazon, Google and Microsoft, it is deployed to help developers with administrative work or with generating small code fragments. These are exactly the tasks that until now formed part of junior employees' learning process.

Generation Z is already noticing it. Chris Hyams, CEO of Indeed, said at Fortune's Workplace Innovation Summit: "The good news is that there is no job in which AI can perform all of the required skills.
This does not mean that it will not replace workers, but AI cannot replace a complete job." However, Hyams noted that "in approximately two thirds of all jobs, 50% or more of the required skills are things that current generative AI can do reasonably well, or very well." This means that, although AI does not eliminate every position, it does reduce the need for many of the basic tasks that used to train new employees.

Youth unemployment data in the US. The pace of AI adoption in business is faster in the United States than in other countries. According to The Atlantic, the New York Federal Reserve has warned of rising unemployment, standing at 5.8% among recent graduates and 6.2% among younger workers. In other words, something is happening at the entrance to the labor market. Although there is still no definitive evidence that AI is the sole cause of the weakening of this link in young people's careers, the trend is clear: opportunities to learn on the job from the bottom up are disappearing in the sectors where AI is being deployed fastest, as companies like Duolingo and Shopify adopt policies that reduce the hiring of junior staff for routine tasks whenever an AI might do them instead.

They ask for qualified employees, but won't train them. In parallel to this training wall imposed by automation, companies around the world complain of a growing shortage of qualified labor. According to recent European Union data collected by Euronews, the lack of experienced workers and advanced skills has become one of the main challenges for the economy. This creates a paradox: as long as young people cannot find opportunities to train in companies through their simpler processes, companies will fail to fill their vacancies with qualified personnel. If this trend consolidates, in a few years it will not have been AI that left millions of workers without employment.
In reality, it will have been the companies themselves that prevented the training of the qualified, experienced personnel needed to supervise the very AI they are developing.

In Xataka | Founders of small startups and of the big tech companies already have something in common: they are billionaires thanks to AI

In Xataka | "Humans will not be necessary for most things": Bill Gates does not believe that doctors and teachers have a future

Image | Unsplash (Mushvig Niftaliyev)

We have been turning whale bones into tools for a long time. Since before we even learned to hunt them

For centuries, whaling was a major industry in the coastal areas of the Bay of Biscay. Almost everything from the animal was put to use: the meat served as food, and the fat served as oil to feed the flames of lamps before electricity and petroleum. Whale bones have also been a valuable resource throughout history. Now we know they were during prehistory too.

Prehistoric tools. A group of researchers, including scientists from the Institute of Environmental Science and Technology of the Autonomous University of Barcelona, has discovered tools made of whale bone. The analysis yields an estimated age of between 19,000 and 20,000 years. The 83 tools were found at various sites distributed along the coast of the Bay of Biscay, including the Cantabrian coast and points in southern France. To these tools must be added another 90 bones found in the Cave of Santa Catalina, located in the Biscayan town of Lekeitio.

The bone remains would have belonged to specimens of at least five species, the team explains, including the sperm whale, the fin whale and the blue whale, which can still be found in the waters of the Bay, as well as the gray whale, which has since disappeared from that environment and now has a habitat restricted to areas of the northern Pacific and Arctic oceans.

Investigating what and when. To identify the species and date the tools, the team turned to mass spectrometry techniques and radiocarbon dating. This is how they traced the origin of the tools to the five species mentioned above.
This was also how the team determined that these were, in the words of Jean-Marc Pétillon's group, "some of the oldest known evidence of the human use of whale remains as tools."

"ZooMS is a very powerful technique for investigating the past diversity of marine mammals, especially when diagnostic morphometric elements are missing from bone remains and objects, something common in artifacts made of bone," explained Krista McGrath, co-author of the study, in a press release.

Chemical analysis. The study also involved a chemical analysis of the samples. Thanks to this, the team was able to glean data on the whales' feeding habits, which "differed slightly from those of their modern counterparts." This implies possible changes in the behavior of the cetaceans, or in the marine ecosystem itself. The details of the study were published in an article in the journal Nature Communications.

20,000 years hunting whales? The conclusion that human beings have been hunting whales for 20 millennia is tempting, but the team responsible for the study considers it "extremely unlikely." The most likely hypothesis is that Pleistocene hunters took advantage of whales stranded on the coast to obtain their bones and manufacture tools from them. "It is extremely unlikely that these species would have been accessible to European Pleistocene hunter-gatherers in any way other than through passive acquisition methods, such as the opportunistic scavenging of stranded whales or of carcasses washed ashore," says the article. "There is no evidence (…) that European Pleistocene hunter-gatherers had the technologies necessary to hunt these species, such as seafaring (…)."

Sea level change. Studying coastal life in glacial ages is complicated, since the present coastline lies relatively far from the typical coastline of the last glaciation, although the extent of land flooded after the end of the last glacial era differs from area to area.
Within the Bay of Biscay, for example, a larger flooded area can be seen along the French coast, and therefore a greater retreat of the coastline, in contrast to the Cantabrian coast. In any case, this coastal retreat means the loss of valuable coastal sites now submerged beneath Atlantic waters, sites that could hide countless data that might help us better understand the ways of life of coastal peoples from millennia back in time.

In Xataka | The history of the last whale hunted in Spain, on October 21, 1985

Image | ICTA-UAB/Alexandre Lefebvre

Thermal expansion has been a headache for centuries. Now we are learning how to dodge it

Heat tends to make materials expand and gain volume, a volume that then shrinks when the temperature drops. This is a problem for architects and engineers, since the effect is very noticeable in metals such as steel. What if we could avoid it?

A new alloy. A group of scientists has created a new alloy that shows barely any thermal expansion across a wide temperature range. The key to the development lies in Invar, an alloy with similar properties whose behavior has recently been deciphered.

100 years of mystery. Invar is an alloy composed of iron, nickel and other elements with an extremely low coefficient of thermal expansion, that is, an alloy that barely dilates as temperature increases. Over a range spanning more than 400 K (that is, more than 400 degrees Celsius), Invar expands only 0.0001% of its length for each degree Celsius (or, equivalently, for each kelvin). The alloy was created at the end of the 19th century by Charles Édouard Guillaume, who would receive the Nobel Prize in Physics in 1920 "for his discovery of anomalies in nickel steel alloys." It has taken us a century since the prize was awarded to begin to understand the science underlying this low thermal expansion.

Thermal expansion. The phenomenon of thermal expansion is an old acquaintance. As those responsible for the new work explain, it is the result of the very motion of atoms (remember that temperature is nothing other than that). When atoms heat up, they move more, and that means they need more space, so the material expands. This phenomenon, they go on to point out, is inevitable, but understanding it in detail opens the door to creating new materials that somehow balance out the effect. To study it, the team turned to computer simulations that made it possible to analyze the behavior of magnetic materials at tiny scales. "This allowed us to better understand why Invar barely expands," said Sergii Khmelevskyi, co-author of the new study.
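To put numbers on the figures quoted above: linear thermal expansion follows the standard formula ΔL = α·L·ΔT. Here is a quick sketch comparing Invar against ordinary steel, with coefficient values assumed from standard engineering tables rather than taken from the study:

```python
# Linear thermal expansion: delta_L = alpha * L * delta_T.
# Coefficient values below are assumed from standard tables, not from the study.
ALPHA_PER_K = {
    "invar": 1.2e-6,   # ~0.0001% per degree, as the article notes
    "steel": 1.2e-5,   # ordinary carbon steel, roughly 10x Invar
}

def expansion_mm(material: str, length_m: float, delta_t_k: float) -> float:
    """Return the length change in millimetres for a temperature rise of delta_t_k kelvin."""
    return ALPHA_PER_K[material] * length_m * delta_t_k * 1000.0

# A 10 m beam heated by 40 degrees:
for material in ("invar", "steel"):
    print(f"{material}: {expansion_mm(material, 10.0, 40.0):.2f} mm")
```

The order-of-magnitude gap is the whole point: over the same temperature swing, the steel beam grows millimetres while the Invar one grows a fraction of a millimetre.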
The effect is due to changes in the state of the electrons that occur as the temperature increases. These changes counteract "almost exactly" the thermal expansion of the material, adds Khmelevskyi.

From theory to practice. Knowing the theory opens the way to creating new alloys capable of overcoming thermal expansion. That is precisely what those responsible for the study did: they put their findings into practice. The result is what they have called a pyrochlore magnet. The new alloy combines four elements: zirconium, niobium, iron and cobalt. "It is a material with an extremely low thermal expansion coefficient over an unprecedented temperature range," says Yili Cao, co-author of the development. "The effect comes about because certain electrons change their state as the temperature increases. The magnetic order of the material decreases, which makes the material contract," explains Cao. This effect is precisely analogous to the one seen in Invar.

The secret is irregularity. The team explains that the strength of the effect is also due to the fact that the pyrochlore magnet does not have a perfect lattice structure, that is, one with the atoms arranged in a regular, repeating pattern, but a more heterogeneous one. Some areas contain more or less cobalt, which makes the material expand and contract in almost identical proportion. The details of the development were published in an article in the journal National Science Review.

In Xataka | Cheaper, durable and ecological: a new material with the help of ruthenium wants to change the rules of green hydrogen

Image | TU Wien

DeepSeek does the same as OpenAI's most advanced models with far fewer resources. The key: reinforcement learning

The whole world is wondering how DeepSeek's AI models have overnight become the great protagonists of the moment in the field of artificial intelligence. The answer is relatively simple: these models have managed to demonstrate that you can do more with much less. Both DeepSeek V3 and DeepSeek-R1 are comparable to OpenAI's GPT-4 and o1 respectively, but their training is estimated to have been far cheaper, and their inference certainly is: prices for the DeepSeek API are up to 35 times lower than OpenAI's. That makes one wonder how this is possible. The answer is clear, because we have the technical reports for these models at our disposal, and studying them has allowed us to identify the techniques this Chinese R&D lab used to develop models that are so efficient and capable.

Many techniques, a single objective: efficiency

Several differences make DeepSeek's new model especially efficient. Its creators explain them in detail in the publicly available technical report. Here are the most relevant:

DeepSeekMoE ("Mixture of Experts"): in models such as GPT-3.5, the entire model was activated in both training and inference (when we use it). However, not all of a model's components are needed for every request. The MoE technique, already introduced with DeepSeek V2, divides the model into multiple "experts" and activates only those needed for the request at hand. GPT-4 is reportedly already a MoE model, but DeepSeekMoE went even further, splitting the model into even more specialized experts, plus a few more generalist experts that can contribute value to certain requests. Managing all those specialized and generalist experts benefits not only inference but also the training phase, making it more efficient.
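The routing idea behind MoE can be sketched in a few lines of plain Python. This is an illustrative toy, not DeepSeek's implementation: a router scores each expert for a given input and only the top-k experts actually run, so most of the network stays idle for each token. The toy "experts" here are simple functions standing in for trained sub-networks:

```python
import math

# Illustrative top-k mixture-of-experts routing (a toy, not DeepSeek's code).

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def route(gate_logits, k=2):
    """Pick the k best-scoring experts and renormalize their weights."""
    probs = softmax(gate_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in top)
    return {i: probs[i] / total for i in top}

# Hypothetical "experts": each is just a function of the input in this sketch.
experts = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3, lambda x: x / 2]

def moe_forward(x, gate_logits, k=2):
    """Only the selected experts run; the rest stay idle for this token."""
    weights = route(gate_logits, k)
    return sum(w * experts[i](x) for i, w in weights.items())

# The router's logits strongly prefer experts 1 and 3 for this input,
# so only those two functions are evaluated.
y = moe_forward(10.0, gate_logits=[0.1, 2.0, -1.0, 1.5], k=2)
```

The efficiency win is exactly what the article describes: with, say, 2 of 4 experts active, roughly half the expert parameters are touched per token, and real MoE models push that ratio much further.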
The MoE approach is conceptually related to so-called test-time scaling, which also adjusts how much of a model's capacity is used during inference.

DeepSeekMLA (Multi-head Latent Attention): this is another substantial improvement, even more important than the previous one and also introduced with DeepSeek V2, affecting how memory is managed in these models. Normally it is necessary to load both the model and the entire context window, the mechanism that lets us write prompts and include long texts, for example. Context windows are especially expensive because each token requires both a key and its corresponding value. The improvement this technique introduced was to compress that store of keys and values, dramatically reducing memory use during inference.

Auxiliary-loss-free load balancing: if we imagine a model as a great orchestra, each musician is an "expert" within the model. To play a complex piece, not all musicians are needed all the time. Traditionally, so-called "auxiliary losses" were used to make sure all the musicians played enough, but these losses could interfere with the interpretation of the piece (the model's training) and degrade overall performance. With DeepSeek V3, the model balances the work of each expert dynamically, which makes training simpler, more direct and more efficient by eliminating auxiliary losses. Removing that interference also lets the model learn better, with fewer resources and better results.

Multi-token prediction training objective: often, predicting the next word depends on several previous words or on context. With this technique, instead of predicting only the next word, the model learns to predict several words at once. That produces more natural, more understandable and less ambiguous text, but it also accelerates training by reducing the number of steps needed to generate a complete text sequence.
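The memory pressure that multi-head latent attention attacks, described a few paragraphs above, is easy to put numbers on. A back-of-the-envelope sketch follows; every dimension here is a hypothetical, illustrative value, not DeepSeek's actual configuration:

```python
# Back-of-the-envelope KV-cache sizing for a transformer. All dimensions
# below are hypothetical, illustrative values, not DeepSeek's real config.

def kv_cache_bytes(layers, tokens, heads, head_dim, bytes_per_value=2):
    # Each token stores one key and one value vector per head, per layer.
    return layers * tokens * heads * head_dim * 2 * bytes_per_value

def latent_cache_bytes(layers, tokens, latent_dim, bytes_per_value=2):
    # MLA-style idea: cache one compressed latent per token per layer
    # instead of full per-head keys and values.
    return layers * tokens * latent_dim * bytes_per_value

full = kv_cache_bytes(layers=60, tokens=128_000, heads=128, head_dim=128)
latent = latent_cache_bytes(layers=60, tokens=128_000, latent_dim=512)
print(f"full KV cache: {full / 2**30:.1f} GiB")   # hundreds of GiB
print(f"latent cache:  {latent / 2**30:.1f} GiB") # a few GiB
```

With these made-up dimensions, the full cache for a 128k-token context runs to hundreds of gigabytes while the compressed latent cache is 64 times smaller, which is why compressing keys and values matters so much for inference cost.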
FP8 mixed-precision training: using FP8 numbers significantly reduces memory consumption and speeds up calculations. Some critical parts of the model still train in FP32 to guarantee precision, but FP8 brings another benefit: the size of the model itself shrinks. Other models rely on techniques such as quantization or parameter pruning. Although OpenAI gives no data on GPT-4 in this respect, the assumption is that it works with BF16, which is more expensive in memory terms. And although FP8 theoretically leads to less precise models, complementary techniques such as fine-grained quantization reduce the negative impact of outlier values, making stable training possible.

Cross-node all-to-all communication: during training, information must be constantly exchanged between all the nodes (computers) connected in the training data centers. That can become a bottleneck, but DeepSeek V3's new techniques include efficient communication protocols, reduced data traffic and efficient synchronization to accelerate training and, once again, reduce its cost.

Reinforcement learning and "distillation" as keys

Beyond all these techniques, those responsible for DeepSeek V3 explain that they pretrained it on 14.8 trillion tokens, a process followed by supervised fine-tuning (SFT) and several stages of reinforcement learning (RL). The SFT phase, which is mentioned in the DeepSeek V3 report, was omitted entirely in the case of DeepSeek-R1. Reinforcement learning, however, is an absolute protagonist in the development of both models, especially R1. The technique is well known in the field of artificial intelligence: it is as if we were training a dog with rewards and punishments. The model learns to respond better because it is given rewards when it does well; over time, it learns to take the actions that maximize its long-term reward.
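That reward-and-punishment loop can be sketched with the simplest possible case, a multi-armed bandit. This is a deliberately tiny stand-in for the RL applied to language models, and the reward probabilities below are made up for illustration:

```python
import random

# Minimal reward-driven learning loop: an epsilon-greedy bandit.
# A toy stand-in for RL on language models; the probabilities are made up.

random.seed(0)
TRUE_REWARD_PROB = [0.2, 0.5, 0.8]  # hidden quality of each "action"

estimates = [0.0] * 3  # the agent's current guess of each action's value
counts = [0] * 3

def pick_action(epsilon=0.1):
    if random.random() < epsilon:                     # explore occasionally
        return random.randrange(3)
    return max(range(3), key=lambda a: estimates[a])  # exploit the best so far

for _ in range(5_000):
    a = pick_action()
    reward = 1.0 if random.random() < TRUE_REWARD_PROB[a] else 0.0
    counts[a] += 1
    # Incremental average: nudge the estimate toward the observed reward.
    estimates[a] += (reward - estimates[a]) / counts[a]

best = max(range(3), key=lambda a: estimates[a])
print(f"learned best action: {best}")
```

The agent is never told which action is best; it discovers the high-reward action purely from the reward signal, which is the same principle, at vastly smaller scale, as rewarding a model's good answers during RL.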
In DeepSeek, reinforcement learning is used to break complex problems down into smaller steps. The DeepSeek-R1 technical report also describes how this model applies RL techniques directly to the base model, without the need for supervised training, which saves computing resources. Also at play here is the so-called chain of thought (chain-of-thought), likewise mentioned in the technical report. This refers to a language model's ability to show the intermediate steps of its reasoning. The model not only …

Octopus tentacles have their own “brain.” We are now learning the implications

Octopuses are invertebrates, but the absence of a centralized nervous system like that of birds or mammals does not make their brains any less interesting. Brains, plural, since the neuronal systems of each of their limbs have a degree of independence that leads many to consider them brains in their own right.

A nervous system that is anything but central. Now, a group of researchers has studied the nervous systems of these cephalopods to better understand how these nine neural organs operate together and to what extent they maintain their independence. What they observed is that each of these brains has the ability to operate individually. The team responsible for the new study believes that it is thanks to the unique segmentation of the octopus nervous system that these animals achieve such skill in handling the extremely flexible limbs they use to move, feed, sense their environment and even copulate.

"If you are going to have a nervous system that is going to control such dynamic movement, that is a good way to organize it," explained Clifton Ragsdale, co-author of the study, in a press release. "We think it's a feature that evolved specifically in soft-bodied cephalopods with suction cups to carry out these worm-like movements."

Studying segmentation. The new study focused on the segmentation of this curious neuronal system, analyzing the distribution and function of the neurons in these tentacles, taking as a reference an octopus of the species Octopus bimaculatus. Together, these neurons outnumber those located in the animal's "central brain," which is responsible for coordinating actions that require the use of several arms. These limb neurons are concentrated, the team explains, in an axial nerve cord (ANC) that "snakes" along the tentacle, connected to each of the animal's suction cups.

Neural columns.
Analysis of the ANC showed that the neurons in the octopus's limbs were grouped into "columns" that in turn formed segments, which the team compared to corrugated pipes. The segments were separated by gaps called septa, through which nerves and blood vessels make their way to the muscles of the limb. "From a modeling perspective, the best way to organize a control system for this long and flexible arm would be to divide it into segments," added Cassady Olson, co-author of the study. "There must be some kind of communication between the segments, which you can imagine helps smooth out their movements." Details of the work can be found in an article published in the journal Nature Communications.

Much to investigate. Octopus tentacles are very versatile limbs that let the animal navigate the seabed, but their suction cups also allow these octopods to perceive the world around them, hunt and feed on their prey. Knowing the details of how such complex limbs work will still require further research.

In Xataka | Octopuses are not aliens, and scientists have had to come out and explain why

Image | Theasereje, CC BY-SA 4.0
