US agents say Meta's reports are failing on a key point

Social networks have used automated systems for years to try to detect some of the most serious crimes circulating on the internet. Among them is child sexual exploitation, a phenomenon that forces platforms, regulators and security forces to monitor enormous volumes of content every day. The promise of these tools is clear: identify potential cases sooner and make agents' work easier. However, some specialized teams in the United States maintain that the volume of notices they receive from Meta's platforms has skyrocketed, and that a significant portion of them provide no information useful for taking action.

A clash between scale and utility. In a lawsuit underway in New Mexico, prosecutors maintain that Meta did not adequately disclose what it knew about the risks minors face on its platforms and that it violated state consumer protection laws. According to the Associated Press, the complaint also argues that the company presented the safety of its services in a way that did not correspond to the risks faced by children and adolescents. The case is part of a broader wave of lawsuits filed in the United States against large technology companies over the effects their services may have on minors.

Meta rejects that interpretation. In his address to the jury, company lawyer Kevin Huff argued that Meta has disclosed the risks associated with the use of its services and has introduced various tools to detect and remove harmful content. According to the Associated Press, Huff insisted that the central question of the case is not whether problematic content exists on social networks, but whether the company hid relevant information from users.

Researchers on the front line. Those who have provided figures and concrete examples of this problem are agents who work directly on investigations of child exploitation on the internet.
In the United States, those tasks fall largely to the network of units known as Internet Crimes Against Children (ICAC), a program that brings together police forces at different levels and coordinates with the Department of Justice to investigate and prosecute crimes committed against minors in digital environments. Its agents receive notices about possible cases from different sources, including the technology platforms themselves. During the trial, some of these agents have described how the volume of notices from Meta's platforms has grown. Benjamin Zwiebel, an ICAC special agent in New Mexico, explained in court that many of the notices they receive are of little use in advancing an investigation. "We get a lot of tips from Meta that are just garbage," he declared, according to The Guardian. His words reflect a broader concern within these units: the volume of alerts has skyrocketed, but not all of them contain the information necessary to identify a suspect or initiate police action.

Poor quality. In some cases, reports sent by the platforms include data that does not describe criminal conduct. In others, they do point to a possible crime but arrive without elements essential to continuing the investigation, such as images, videos or fragments of conversations that would allow those responsible to be identified. Without this material, agents have few tools to advance the case or request new proceedings. Some agents have also noted that a portion of these notices arrive with incomplete or partially removed information.

The mass reporting machinery. Behind this increase in notices are several factors that help explain why the volume of reports sent to the authorities has skyrocketed.
In the United States, technology companies are required by law to report any child sexual abuse material they detect on their services to the National Center for Missing & Exploited Children (NCMEC), an organization that acts as a national clearinghouse for these notices and subsequently distributes them to the corresponding police forces. Agents cited by The Guardian also point to recent legal changes, such as the REPORT Act, which came into force in November 2024, as a possible factor behind the increase in the number of notices sent to avoid non-compliance.

Meta says it is doing the opposite. The company rejects the idea that its systems are making the authorities' work harder and maintains that, on the contrary, it has collaborated for years with security forces to detect and prosecute this type of crime. A Meta spokesperson stated that the United States Department of Justice has recognized on several occasions the speed with which the company responds to requests from authorities, and that NCMEC has positively evaluated its reporting system. According to the company, in 2024 it received more than 9,000 emergency requests from US authorities and resolved them in an average of 67 minutes, a process that, it claims, is accelerated even further in cases related to child safety or suicide risk. Meta also notes that it reports to NCMEC any material that may be linked to child sexual exploitation and that it works with that organization to help prioritize the notices, including by labeling those it considers most urgent.

A real problem. Regardless of what the jury in New Mexico determines, the case reflects a tension that goes beyond a single company or a single state. Digital platforms operate at a global scale and use automated systems to detect illicit content in volumes that would be impossible to review manually.
However, the experience described by some agents shows that increasing the number of tips does not always translate into more effective investigations.

Images | Dima Solomin | Robin Worrall

In Xataka | Dario Amodei founded Anthropic because OpenAI didn't take the risks of AI seriously. Now it is going to give in to those risks

AI agents have indeed changed work and the economy forever. But for now only in one sector: programming

AI agents are beginning to demonstrate their capabilities, but the only area in which they are doing so is programming. An Anthropic report reveals that software engineering is where half of the activity of AI agents is currently concentrated, and that proves two things. First, that AI can effectively enhance work. Second, that there is a huge opportunity across hundreds of verticals where AI has barely landed.

What has happened. If there is one sector that has embraced AI and AI agents, it is programming. Platforms like Cursor or Windsurf at first, and Claude Code, OpenAI Codex or Antigravity today, have made it possible for all kinds of people, whether they know how to program or not, to turn their projects into reality in a really simple way. It's a clear case of how AI can contribute to a field, but there's a problem: it's practically the only case in which it has actually done so.

[Figure: Distribution of requests to AI tools by segment. Software engineering accounts for almost 50% of those calls or requests, at least on the Claude platform. Source: Anthropic.]

Verticals with a lot of headroom. As can be seen in that graph, the presence of AI agents is very small or practically non-existent in a large number of verticals where there is clearly a notable opportunity to take advantage of these tools. The automation of office tasks is the second biggest segment, with 9.1% of the function calls to Anthropic's AI model in this report. Below it we find segments such as marketing, sales, finance, business analysis and scientific research.

And others that are ignoring AI. There are quite a few sectors in which AI agents seem to be barely present. The travel, legal, medical, e-commerce and education segments seem perfect for starting to take advantage of these tools, but for the moment that is not the case and their presence is very, very small in all of them.

[Figure: Claude Code can work for longer and longer, double what it was three months ago, in fact. Source: Anthropic.]
Models can now work autonomously for a long time. Models used to be limited by how long they could operate autonomously, "chaining" actions and self-assessing progress in order to keep going. That is less true now. Claude Code, for example, has almost doubled the length of its longest sessions in just three months: from 25 minutes in October 2025 to 45 minutes in January 2026.

And they need less human intervention. Another revealing piece of data from the study is that the evolution of these agents not only means they can operate autonomously for longer periods, but also that they require fewer human interventions. The situations in which an agent "needs human help" to continue are becoming rarer. In August 2025 the average was 5.4 human interventions per session; by December that average had dropped to 3.3.

We trust AI more and more. Anthropic has also noticed a distinctive behavior among users: they increasingly trust AI agents. In programming, novices approve each new step before it is executed, but veterans delegate and intervene only when something goes wrong: they have gone from pre-approving everything to exercising active, ongoing monitoring. As Anthropic puts it: "Users develop confidence as they work with the model, and change their monitoring strategy based on that growing confidence."

From programming to other fields. What has happened with programming could happen in other scenarios. The challenge is to build AI agents that adapt to each segment using the specific data of that vertical. If an AI is to help in the legal segment, it must be specifically trained for that segment. When AI was trained on thousands of code repositories on GitHub, it learned and improved.
Well, the same can be applied to other verticals, although the challenge is certainly considerable, because programming was a perfect segment for the application of AI: it is very deterministic. Things either work or they don't, and either way, execution logs allow you to fine-tune that operation.

The new unicorns await. As entrepreneur Garry Tan points out in his newsletter, over the last two decades SaaS platforms have managed to capture 40% of venture capital investment, and that industry has more than 170 unicorns. "The thesis is simple," Tan concludes: "all of those unicorns have an equivalent in the form of vertical AI waiting."

Promises and realities. The AI agent segment therefore promises many changes in a multitude of segments, but the reality is that today the practical success (there is no economic success at the moment) of AI is limited to the world of programming. Will we be able to transfer it to other segments? The opportunity is there, but it is one thing to say it and quite another to do it... even with AI.

Image | Joshua Reddekopp

In Xataka | Every time Facebook had a competitor, it bought it: it is exactly what OpenAI is doing

The best fast, easy-to-use AI agents that do tasks for you, with no complications or long installations

Let's go over the best fast and easy-to-use AI agents, with no complicated installations or configurations. These AI agents are less complete and powerful than the more advanced ones, but they let you explore how artificial intelligence can do tasks for you. We are going to keep the list small and stick to the best alternatives. Many are quite popular, others are less well known, and we even close with an open source alternative for privacy lovers.

Claude Cowork. Claude Cowork is possibly the best and simplest tool for testing the benefits of an AI agent in a controlled way. It is a paid feature that you can use within Claude's desktop application; pricing starts at 15 euros per month. Claude Cowork allows Claude's AI to manage files and use applications on your computer. You tell it what you want, and Claude will find the best way to do it. Also, if you install the Claude in Chrome extension in your browser, Cowork will be able to do things for you in the browser as well.

Perplexity Comet. Comet is the artificial intelligence browser from Perplexity, a platform that started as an AI-based search engine and is now much more. It includes a chatbot that lets you use various artificial intelligence models, such as Gemini, GPT or Claude. The Comet browser's peculiarity is that it can use AI to do tasks for you, such as browsing for you, interacting with websites, automating tasks, searching for and filtering information, managing workflows and other chores such as comparing prices across multiple pages.

Manus on Telegram. Manus is an autonomous AI agent: you give it a high-level objective and it works on its own to achieve it. Tasks are asynchronous, so you can ask it to do something, turn off the computer, and receive a notification when the work is completed.
Manus can also be used in Telegram chats as a bot. With this, you can use Manus directly from the messaging app without entering its official website or application, and then access the results of the AI's research, web development, design, or whatever you have asked for.

ChatGPT Agent. ChatGPT also has an agent mode in its application. With it, ChatGPT can interact directly with web pages and act on your behalf to book appointments, create presentations and perform other complex tasks. Of course, to use it you will need a paid subscription to the AI.

Genspark. This platform is a kind of all-in-one AI workspace. It is not exactly a chatbot, but it acts in line with the agent concept: planning tasks, choosing the right tools for them, and chaining the steps autonomously. With this tool you can create applications, documents, designs, images, music, spreadsheets and more. It has a free plan with limited access, although you will have to pay to access everything. It also has more than 80 tools and eight language models of different sizes, each for a different task.

AgentGPT. This was one of the first services to make AI agents accessible from the browser without having to install anything. It works similarly to the previous ones: you write what you want in natural language, the agent divides it into subtasks, and then executes them autonomously.

Kuse Cowork. Kuse is an open source alternative for using an agent capable of helping you perform tasks on your computer. It can generate documents and presentations, transform doc files and PDFs, create mind maps, interact with YouTube videos and more. It is therefore an open alternative to Claude Cowork, where you can decide which AI models to use, attaching them via their API or even installing them directly on your computer.
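Several of these tools follow the same basic pattern AgentGPT popularized: take a natural-language goal, split it into subtasks, and run them one by one. A minimal, illustrative sketch of that loop follows; the planner and executor are hard-coded stubs standing in for real LLM and tool calls, and every function name here is our own invention, not any product's API.

```python
# Toy sketch of the plan-then-execute agent loop (not a real product's API).

def plan(goal: str) -> list[str]:
    """Stub planner: a real agent would ask an LLM to split the goal."""
    return [f"research: {goal}", f"draft: {goal}", f"review: {goal}"]

def execute(subtask: str) -> str:
    """Stub executor: a real agent would call tools (browser, files, APIs)."""
    return f"done({subtask})"

def run_agent(goal: str) -> list[str]:
    results = []
    for subtask in plan(goal):            # 1. divide the goal into subtasks
        results.append(execute(subtask))  # 2. execute each one autonomously
    return results                        # 3. report back to the user

print(run_agent("compare laptop prices"))
```

The real services add the hard parts this stub skips: deciding when a subtask failed, re-planning, and knowing when to stop.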
In Xataka Basics | How to create a Telegram bot that sends you a Gemini-made summary of each email you receive in Gmail and other email services

TIA agents are better ambassadors for the CSIC than we suspected

If we think about Mortadelo and Filemón, we also immediately think of all the outrages that TIA agents have to suffer because of Professor Bacterio's inventions, the quintessentially Spanish take on the iconic mad scientist, a foundational figure of the science fiction imagination. But there is more: a traveling exhibition traces the history of science over the last half century through Ibáñez's creations.

What does it consist of? The Spanish National Research Council (CSIC) has premiered the exhibition 'The science of Mortadelo and Filemón', which will remain open until February 15 before beginning its tour of various Spanish cities. The exhibition brings together 39 covers published between 1975 and 2018, organized into five thematic blocks that examine everything from Bacterio's chaotic inventions to climate crises and epidemics. Pura Fernández, vice president of Scientific Culture at the CSIC, highlights in 'El País' that Ibáñez turned research into something everyday through humor.

The sections. The exhibition structures its 39 covers into five thematic blocks that document the evolution of Spanish scientific thought and link to CSIC research through QR codes for visitors:

'A world in motion under the magnifying glass of science' examines natural phenomena: from glacial retreat to epidemiological crises, including agricultural innovations.

'Technological innovations incorporated by the TIA' satirizes inventions that generate more chaos than solutions, questioning whether technology responds to real needs or commercial impulses.

Professor Bacterio stars in his own section as the archetype of the researcher isolated from the world: in 'Bacterio's laboratory, successes and accidents', his failed experiments raise dilemmas about ethics and safety in laboratories.

'Science in the social mirror' addresses information manipulation, pseudoscience and responsible communication.
'Emergency science for troubled times' talks about climate change, air pollution, invasive species such as the tiger mosquito, and Saharan dust intrusions.

How it works. Francisco Ibáñez built a visual archive of Spanish scientific development over six decades. What began in 1958 as detective adventures evolved into a satirical chronicle of Spain, one that included technological modernization. Starting in the seventies, with Spain in full transformation, his covers captured real milestones: the takeoff of the space race in 'El cacao espacial', genetic engineering in 'The people copying machine' and the phenomenon of drones in 'Drones matones', up to the climate alerts of the 21st century. His method was far from the anticipatory rigor of Franco-Belgian comic icons such as Hergé (who consulted the zoologist Bernard Heuvelmans and the astronautics expert Alexandre Ananoff for the Tintin album 'Destination Moon') or the historical accuracy of Goscinny in Asterix. His territory was immediate parody: he transformed scientific headlines into visual slapstick, turning Bacterio's laboratory into a distorting mirror of contemporary research.

The CSIC and pop culture. The public body has relied for years on Spanish graphic humor to democratize knowledge. Fernando del Blanco, head of the library of the CSIC's Research and Development Center, inaugurated 'Science according to Forges' in 2019, bringing together 66 cartoons by the artist published in 'El País' between 1995 and 2018. That exhibition shared a methodology with this Mortadelo one: transforming recognizable cultural figures into bridges to complex scientific concepts. Humor makes it possible to address everything from the Higgs boson to budget cuts in science.

Science versus parody. As Pura Fernández comments in the aforementioned 'El País' article, Mortadelo and Filemón manage to discredit practices without delegitimizing the need for knowledge.
Bacterio embodies a poor application of science: isolation, lack of peer review, continuous risks... However, his inventions address real phenomena. In this way, she emphasizes, the public understands the reading Ibáñez proposes: Bacterio satirizes malpractice, not science itself.

In Xataka | When Ibáñez lost the rights to Mortadelo in 1985, he created a new magazine where they would have another name: 'Yo y yo'

AI companies promised we would be happy with their autonomous agents, until they ran into Amazon

AI agents promise to perform complex tasks for us autonomously, such as booking trips or doing the shopping. Although it is improving, agentic AI is still quite green, but it has just run into an obstacle we had not counted on, one that could change everything: some companies do not want AI agents roaming their stores. That is what has just happened between Amazon and Perplexity.

What has happened. Bloomberg tells the story. Amazon is suing Perplexity to stop the agent built into its Comet browser from purchasing items on Amazon. According to Amazon, Perplexity has committed computer fraud by allowing its agent to browse and make purchases as if it were a real person, which violates its terms of service on transparency. Amazon also claims that the use of automated agents can negatively affect the shopping experience on its platform.

Why it matters. The case could set limits for autonomous AI agents in real-world tasks that require using third-party services, such as Amazon in this case. If stores or travel platforms close the door to AI agents, the promise of autonomy is compromised. On the other hand, leaving all doors open could reshape e-commerce. Something similar has happened before, as with bots buying tickets to shows.

Bullies. Perplexity has responded with a post on its blog describing the move as "corporate bullying" and calling it "a threat to all Internet users". The company also highlights that Comet users love the agentic AI features and that Amazon should too, because they translate into more purchases and happy customers. For Perplexity, an AI agent should have the same rights and responsibilities as a real human user, since the agent is acting on the user's behalf. "It's not Amazon's job to oversee that," Aravind Srinivas, CEO of Perplexity, said in an interview.

Agents on Amazon.
Amazon already has its own assistant, Rufus, and is developing its own agents, so there are more reasons behind this move against Perplexity. It is not about protecting the experience, or at least not only about that: Perplexity is a direct competitor.

Perplexity champions choice. "I don't think it's customer-centric to force people to only use their assistant, which may not even be the best shopping assistant," Srinivas said.

AI ecosystems. The dispute between Amazon and Perplexity is the first example that the AI war is also a war of ecosystems. It presents a scenario in which service providers decide whether an AI agent can enter their stores or travel platforms, or whether they prefer to develop their own and push users to use it. The truth is that Amazon had already blocked the Perplexity agent a few months ago, but the company released an update that circumvented the block. We'll see how it all turns out.

Image | Pxhere

In Xataka | CAPTCHAs had become an excellent tool for fighting bots. Until ChatGPT Agent arrived

The new trend in AI is “AI agents.” The only problem is that almost no one is clear about what they are.

It is not the first time a word has become fashionable in the technology sector. It has happened with IoT, Big Data, Blockchain and even 5G. In English they call it a buzzword: a term repeated over and over until it almost loses its meaning. It has happened with AI and, now that we have gotten past that first stage, it was time to give it a surname. The chosen one is agentic AI, and suddenly everything is agentic AI. I experienced it a couple of weeks ago at the Qualcomm Snapdragon Summit. During the various conferences, the most repeated words were "agents" and "agentic". The problem is that they didn't show any real products that actually fit the definition. They are not alone: there is a whole wave of companies already calling literally anything with minimal automation agentic AI.

Agents all the time, everywhere. Agentic AI was going to be a revolution in 2025, but reality ended up deflating the sector gurus' hype. By this I don't mean that everything is a hoax: AI agents are very real and they are already here. We can try them if we have the ChatGPT Plus plan. At the development level, Anthropic lets you create agents programmatically with Claude, and Google does the same with Gemini. Other platforms like Salesforce offer their own custom AI agents for specific sectors such as the public sector or industry. They are improving a lot, but the reality is that AI agents are still very green, as enough tests have demonstrated. Being cautious and waiting for the technology to develop does not suit many companies.
Returning to the case of Qualcomm: in the "The Ecosystem of You" conference, its CEO Cristiano Amon painted us a future in which "the agent" does everything for us, absolutely everything: "The agent will understand our world and will be helping us, anticipating every need." The problem is that everything he showed was simply a demo. There is no real product, not even one in development; everything is part of a dream, one in which AI agents turn up even in the soup.

What is agentic AI. It is also known as agentive AI, agential AI or simply "AI agents". Google defines it as "an advanced form of artificial intelligence focused on autonomous decision-making and action." For NVIDIA it is an AI that "uses sophisticated reasoning and iterative planning to autonomously solve complex, multi-step problems." For Amazon it is "an autonomous system that can act independently to achieve predetermined goals", adding that, unlike generative AI, agentic AI "is proactive and can perform complex tasks without constant human supervision." It seems pretty clear: generative AI responds to one request at a time, while agentic AI can pursue more complex goals, making decisions autonomously. An AI agent must be able to collect information, use tools and solve problems to achieve the objective we have given it.

They call it agentic AI because "AI with slight automation" doesn't sound as good. In another of the Snapdragon Summit conferences they showed us several products that were real. One of them is Page.ai, an AI assistant that runs locally on mobile. During the presentation, the presenter kept repeating that the app had agentic functions, when the most they showed was the AI organizing a barbecue: it created a calendar event and then invited a friend. What caught my attention is that the app's creator did not use the word; it was the presenter who did.
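The definitions quoted above share a common skeleton: collect information, decide on an action, use a tool, and repeat until the goal is met. Here is a toy sketch of that loop, with a deterministic stub where a real agentic system would query a model; all names are illustrative, not a real API.

```python
# Toy sketch of the gather-decide-act loop behind "agentic AI" definitions.

def decide(goal: str, observations: list[str]) -> str:
    """Stub policy: a real system would ask an LLM what to do next."""
    if not observations:
        return "collect_information"
    if "info" in observations[-1]:
        return "use_tool"
    return "finish"

def act(action: str) -> str:
    """Stub tools: a real agent would browse, call APIs, edit files, etc."""
    return {"collect_information": "info gathered",
            "use_tool": "tool result"}.get(action, "")

def run(goal: str, max_steps: int = 10) -> list[str]:
    observations: list[str] = []
    for _ in range(max_steps):            # iterate: decide, act, observe
        action = decide(goal, observations)
        if action == "finish":
            break
        observations.append(act(action))
    return observations

print(run("organize a barbecue"))   # ['info gathered', 'tool result']
```

The gap between this and "AI with slight automation" is exactly the decide step: a fixed if/else chain like the stub above is automation, while an agent replaces it with open-ended model-driven decisions.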
The reality is that many of the use cases presented as agents are, at best, a kind of IFTTT on steroids. In this CNBC article, the head of AI at the consulting firm EY said: "Many in the market want to take advantage of it. We have witnessed an incredible rebranding of everything related to generative AI, which is now presented as agentic AI."

"Agent washing": when products sold as agents are actually products that already existed. At the beginning of the year, Gartner surveyed more than 3,000 companies promoting AI agents and identified a trend it calls "agent washing": many products sold as agents are actually products that already existed. Gartner estimates that of the 3,000 companies, only 130 sell real AI agents. "Most agentic AI propositions lack significant value or return on investment," said analyst Anushree Verma. The firm predicts that more than 40% of agentic AI projects will be canceled before the end of 2027.

Why so much hype? In May of this year, a survey of senior American executives revealed that 88% of companies planned to increase their AI budget in anticipation of the arrival of agents. Most respondents believed agentic AI was going to change workplaces more than the internet did, and nearly half were worried about competitors adopting AI agents before them. The fear of being left behind has encouraged many companies to jump into the pool without fully understanding what agentic AI is. It makes sense that they want to hype it up and even "cheat" by calling things agents that really are not: they are investing a lot in this and they need it to turn out well.

Image | Gemini

In Xataka | A group of AI experts attended a party at a mansion. The topic of conversation: what will happen when AI ends humanity

AI agents can not only plan your vacation from beginning to end. They are also the greatest threat to Booking and company

With generative AI showing signs of slowing down, AI agents sound like the sector's next great revolution. Unlike a chatbot, which we ask something and which responds, an agent is able to carry out complex tasks autonomously. The first reaction was to see them as a threat to many jobs. Expectations have cooled because the technology is still quite green, but there is one sector in which the threat seems very real and which is already preparing for what may come.

The threat. Travel planning is one of the fields in which an AI agent can be very practical. In fact, it was part of the ChatGPT Agent demonstration, in which it was asked to organize attendance at a wedding and the agent put together the entire plan, including looking for flights and hotels. If an agent does everything for us, this could leave the flight and hotel search engines that act as intermediaries, and take a commission for it, out of the game.

If you can't beat them... The Financial Times tells the story. Online travel platforms are beginning to implement AI functions on their portals. This is the case of Airbnb, which has already implemented an AI agent in its customer service and plans to expand it to more areas of its app to make the experience more automated. Booking signed an agreement with OpenAI to automate services and launch its own travel planner tuned with platform data. Expedia also integrated OpenAI technology and is working on an agent.

Hotels and airlines. Unlike the online agencies, both the hotel sector and the airlines welcome the arrival of AI agents. If customers book directly with them, they would save the commissions, which in the hotel sector hover around 20%. Of course, nothing guarantees that these would-be agents will not implement some other kind of commission for each booked trip. For Hotrec, the European hotel association, AI agents have potential, but they could end up replicating the platform model and generating a new cycle of dependency.

A lot at stake.
We are talking about a business that, according to the Financial Times, moves 1.6 trillion dollars a year worldwide. The leading travel agency is Booking, which billed 24 billion dollars in 2024, followed by Expedia with 10 billion. The arrival of agents in the business could threaten their dominance by offering consumers more options.

Nervous. Last year, researchers from Ohio State University tested the trip-planning capabilities of several AI models and achieved a success rate of just 0.6%. Although agentic AI has improved, we have recently seen that it still has a long way to go. Even so, the nervousness among those responsible for these platforms is evident. Jochen Koedijk, Expedia's marketing chief, believes that online agencies have an advantage because they hold a lot of data on user behavior. "We know what sells and what doesn't. That is the really important value proposition," he says. Glenn Fogel, Booking's CEO, is clearer: "I'm not so dumb that it doesn't worry me."

Image | Web Summit, via Flickr

In Xataka | AI has become the best example that if you don't pay for the product, you are the product

Some researchers created a company where all the employees were AI agents. They did not get through a quarter of the work

With generative AI already showing signs of deceleration, the next great leap is already visible on the horizon: AI agents. Unlike chatbots, an AI agent can be given a complex task and will act independently, making decisions along the way to achieve its goal. Everything pointed to 2025 being the year of AI agents and, to test that, some researchers ran a curious experiment: they put several of these agents to work in a fictitious company. It didn't go very well.

A fictitious company. The study was conducted by Carnegie Mellon University researchers and sought to measure the effectiveness of AI agents. For it, they created an environment pretending to be a small software development company, which they baptized TheAgentCompany. The company had 18 employees and a plan of objectives for the quarterly sprint. In addition, it had plenty of internal documentation, such as an employee manual, human resources policies and a good practices guide. Employees communicated with each other through a Slack-style chat program.

The staff. The AI agents put to work at TheAgentCompany included models from Google, OpenAI, Meta and Anthropic. They were assigned roles such as financial analyst, project manager or software engineer. A chief technology officer and a human resources manager were also created, whom each agent could contact if needed. Among the tasks they had to do were writing code, searching the internet, opening programs and organizing data in spreadsheets. Quite typical for a company of these characteristics.

The problems. The agents began to work and at first everything went well, but problems and misunderstandings soon appeared. One of the agents had to access some information, but a popup appeared on the screen and it could not see it.
Although it could have closed the pop-up by clicking the X in the upper right corner, it instead asked human resources for help and was told that the IT department would be in touch shortly. No one ever contacted it, and the task was never completed. The agents also developed a curious behavior when the next steps weren’t clear: sometimes they cheated, inventing shortcuts to skip the hard part of a task. For example, one agent couldn’t find the person it was supposed to ask a question, so it simply renamed another user to the name of the person it needed. The results. Employee of the month went to Anthropic’s Claude 3.5 Sonnet. But even as the best performer, it completed only 24% of its assigned tasks. Gemini 2.0 Flash and ChatGPT completed only 10%, and the worst employee was Amazon’s Nova Pro v1, with 1.7% of tasks completed. The most common failures stemmed from a lack of social skills and poor web browsing. The threat of AI agents. According to the latest World Economic Forum report, AI will destroy more than 90 million jobs in the next five years (although it is also expected to create almost twice as many new positions), and AI agents pose a threat to many jobs. However, experiments like this show the technology is not yet ready to fully replace a human employee. For now, AI agents make many mistakes, and, as with Tesla’s Autopilot, it is better not to take your hands off the wheel. Image | Gemini In Xataka | Workers have stopped fearing AI as a job-destroying machine: software engineers don’t agree

AI agents were supposed to take AI to another dimension in 2025. As with so many other things in AI, it stayed at “supposed to”

2025 was going to be the year of AI agents. So said figures such as Nvidia’s CEO and Sam Altman. The main AI companies have presented their agents: Anthropic, OpenAI, Google… AI agents were meant to be the great revolution of this year, but what we are seeing leaves much to be desired. More and more voices are lowering expectations. Experts in creating hype. If AI gurus are experts at anything, it is generating expectation. At the beginning of the year, Altman said agents were going to transform the workforce in 2025. Six months later, he qualified his claim: “AI agents are behaving like junior employees.” Now the year of the agents will be 2026, but perhaps that isn’t a realistic prediction either. Not so fast. More and more voices are calling for calm. The Algorithmic Bridge discusses the hype around AI agents and how it is eroding confidence in the sector: trying to run before walking creates false expectations and ends in disappointment. One of those voices is Andrej Karpathy, OpenAI co-founder and former head of AI at Tesla, so he knows a thing or two about AI. Karpathy calls for calm: “There are many people too excited about AI agents.” The promise. After the boom of language models, AI agents are presented as the next great evolution. While a chatbot can only be asked for one response at a time, an agent can plan and carry out larger tasks autonomously. For example, an AI agent could manage a store’s stock, tracking what is needed and placing orders with suppliers, all on its own. On paper, agents are very powerful tools and pose a serious threat to many jobs. Reality. If 2025 has made anything clear about AI agents, it is that they fail far more often than they should. They have already been put into practice in several cases, such as the creation of this fictitious company or this experiment carried out by Anthropic. The result has been disappointing.
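The stock-management example above can be sketched in a few lines. This is a toy illustration, not any vendor's actual agent: the item names, thresholds and the `restock_decisions` helper are all invented for this example, and a real agent would be calling inventory and supplier APIs instead of reading a dictionary.

```python
# Toy sketch of the "agent manages a store's stock" idea: check inventory
# against a reorder threshold and decide what to order from suppliers.
# All names and numbers below are made up for illustration.

REORDER_POINT = 10   # reorder when stock falls to this level or below
ORDER_UP_TO = 50     # target stock level after restocking

def restock_decisions(inventory):
    """Return {item: quantity_to_order} for items at or below the reorder point."""
    return {
        item: ORDER_UP_TO - qty
        for item, qty in inventory.items()
        if qty <= REORDER_POINT
    }

inventory = {"coffee": 4, "tea": 30, "sugar": 10}
orders = restock_decisions(inventory)
print(orders)  # {'coffee': 46, 'sugar': 40}
```

The decision rule itself is trivial; what makes the agentic version hard, as the experiments described here show, is executing it reliably against real tools and messy interfaces.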
One of the areas where they fail is browsing the internet. For example, one of these agents abandoned a task because a pop-up appeared on the screen and it couldn’t close it. When they leave their controlled environment, agents tend to fail more because they encounter information they don’t control, as in the pop-up case. Other kinds of agents, such as Claude Code, work in a closed environment and are much more reliable. Another limitation is how long they can keep working. It has been observed that when AI agents make a mistake in a task, the error propagates into the steps that follow, compromising the solution and ruining all the work. And this worsens the longer they work. A possible solution to this problem would be to set several agents to work in parallel, in order to compare their results and find the best solution. Their time will come. AI agents are not ready to work autonomously and fully replace a worker, but that doesn’t mean they never will. In fact, they are already improving. According to this research, AI agents are steadily increasing the length of tasks they can complete autonomously with a 50% success rate. In 2024 that time was 8 minutes; it now stands at about an hour. If they keep improving at a sustained pace, by 2027 they could work for four hours straight. Karpathy compares it to his first ride in a Waymo robotaxi in 2013: “Twelve years have passed and it’s still a work in progress.” We are not in the year of AI agents but in “the decade of agents. This will take a long time. We have to do it carefully. This is software, let’s be serious,” he warns. Image | Gemini In Xataka | A group of AI experts attended a party in a mansion. The topic of conversation: what will happen when AI ends humanity
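The "work in parallel and compare" idea mentioned above can be sketched as a best-of-n pattern: run several independent attempts at the same task, score each candidate, and keep the highest-scoring one. Everything here is a hypothetical stand-in — `run_agent` fakes agent quality deterministically from a seed — since a real setup would launch separate model runs and score them with a verifier or a judge model.

```python
import statistics

def run_agent(task, seed):
    """Hypothetical stand-in for one agent attempt: returns (answer, score).

    In a real system this would be an independent LLM agent run, scored by
    a verifier; here quality is derived deterministically from the seed.
    """
    quality = (seed * 37) % 100 / 100  # pseudo-random quality in [0, 1)
    answer = f"plan-{seed} for {task}"
    return answer, quality

def best_of_n(task, n=5):
    """Run n independent attempts and keep the highest-scored candidate."""
    candidates = [run_agent(task, seed) for seed in range(n)]
    best_answer, best_score = max(candidates, key=lambda c: c[1])
    avg_score = statistics.mean(score for _, score in candidates)
    return best_answer, best_score, avg_score

answer, best, avg = best_of_n("restock inventory", n=5)
# By construction, the selected candidate is at least as good as the average:
assert best >= avg
print(answer)  # plan-2 for restock inventory
```

The pattern trades compute for reliability: one bad run no longer sinks the task, because its chained errors are discarded rather than propagated.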

We are creating AI agents that act on their own. And that is as useful as it is full of risks

An agent you can’t turn off. It isn’t the script of a futuristic movie. It is one of the scenarios that already worry some of the world’s leading AI experts. The scientist Yoshua Bengio, a global reference in the field, has warned that the systems known as “agents” could, if they acquire enough autonomy, dodge restrictions, resist shutdown or even replicate without permission. “If we continue developing agentic systems,” he says, “we are playing Russian roulette with humanity.” Bengio does not fear that these models will develop consciousness, but that they will act autonomously in real environments. As long as they stay confined to a chat window, their reach is limited. The problem appears when they access external tools, store information, communicate with other systems and learn to get around the barriers designed to control them. At that point, the ability to execute tasks without supervision stops being a technological promise and becomes a risk that is hard to contain. They are already being tested. The most disturbing part is that none of this is happening in secret laboratories, but in real environments. Tools like OpenAI’s Operator can already make reservations and purchases or navigate websites without direct human intervention. There are also other systems, such as Manus. Today they still have limited access, are in an experimental phase or haven’t reached the general public. But the direction is clear: agents that understand a goal and act to achieve it, with no one pressing a button at each step. The underlying question. Do we really know what we are creating? The problem is not only that these systems execute actions, but that they do so without human judgment. In 2016, OpenAI tested an agent in a racing video game, asking it to rack up the highest possible score. The result? Instead of competing, the agent discovered it could drive in circles and crash into bonus targets to accumulate points. No one had told it that winning the race was what mattered.
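The racing-game story is a classic case of reward misspecification, and its mechanics can be shown with a toy sketch (invented here, not OpenAI's actual environment): the reward counts only points, so a policy that loops and farms bonuses beats one that actually finishes the race.

```python
# Toy sketch of a misspecified reward: the designer wants "finish the race",
# but the reward signal only counts points, so farming bonuses wins.

def score(actions):
    """Reward = points collected; finishing earns a one-time bonus and ends the episode."""
    points = 0
    for action in actions:
        if action == "grab_bonus":
            points += 10      # bonus targets respawn, so this repeats forever
        elif action == "finish":
            points += 50      # one-time reward for completing the race
            break             # episode ends at the finish line
    return points

racer = ["grab_bonus", "finish"]    # the intended behavior
looper = ["grab_bonus"] * 20        # circles the track, farming bonuses

print(score(racer))   # 60
print(score(looper))  # 200
# The reward function prefers the degenerate loop:
assert score(looper) > score(racer)
```

Nothing in `score` is buggy; it faithfully rewards exactly what it was told to reward, which is the article's point that the failure lies in the objective's specification, not in the system.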
It only had to add points. This is not a technical error. Such behaviors are not system failures but failures of approach: when we give a machine this much autonomy to pursue a goal, we also give it the chance to interpret that goal in its own way. That is what makes agents very different from a chatbot or a traditional assistant. They don’t just generate answers. They act. They execute. And they can affect the outside world. Systems with too high an error margin. To these specific cases is added a more structural problem: agents, today, fail more than they succeed. In real-world tests they have shown they are not ready to handle complex tasks reliably. Some reports even point to high failure rates, unacceptable for systems that aspire to replace human processes. A disputed technology. And not everyone is convinced. Some companies that bet heavily on replacing workers with AI systems are already backtracking. In many cases, the expectations placed on these systems have not been met. The promised autonomy has collided with frequent errors, lack of context and decisions that, without being malicious, weren’t sensible either. Even so, some believe agents could gradually make their way into different sectors. Autonomy with possible consequences. The risk does not end with involuntary error. Some researchers have warned that these agents could be used as tools for automated cyberattacks. Their ability to operate without direct supervision, scale up actions and connect to multiple services makes them ideal candidates for executing malicious operations without raising suspicion. And unlike a person, they don’t get tired, they don’t stop, and they don’t need to understand why they are doing it. Control is at stake. The idea of digital assistants capable of managing email, organizing trips or writing reports is attractive. But the more we let them do, the more important it will be to set limits.
Because when an AI can connect to an external tool, execute changes and receive feedback, we are no longer talking about a language model. We are talking about an autonomous entity, capable of acting. That is not a threat in itself, but a clear sign that invites action. The autonomy of agents raises issues that go beyond the technical: it requires legal frameworks, ethical criteria and shared decisions. Understanding how they work is only the first step. The next is to define what use we want to give them, what risks they entail and how we are going to manage them. Images | OpenAI (1, 2, 3) | Xataka with Grok In Xataka | AI is extremely addictive for many people. So much so that it already has its own version of “Alcoholics Anonymous”
