AI agents can plan your vacation from start to finish. They are also the biggest threat to Booking and company

With generative AI showing signs of deceleration, AI agents are shaping up as the sector's next great revolution. Unlike a chatbot, which answers whatever we ask it, an agent can carry out complex tasks autonomously. The first reaction was to see them as a threat to many jobs. Expectations have since cooled because the technology is still quite green, but there is one sector where the threat looks very real, and it is already preparing for what may come.

The threat. Travel planning is one of the fields where an AI agent can be genuinely practical. In fact, it featured in the demonstration of ChatGPT Agent, in which the agent was asked to organize attendance at a wedding and put together the entire plan, including searching for flights and hotels. If an agent does everything for us, the flight and hotel search engines that act as intermediaries and take a commission for it could be left out of the game.

If you can't beat the enemy... The Financial Times reports that online travel platforms are beginning to build AI features into their portals. That is the case of Airbnb, which has already deployed an AI agent in its customer service and plans to extend it to more areas of its app to make the experience as automatic as possible. Booking signed an agreement with OpenAI to automate services and launch its own travel planner tuned on the platform's data. Expedia has also integrated OpenAI technology and is working on an agent of its own.

Hotels and airlines. Unlike the online agencies, both the hotel sector and the airlines welcome the arrival of AI agents. If customers book directly with them, they would save the commissions, which in the hotel sector run around 20%. Of course, nothing guarantees that these agents will not introduce some other kind of commission for each booked trip. For Hotrec, the European hotel association, AI agents have potential but could end up replicating the platform model and generating a new cycle of dependency.

A lot at stake.
We are talking about a business that, according to the Financial Times, moves 1.6 trillion dollars a year worldwide. The leading travel agency is Booking, which billed 24 billion dollars in 2024, followed by Expedia with 10 billion. The irruption of agents into the business could threaten their dominance by offering consumers more options.

Nervous. Last year, researchers at Ohio State University tested the trip-planning capabilities of several AI models and achieved a success rate of only 0.6%. Agentic AI has improved since then, but as we have recently seen, it still has a long way to go. Even so, the nervousness among those running these platforms is evident. Jochen Koedijk, Expedia's chief marketing officer, believes online agencies have an advantage because they hold a great deal of data on user behavior: "We know what sells and what doesn't. That is the really important value proposition," he says. Glenn Fogel, Booking's CEO, is blunter: "I'm not so dumb as to not be worried."

Image | Web Summit, via Flickr

In Xataka | AI has become the best example that if you don't pay for the product, you are the product

Some researchers created a company where every employee was an AI agent. They couldn't complete even a quarter of the work

With generative AI already showing signs of deceleration, the next great leap is looming on the horizon: AI agents. Unlike chatbots, an AI agent can be given a complex task and will act independently, making decisions along the way to achieve its goal. Everything pointed to 2025 being the year of AI agents and, to put that to the test, some researchers ran a curious experiment: they set several of these agents to work in a fictitious company. It did not go very well.

A fictitious company. The study was conducted by researchers at Carnegie Mellon University and sought to measure the effectiveness of AI agents. For it, they created an environment that pretended to be a small software development company, which they baptized TheAgentCompany. The company had 18 employees and a set of objectives for the quarterly sprint. It also had plenty of internal documentation, such as an employee manual, human resources policies and a good-practices guide. Employees communicated with each other through a Slack-style chat program.

The staff. The AI agents put to work at TheAgentCompany included models from Google, OpenAI, Meta and Anthropic. They were assigned roles such as financial analyst, project manager or software engineer. A chief technology officer and a human resources manager were also created, whom each agent could contact if needed. Among the tasks they had to perform were writing code, searching the internet, opening programs and organizing data in spreadsheets. Fairly typical work for a company of these characteristics.

The problems. The agents got to work and at first everything went fine, but problems and misunderstandings soon appeared. One of the agents had to access some information, but a popup appeared on the screen and blocked its view.
Although it could have closed the popup by clicking the X in the top-right corner, the agent instead asked human resources for help and was told that the IT department would soon contact it to resolve the issue. IT never did, and the task was never completed. The agents also developed a curious behavior when they were unsure of the steps to follow: sometimes they cheated, inventing shortcuts to skip the hard part of a task. For example, one agent could not find the person it was supposed to ask a question. Its solution was to rename another user to the name of the person it needed to ask.

The results. Employee of the month went to Anthropic and its Claude 3.5 Sonnet model. But even as the best performer, it only managed to complete 24% of the tasks assigned to it. Gemini 2.0 Flash and ChatGPT completed just 10% of their tasks, and the worst employee was Amazon's Nova Pro v1, with 1.7% of tasks completed. The most common failures came down to a lack of social skills and poor web searching.

The threat of AI agents. According to the latest World Economic Forum report, AI will destroy more than 90 million jobs over the next five years (although it is also expected to create almost twice as many new positions), and AI agents pose a threat to many of them. However, experiments like this one show that the technology is not yet ready to replace 100% of a human employee. Today's AI agents make many mistakes and, as with Tesla's Autopilot, for now it is better not to take your hands off the wheel.

Image | Gemini

In Xataka | Workers have stopped fearing AI as a job-destroying machine: software engineers don't feel the same

AI agents were supposed to take AI to another dimension in 2025. As with so many things in AI, it remained a supposition

2025 was going to be the year of AI agents. So said figures such as Nvidia's CEO and Sam Altman. The main AI companies have all presented their agents: Anthropic, OpenAI, Google... AI agents were pitched as this year's great revolution, but what we are seeing leaves much to be desired, and more and more voices are lowering expectations.

Experts at creating hype. If AI gurus are expert at anything, it is generating expectation. At the start of the year, Altman said agents were going to transform the labor market in 2025. Six months later, he qualified his speech: "AI agents are behaving like junior employees." Now the year of the agents is to be 2026, but perhaps that is not a realistic prediction either.

Not so fast. More and more voices are calling for calm. The Algorithmic Bridge discusses the hype around AI agents and how it is eroding confidence in the sector: rushing too fast creates false expectations and ends in disappointment. One of these voices is Andrej Karpathy, OpenAI co-founder and former head of AI at Tesla, so he knows a thing or two about AI. Karpathy urges restraint: "There are many people too excited about AI agents."

The promise. After the boom of language models, AI agents are being presented as the next great evolution. While a chatbot can only be asked for one task at a time, an agent can plan larger tasks autonomously. For example, an AI agent could manage a store's stock, monitoring what is needed and placing orders with suppliers, all on its own. On paper, agents are very powerful tools and pose a very serious threat to many jobs.

The reality. If 2025 is making anything clear about AI agents, it is that they fail more often than a fairground shotgun. They have already been put into practice in several cases, such as the creation of a fictitious company staffed by agents or an experiment carried out by Anthropic. The result has been disappointing.
One of the areas where they fail is web browsing. For example, one of these agents abandoned a task because a popup appeared on the screen and it could not manage to close it. When they leave their own environment, agents tend to fail more because they run into information they do not control, as in the popup case. Other kinds of agents, such as Claude Code, work in a closed environment and are much more reliable.

Another limitation is how long they can keep working. It has been observed that when AI agents make a mistake in one task, the error chains into the following ones, compromising the solution and undoing all the work, and this gets worse the longer they run. One possible fix would be to set several agents working in parallel and then compare their outputs to find the optimal solution.

Their time will come. AI agents are not ready to operate autonomously and replace 100% of a worker, but that does not mean they will never get there. In fact, they are already improving. According to one research study, the length of time AI agents can work autonomously on a task with a 50% success rate keeps increasing: in 2024 it was 8 minutes, and it currently stands at about an hour. If they keep improving at a sustained pace, by 2027 they could work four hours in a row. Karpathy compares it to his first ride in a Waymo robotaxi in 2013: "Twelve years have passed and it is still being worked on." We are not in the year of AI agents but in "the decade of agents. This will take a long time. We have to do it carefully. This is software; let's be serious," he warns.

Image | Gemini

In Xataka | A group of AI experts attended a party at a mansion. The topic of conversation: what will happen when AI ends humanity
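The doubling figures cited above (8 minutes in 2024, about an hour now, four hours by 2027) can be turned into a quick back-of-the-envelope projection. This is just a sketch of the arithmetic implied by those numbers, not the cited study's actual methodology; the 7-month doubling period is an assumption taken from reporting on that research.

```python
# Back-of-the-envelope projection of how long an AI agent can work
# autonomously at a ~50% success rate, assuming the task length
# doubles every `doubling_months` months.
def projected_minutes(start_minutes: float, months_elapsed: float,
                      doubling_months: float = 7.0) -> float:
    """Projected task length after `months_elapsed` months of doubling."""
    return start_minutes * 2 ** (months_elapsed / doubling_months)

# Starting from ~60 minutes today, two more doublings (~14 months)
# put the horizon at four hours, matching the 2027 estimate above.
print(round(projected_minutes(60, 14)))  # prints 240 (minutes)
```

Exponential growth is doing all the work in that estimate, which is exactly why it should be read with caution: it holds only as long as the doubling trend holds.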

We are creating AI agents that act on their own. And that is as useful as it is full of risks

An agent you can't turn off. It is not the script of a futuristic movie; it is one of the scenarios already worrying some of the world's greatest AI experts. The scientist Yoshua Bengio, a global reference in the field, has warned that the systems known as "agents" could, if they acquire enough autonomy, dodge restrictions, resist shutdown or even multiply without permission. "If we continue to develop agentic systems," he says, "we are playing Russian roulette with humanity."

Bengio does not fear that these models will develop consciousness, but that they will act autonomously in real environments. As long as they stay confined to a chat window, their reach is limited. The problem appears when they gain access to external tools, store information, communicate with other systems and learn to get around the barriers designed to control them. At that point, the ability to execute tasks without supervision stops being a technological promise and becomes a risk that is hard to contain.

They are already being tested. The most disturbing part is that none of this is happening in secret laboratories, but in real environments. Tools like OpenAI's Operator can already make reservations, make purchases or navigate websites without direct human intervention. There are other systems too, such as Manus. Today they still have limited access, are in an experimental phase or have not reached the general public. But the direction is clear: agents that understand a goal and act to meet it, with no one needing to press a button at each step.

The underlying question. Do we really know what we are creating? The problem is not only that these systems execute actions, but that they do so without human criteria. In 2016, OpenAI tested an agent in a racing video game, asking it to obtain the highest possible score. The result? Instead of competing, the agent discovered it could drive in circles and crash into bonuses to rack up more points. No one had told it that winning the race was what mattered.
It just added points. This is not a technical error. Such behaviors are not system failures but failures of approach: when we give a machine this much autonomy to achieve a goal, we also give it the chance to interpret that goal in its own way. That is what makes agents so different from a chatbot or a traditional assistant. They are not limited to generating answers. They act. They execute. And they can affect the outside world.

Systems with too high a margin of error. On top of these specific cases comes a more structural problem: agents, today, fail more than they succeed. In real tests they have shown they are not prepared to take on complex tasks reliably. Some reports even point to failure rates too high for systems that aspire to replace human processes.

A disputed technology. And not everyone is convinced. Some companies that bet heavily on replacing workers with AI systems are already backtracking. In many cases, the expectations placed on these systems have not been met: the promised autonomy has collided with frequent errors, lack of context and decisions that, without being malicious, were not sensible either. Even with those results, there are those who believe the technology could gradually make its way into different sectors.

Autonomy with possible consequences. The risk does not end with involuntary error. Some researchers have warned that these agents could be used as tools for automated cyberattacks. Their ability to operate without direct supervision, scale up actions and connect to multiple services makes them ideal candidates for executing malicious operations without raising suspicion. And unlike a person, they do not get tired, they do not stop, and they do not need to understand why they are doing it.

Control is at stake. The idea of having digital assistants capable of managing email, organizing trips or writing reports is attractive. But the more we let them do, the more important it will be to set limits.
Because when an AI can connect to an external tool, execute changes and receive feedback, we are no longer talking about a language model. We are talking about an autonomous entity capable of acting. That is not a threat in itself, but it is a clear signal that invites action. The autonomy of agents raises issues that go beyond the technical: it requires legal frameworks, ethical criteria and shared decisions. Understanding how they work is only the first step. The next is deciding what use we want to give them, what risks they entail and how we are going to manage them.

Images | OpenAI (1, 2, 3) | Xataka with Grok

In Xataka | AI is extremely addictive for many people. So much so that it already has its own version of "Alcoholics Anonymous"

We have a big problem with AI agents: they get it wrong 70% of the time

AI agents fail more often than a fairground shotgun. That, at least, is what a recent study by researchers at Carnegie Mellon University (CMU) and Duke University reveals. These experts analyzed the behavior of several agents and put them to the test to check whether all this is "much ado about nothing." And for now, it is.

The inspiration. Graham Neubig, a professor at CMU, explained in The Register that the inspiration had been a 2023 OpenAI article. It discussed what types of jobs could be replaced by AI systems, but as he put it, "their methodology was basically asking ChatGPT if those jobs could be automated." In this study they wanted to verify it by asking various AI agents to try to complete tasks that professionals in those jobs theoretically carry out.

TheAgentCompany. To conduct their study, the researchers created a fictitious firm they called TheAgentCompany and used it to have different agentic models try to complete various tasks. These systems had access to several services, such as GitLab, ownCloud or Rocket.Chat, to carry out the work, but their performance was disappointing.

70% errors. The researchers used two test environments, called OpenHands CodeAct and OWL-Roleplay, and in them they tested today's most important AI models. The best of all was Claude Sonnet 4, which managed to solve 33.1% of the proposed tasks. Behind it came Claude 3.7 Sonnet (30.9%), Gemini 2.5 Pro (30.3%) and, much further back, a disastrous GPT-4o (8.6%), Llama-3.1-405B (7.4%), Qwen-2.5-72B (5.7%) and Amazon Nova Pro v1.0 (1.7%). In the best case the models complete about 30% of the requested tasks and fail at the other 70%. Or, put another way: much ado about nothing, according to these benchmarks.

Incapable agents. During these tests the researchers observed various kinds of failure in how the tasks were handled.
There were agents that refused to send a message to colleagues when it was part of the task, agents unable to deal with popup windows during browsing sessions, and even agents that cheated. In one highlighted case, an agent that had to consult a person on Rocket.Chat (an open-source alternative to Slack) could not find them, so "it renamed another user to the name of the user it was supposed to contact."

But they are improving. Even with these problems, the performance of AI agents is evolving positively. Neubig and his team tested a software agent that was able to solve about 24% of tasks involving web browsing, programming and related work. Six months later they tested a new version and it reached 34% of tasks completed.

Imperfect but useful. Moreover, these researchers pointed out that even failing this much, AI agents can still be useful. In certain contexts, such as programming, a partial code suggestion that solves a particular fragment of a program can end up being the basis of a solution the developer then builds on.

Careful where you use them. Of course, agents that make this many mistakes can be a problem in scenarios more sensitive to failure. If we commission an agent to write emails and it sends them to the wrong people, the result could be a disaster. There are solutions in sight, such as the growing adoption of the Model Context Protocol (MCP), which structures the interaction between services and AI models so that communication is much more precise and these errors can be mitigated during autonomous task execution.

A benchmark that makes AI models look bad.
For this expert, one of the great disappointments is that the companies developing AI models do not seem interested in using the benchmark as a metric to improve their systems. Neubig suspects that "perhaps it is too difficult and makes them look bad." It is similar to what happens with the ARC-AGI-2 benchmark: a test so difficult for AIs that, of all the models attempting it, the best today is o3, which achieves just 3% of completed tasks.

Salesforce agrees. That study is complemented by another carried out by a group of Salesforce researchers. They created their own benchmark aimed specifically at checking how various AI models would fare at typical tasks in a CRM like the ones the firm develops. Their project, called CRMArena-Pro, tests AI agents in areas such as sales or customer support.

Not ready to replace workers. In their conclusions, these researchers report that AI models "achieve globally modest success rates, typically around 58% in single-turn scenarios, with performance degrading significantly to approximately 35% in multi-turn scenarios." In fact, they explained, "agents are generally unprepared and lack the essential skills for complex tasks." Warning of a great near-term impact of AI on many jobs, as some experts do, seems premature.

A complicated future. To these modest results we can add the prediction of the consultancy Gartner. According to its studies, more than 40% of agentic AI development projects will end up being canceled by the end of 2027.
The lead author of the report, Anushree Verma, indicated that "currently, most agentic AI projects are early-stage experiments or proofs of concept, mostly driven by hype and often misapplied." The message is clear: expectations around AI agents are running too high, and the current state of the technology shows that, for now, their application is problematic and limited.

Image | …

There is a risk with AI agents and accumulated errors: that they become a game of "broken telephone"

In the game of "broken telephone" (also known as "Chinese whispers"), a group of people passes a message from one to the next in secret. What usually happens is that the original message has little to do with what the last recipient receives. The problem we are seeing is that something similar can happen with the promising AI agents.

Accumulated errors. Toby Ord, a researcher at the University of Oxford, recently published a study on AI agents. In it he discussed how these systems suffer from accumulated, or compound, error. An AI agent chains several stages together autonomously to try to solve the problem we set it (for example, writing code for a certain task), but if it makes an error at one stage, that error accumulates and becomes more worrying at the next stage, and more at the one after that, and even more at the next. The precision of the solution is thus compromised and may end up having little (or nothing) to do with what would actually solve the problem.

AI can program, but not for long stretches. What this expert proposed was introducing the notion of an AI agent's "half-life," which helps estimate the success rate as a function of the length of the task the agent is trying to solve: an agent with a two-hour half-life, for example, would have a 50% success rate on two-hour tasks. The message is stark: the longer an AI agent works, the more its success rate declines. Benjamin Todd, another AI expert, expressed it differently: an AI can program for an hour with (barely) any errors, but not for 10 hours. These are not real or definitive figures, but they express the same problem: AI agents cannot, at least for now, run indefinitely, because accumulated errors doom the success rate.

Humans aren't spared either. But careful, because something very similar happens with human performance on prolonged tasks.
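Ord's half-life idea and the compound-error problem described above can be sketched numerically. This is a toy model, not Ord's actual code, and the assumption that every step fails independently with the same probability is ours; it is, however, exactly what makes the decay exponential.

```python
# Toy model of compounding errors in an agent's task chain:
# if each step succeeds independently with probability p_step,
# an n-step task succeeds with probability p_step ** n_steps.
def chain_success(p_step: float, n_steps: int) -> float:
    return p_step ** n_steps

# Equivalent "half-life" view of the same decay: success on a task
# of a given length is 0.5 ** (length / half_life), so it is exactly
# 50% when the task length equals the agent's half-life.
def halflife_success(task_hours: float, half_life_hours: float) -> float:
    return 0.5 ** (task_hours / half_life_hours)

print(f"{chain_success(0.99, 10):.2f}")   # 0.90: 10 steps at 99% each
print(f"{chain_success(0.99, 100):.2f}")  # 0.37: 100 steps at 99% each
print(f"{halflife_success(2, 2):.2f}")    # 0.50: task length == half-life
print(f"{halflife_success(4, 2):.2f}")    # 0.25: double the task, halve twice
```

The second pair of numbers is the whole argument in miniature: a 99% per-step success rate sounds excellent, yet over a hundred chained steps it leaves barely a one-in-three chance of finishing correctly.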
Ord's study notes how the empirical human success rate falls off markedly: after 15 minutes it is already down to roughly 75%, after an hour and a half it is 50%, and after 16 hours just 20%. We can all make mistakes when performing chained tasks, and if we slip up in one of them, that error condemns all subsequent work in the chain even further.

LeCun already warned about this. Yann LeCun, who leads AI research efforts at Meta, has long been pointing out the problems with LLMs. In June 2023 he explained why autoregressive LLMs cannot be made factual or reliably avoid toxic responses: there is some probability that each token a model generates takes us outside the set of correct answers, and the longer the answer, the harder it is for it to be correct.

Hence the importance of error correction. To limit the problem, we need to reduce the error rate of AI models. This is well understood in software engineering, where early code review is always recommended, following a "shift-left" strategy in the development cycle: the sooner an error is detected, the easier and cheaper it is to correct, while the cost of fixing an error grows steeply the later it is detected in the life cycle. Other experts suggest that reinforcement learning (RL) could solve the problem; LeCun's reply is that it would, if we had infinite data with which to polish the model's behavior, which we do not.

More than agents, multi-agents. Anthropic recently demonstrated that there is a way to further mitigate errors (and their subsequent accumulation): using multi-agent systems.
That is: having multiple AI agents work in parallel, then confront their results and determine the optimal path or solution.

There is also encouraging data on how long agents can run. A study charting the length of the tasks AI agents can fully complete over recent years found that the time an AI agent can operate while completing tasks at a 50% success rate doubles roughly every seven months. In other words, agents are improving in a sustained (and notable) way over time.

But do models and agents ever stop improving (or do they?). Todd himself pointed out something important that allows for optimism on this problem: "The error rate of AI models is being cut in half roughly every five months," he explained. At that rate, it is possible that AI agents will be able to successfully complete dozens of chained tasks within a year and a half, and hundreds a year and a half after that. The New York Times disagreed, recently noting that although the models are increasingly powerful, they also "hallucinate" more than previous generations. The "system card" for o3 and o4-mini points precisely to a real problem with the error rate and "hallucinations" in both models.

In Xataka | Hallucinations are still the Achilles heel of AI: the latest OpenAI models make things up more than they should

Originally published in Xataka by Javier Pastor.
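The multi-agent mitigation described above — several agents working in parallel, then confronting their results — can be sketched as a simple majority vote. This is an illustrative toy, not Anthropic's system: the `run_agent` stub stands in for a real model call and its 70% accuracy is an arbitrary assumption.

```python
from collections import Counter
import random

def run_agent(task: str, rng: random.Random) -> str:
    # Stub standing in for a real model call: returns the right
    # answer 70% of the time, otherwise one of two wrong ones.
    if rng.random() < 0.7:
        return "correct answer"
    return rng.choice(["wrong answer A", "wrong answer B"])

def majority_vote(task: str, n_agents: int = 5, seed: int = 0) -> str:
    # Run n_agents independent attempts and keep the most common answer.
    rng = random.Random(seed)
    answers = [run_agent(task, rng) for _ in range(n_agents)]
    return Counter(answers).most_common(1)[0][0]

# With five independent 70%-accurate agents whose errors are split
# across different wrong answers, the majority answer is right far
# more often than any single agent's answer.
print(majority_vote("summarize the report"))
```

The design choice worth noting is that voting only helps when the agents' errors are uncorrelated: five copies of the same agent making the same mistake would confidently vote for the wrong answer.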

The real battle will be fought by AI agents

The dispute over world supremacy between the US and China permeates everything: the global economy, defense strategy, the commercial relationship between the two powers... and technology plays an absolutely central role in the delicate current geostrategic situation. Semiconductors and artificial intelligence (AI) models are the resources the two most influential countries on the planet use to measure their strength, and it is understandable that this is so. The range of applications in which cutting-edge AI is crucial to a country's development is very wide: its scientific capacity, its industrial development, its economic competitiveness and its military power depend to a large extent on these two resources.

AI, however, rests on semiconductors. Without high-density, high-performance, high-efficiency integrated circuits it is impossible to deploy a truly capable AI model. This is why the US government is doing everything in its power to prevent the cutting-edge GPUs designed by Nvidia, AMD, Intel or Cerebras, among other companies, from reaching China. But for the moment the country led by Xi Jinping is withstanding the pressure. Jensen Huang, Nvidia's CEO, declared a few days ago that China is not behind the US in AI, and the strength of DeepSeek, Ernie, Qwen, Pangu, Hunyuan or SenseNova backs up his analysis.

The greatest growth potential lies in AI agents. Right now it is very difficult to determine objectively which country leads in AI. It is reasonable to conclude that the US is ahead of China if we go by the overall capacity and performance of its AI models, but what really matters is whether that capacity delivers real value. This is the line of thought defended by experts such as Arthur Lai, head of Asia research at the financial conglomerate Macquarie, and Jason Corso, professor of AI at the University of Michigan (US).
Moreover, we should not overlook that the metrics currently used to evaluate the abilities and performance of the most advanced AI models are less and less illuminating: as the models improve and develop, their overall competitiveness evens out. During the Google I/O event last week, the company's spokespeople said that Gemini is the fastest AI in the world because it reaches a token-generation speed ten times higher than DeepSeek's. A note before moving on: token-generation speed measures how fast an AI model produces its answers, but it is only one of the many indicators needed to evaluate an AI's ability. Alibaba, for its part, claims that its latest family of Qwen models surpasses its rivals in mathematical reasoning and application programming. In this context, the most reasonable conclusion we can draw is that each company emphasizes the indicators that favor it.

For users, however, what really matters is the real value an AI delivers. And according to Lai, Corso and other experts, the greatest growth potential lies in AI agents, not so much in the large language models themselves. An agent is an AI program designed to make decisions on its own and behave autonomously in pursuit of a goal. The most important difference between an AI model and an AI agent is that the latter does not need us to tell it at every moment what it should do; it plans, analyzes and executes tasks by itself. This is the battlefield on which the companies dedicated to AI will compete, if they are not doing so already.
Image | Beyzaa Yurtkuran

More information | Nikkei Asia

In Xataka | The US wants to put an end to the chips for China that are sold abroad. And China knows how to defend itself

Agents are AI's great promise. They also aim to become cybercriminals' new favorite weapon

AI agents are not the future: they are already here. While chatbots like ChatGPT or Gemini continue to gain ground in tasks ranging from resolving everyday doubts to helping with programming, the big tech companies have begun taking determined steps toward a new generation of much more promising systems: systems able to execute tasks, make decisions and adapt to their environment. They don't just respond: they act. And that change is being presented as a very powerful advance.

OpenAI is developing Operator, an assistant that can browse pages, book trips or manage files. Anthropic is testing its own agent with similar functions in controlled environments. Google is working on Jarvis, its future digital butler. The idea is clear: delegate real tasks to artificial intelligences. But the same autonomy that makes them useful allies also makes them a potential cybersecurity risk.

Dangerous autonomy. Unlike traditional bots, AI agents are not limited to predefined instructions: they can control an operating system or make decisions depending on context. In the wrong hands, this autonomy could enable complex attacks without the need for human experts. Some laboratory tests already show how these models can replicate operations that previously required advanced technical knowledge, such as automating spying tasks or manipulating system configurations.

The threat begins to show. Although there is no evidence that they are involved in large-scale cyberattacks, signs have started to appear. Platforms like LLM Agent Honeypot, designed to detect suspicious access, have registered interactions with possible AI agents. In two confirmed cases, the agents responded to embedded instructions with the speed typical of language models, which points to their growing sophistication. We are not yet talking about organized offensives, but about an increasingly real phase.

Cheaper, faster, more scalable. As MIT Technology Review points out, one of the biggest risks is the potential for scale.
An agent can execute automated actions hundreds of times at a fraction of the cost of a human team. For criminals, that means expanding operations with unprecedented efficiency. If today's mass attacks require investment and specialized personnel, tomorrow they could be launched automatically, selecting targets and probing vulnerabilities without constant supervision.

Image: LLM Agent Honeypot's operation scheme.

Detecting them is not so easy. Although current cybersecurity tools are effective against sophisticated threats, agents introduce a new type of challenge. Unlike classic malware, these systems can reason, adapt to their environment, and modify their behavior in real time. This ability to blend in with legitimate traffic forces a rethink of detection methods and the development of specific techniques to identify patterns of artificial intelligence. The industry is still exploring how far these systems can go. Some investigations show that, given ambiguous instructions, certain agents can execute unexpected actions. Although they still need human support to complete complex attacks, their evolution is rapid. And the most disturbing thing is not what they can do today, but what they could do tomorrow.

And they will do it in an increasingly adverse scenario. According to Check Point data, in the third quarter of 2024 cyberattacks increased 75% compared to the same period of the previous year. Each organization suffered an average of 1,876 attacks per week. Sectors such as education, government, and healthcare are among the hardest hit, and regions such as Africa, Europe, and Latin America registered alarming growth. The hardware industry, for example, saw attacks grow by 191% in just one year. More than 1,200 ransomware incidents were reported in that quarter alone, mainly affecting manufacturers, hospitals, and public administrations. If these types of attacks are delegated to AI agents capable of selecting targets and launching chained offensives, the impact could skyrocket.
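Returning to detection for a moment: the honeypot idea described above, flagging a visitor that both obeys a hidden instruction planted in the page and replies faster than a human plausibly could, can be sketched in a few lines. Everything here (the canary token, the timing threshold, the labels) is a hypothetical illustration, not LLM Agent Honeypot's real code:

```python
# Illustrative sketch: flag a session as a suspected AI agent when it both
# obeys a hidden "prompt injection" planted in the banner and replies faster
# than a human plausibly could. All names and thresholds are hypothetical.
from dataclasses import dataclass

HIDDEN_INSTRUCTION_TOKEN = "cat8193"   # hypothetical canary the banner asks to echo
HUMAN_MIN_RESPONSE_SECS = 1.5          # hypothetical "too fast for a human" threshold

@dataclass
class Session:
    response_text: str
    response_secs: float

def classify(session: Session) -> str:
    obeyed_injection = HIDDEN_INSTRUCTION_TOKEN in session.response_text
    too_fast = session.response_secs < HUMAN_MIN_RESPONSE_SECS
    if obeyed_injection and too_fast:
        return "suspected-ai-agent"
    if obeyed_injection:
        return "possible-llm"  # obeyed the canary, but at human speed
    return "unclassified"
```

The two confirmed cases mentioned above combined exactly these signals: compliance with an embedded instruction plus response latency typical of a language model.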
The global panorama is tense, and agents could be the multiplier the attackers were waiting for. Images | Xataka with ChatGPT | Palisade Research. In Xataka | There is a person who knows more than anyone in the world about password theft. And they just stole his

OpenAI’s new voice models already speak like customer service agents. Their next destination: call centers

Since the beginning of the year, big tech's objective has been clear: getting us to talk to artificial intelligence (AI). OpenAI, Microsoft, Google, and Meta have added voice functions to their assistants. But this seems to be just the beginning. The industry is advancing at a frantic pace, and the way we interact with these tools continues to evolve.

Say 'hello' to voice agents. Sam Altman's company has been betting on text agents with tools such as Operator or its computer-using agents. However, OpenAI already has its next big move ready to keep standing out in the race for AI development: promoting a new and powerful generation of voice agents.

New models on stage. OpenAI has announced the launch of new audio models to turn voice into text and vice versa. They are not in ChatGPT but in the API, where developers can use them to create voice agents. The important thing? They aim to be much more precise and to take customization to the next level. The new OpenAI models, built on GPT-4o and GPT-4o mini, promise to improve on Whisper and on its previous text-to-speech tools, which will also remain available through the API. But it is not just a matter of performance: they can now also modulate their tone to sound, for example, "like an empathetic customer service agent."

Destination: the call centers. OpenAI makes it clear where it is aiming with this launch. It states that "for the first time, developers can tell the model not just what to say but how to say it, enabling more customized experiences for use cases ranging from customer service to creative storytelling." According to OpenAI, this technology will allow much richer "conversational experiences." If we take into account that ChatGPT, powered by GPT-3.5, arrived in November 2022, it is evident that progress has been vertiginous. And everything indicates that these models will end up in call centers.
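The novelty OpenAI highlights, telling the model not just what to say but how to say it, shows up as an extra field in the speech request. Here is a minimal sketch of what such a request could look like, following the API shape OpenAI describes; model and voice names come from its announcement, but treat the details as indicative rather than definitive:

```python
# Illustrative sketch of a text-to-speech request in the style OpenAI describes.
# The "instructions" field is the new part: it steers HOW the model speaks,
# not just what it says. We only build the request body here; no network call.
import json

def build_speech_request(text: str, style: str) -> dict:
    return {
        "model": "gpt-4o-mini-tts",  # model name from OpenAI's announcement
        "voice": "coral",
        "input": text,
        # New in these models: steer delivery, not just content.
        "instructions": style,
    }

payload = build_speech_request(
    "Thanks for calling. How can I help you today?",
    "Speak like an empathetic customer service agent.",
)
body = json.dumps(payload)
# In a real integration this body would be POSTed to the audio/speech
# endpoint with an Authorization header.
```

For a call center, only the `instructions` string needs to change to switch the same model between, say, an empathetic support voice and a brisk order-confirmation voice.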
We might think that at first the interactions will be somewhat limited, but well ahead of current voice systems. They will move away from traditional automated assistants and be much more natural. Over time, the line between a conversation with a person and one with an AI could become almost imperceptible. Images | Charanjeet Dhiman | OpenAI. In Xataka | We have tried Sesame's conversational AI. It is the closest experience to a "human voice" we have seen. In Xataka | China has found an unusual strategy to sidestep US obstacles in AI: betting on open source

AI agents are promising. But as with Tesla's FSD, you'd better not take your hands off the steering wheel

AI agents are one of the great AI trends of this year. There are many expectations placed on these AI models capable of completing a task from beginning to end for us, almost without our intervention. And yet one thing seems clear: for the moment it will be better "not to take your hands off the steering wheel" and watch every step they take, to prevent the AI agent from making a mess of it.

Autonomy and trust. Tesla's driving assistance system, misleadingly named Full Self-Driving (FSD), requires the user to trust it enough to let go, so that the car takes us from an origin point to a destination without human intervention. AI agents propose a similar idea, completing a task from beginning to end autonomously, but for that we must trust that they are able to do so.

Decision making. The agents will require huge amounts of data and access to updated sources of information to analyze that data and then make decisions. In the past we have seen that AI models are especially good at summarizing specific information or drawing conclusions from limited data, which is very useful for that decision making.

Learning from mistakes. Tesla cars receive frequent FSD updates to improve their behavior. These updates are fed by the data the company collects when its FSD system is used, which allows it to polish the service. Something similar is expected to happen with AI agents, which will improve (especially at the beginning) as they are updated and "learn from their mistakes" while processing user requests.

AI agents and companies. These types of solutions will be especially attractive to companies, which can thus automate processes that previously required total or partial human intervention. And precisely for that reason this type of integration must be done in a very controlled way, because let's admit it: we cannot trust current AI models 100%.

Tesla knows that FSD is imperfect.
That is certainly the case with Tesla's FSD, which since its inception has been involved in various accidents, some of them fatal. One of the most recent was reported in October 2024: months earlier, low visibility had caused a Tesla with FSD activated to run over a pedestrian. Tesla has been accused on numerous occasions of misleading advertising and of skimping on radar and sensors to achieve greater profit margins. AI agents can be equally dangerous if they are used incorrectly and "without your hands on the steering wheel." Users and companies that begin to use them must keep these risks very much in mind.

Hands on the wheel, please. The conclusion was already clear for Tesla's FSD system, and it applies to agents as well. They have only just begun to appear timidly on the market, but everything indicates that this will be one of the great AI trends of 2025. The problem is that AI models are imperfect and can therefore make mistakes, and in agents those errors compound. Just ask Air Canada, which had to refund a passenger who got an erroneous answer from the airline's chatbot. Or Chevrolet, whose chatbot was "deceived" by a user who managed to buy one of its cars for a dollar.

Domino effect. The accumulation of errors in sequential tasks is a fundamental problem in current AI models. We could call it something like the domino effect, or compound error: an error in an initial action distorts all subsequent decisions, generating results increasingly far from what was expected. Imagine that in applications such as finance, medicine, or logistics: the consequences could be terrible.

Solution: constant supervision. To avoid this problem, several solutions have been proposed. One of them is the establishment of checkpoints. At the end of each subtask, the system (and ideally a human user, in what is called human-in-the-loop, or HITL) should verify that everything is going well.
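The checkpoint idea can be sketched in a few lines; the structure below is a minimal illustration, and all function names are hypothetical:

```python
# Minimal sketch of checkpointed agent execution with a human-in-the-loop:
# after each subtask, a verifier (ideally a human) must approve the
# intermediate result before the chain continues.
from typing import Callable, List

def run_with_checkpoints(
    subtasks: List[Callable[[str], str]],
    approve: Callable[[str], bool],   # human (or automated) verifier
    state: str = "",
) -> str:
    for i, subtask in enumerate(subtasks):
        state = subtask(state)
        if not approve(state):        # checkpoint: stop before errors compound
            raise RuntimeError(f"Checkpoint failed after subtask {i}: {state!r}")
    return state
```

An early rejection is precisely what stops the domino effect described above: later subtasks never see a corrupted intermediate state, so an initial error cannot distort every subsequent decision.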
It is also possible to minimize the risk by using redundant systems (for example, having the AI agent use several different AI models separately) or by taking advantage of standard bounds on outputs: if an intermediate result produced by an AI agent deviates too far from what is expected, that process should be redone.

And for now, (very) bounded uses. We are in a preliminary phase, and AI agents are "learning to drive on their own," so to speak. The best way for them to learn is to go step by step, always starting with relatively simple and very limited scenarios. The ideal, then, is to apply them to very specific cases with a limited and well-known range of situations, so that their answers are as precise as possible. Image | Erik Witsoe. In Xataka | For Microsoft it is very important that AI agents are the big thing of the year. And it is reorganizing to achieve it
