They have put the 21 most popular AI chatbots to work on differential diagnosis. They fail more than a fairground shotgun

'House' is a series I love. I couldn't care less about the subplots, but the differential diagnosis process, for all its Hollywood flourishes, drives me crazy. That ability to rule out diseases that could explain the same symptoms until arriving at the most probable diagnosis seems like witchcraft to me. Well: the 21 most popular AI chatbots have been put to the test on exactly that differential diagnosis, and the result is clear. They fail more than a fairground shotgun.

In short. Mass General Brigham is not just anyone. It is a non-profit network of American doctors and hospitals that includes two of the most prestigious medical teaching institutions in the country. From January to December 2025, a group of researchers from the institution set 21 AI chatbots, including Claude 4.5 Opus, DeepSeek, Gemini 3.0 Pro, GPT-5 and Grok 4, to evaluate dozens of clinical cases with the aim of establishing how well they do at an early diagnosis. The information provided is extremely basic, but it is also what professionals have when making this differential diagnosis, and the ultimate intention is to evaluate the clinical reasoning capacity of the latest generation of language models to see whether they can be a clinical ally. The answer is no. While models optimized for reasoning achieved much higher scores than simpler ones like Gemini 1.5 Flash, the bottom line is that LLMs are still limited for this task.

The exam. Each of the models was given 29 clinical cases, representing more than 16,200 responses in total. The result is that these newer versions of the most powerful chatbots could not produce an adequate differential diagnosis in about 80% of cases when they only had basic information about the patient. Age, sex and symptoms is very vague information, yes, but it is what human professionals have to 'play' with the first time they attempt this differential diagnosis. Little by little, as they run other tests and obtain more information, they refine the result, but it is that first round of ruling things out that often makes the difference. "We want to help separate the hype from the reality of these tools as they are applied to healthcare."

A different story. And indeed, as the LLMs were given more data, performance became more robust. When the chatbot has more and more information, such as physical examination data, laboratory results and diagnostic images, things change, and the AI reaches the final diagnosis in more than 90% of cases. But of course, to reach that stage it must have almost all the clinical data, which further underscores how helpless these models are at the initial filtering.
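To make the methodology concrete, here is a minimal sketch of this kind of staged evaluation, in which each case is posed twice: first with triage-level information only, then with the full workup. Everything here (the `ask_model` placeholder, the scoring rule) is an illustrative assumption, not the study's actual code.

```python
# Minimal sketch of a staged differential-diagnosis evaluation, assuming
# a generic chat API. `ask_model` and the scoring rule are placeholders,
# not Mass General Brigham's actual harness.
from dataclasses import dataclass

@dataclass
class ClinicalCase:
    triage: str     # age, sex, presenting symptoms
    workup: str     # physical exam, labs, imaging
    diagnosis: str  # gold-standard final diagnosis

def ask_model(model: str, prompt: str) -> list[str]:
    """Placeholder for a real chat-completion call that returns the
    model's ranked differential diagnosis."""
    raise NotImplementedError

def evaluate(model: str, cases: list[ClinicalCase]) -> dict[str, float]:
    early_hits = full_hits = 0
    for case in cases:
        # Stage 1: only the basic information a clinician has at first contact.
        early = ask_model(model, f"Patient: {case.triage}. Differential diagnosis?")
        early_hits += case.diagnosis.lower() in (d.lower() for d in early)
        # Stage 2: the full workup, where the study reports >90% success.
        full = ask_model(model, f"Patient: {case.triage}. Workup: {case.workup}. "
                                "Final diagnosis?")
        full_hits += case.diagnosis.lower() in (d.lower() for d in full)
    n = len(cases)
    return {"early_accuracy": early_hits / n, "full_accuracy": full_hits / n}
```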
Don't trust Google or ChatGPT. The researchers are clear that "these models are very good at identifying a final diagnosis when the data is complete, but they have difficulties at the beginning of an open case," which leads them to emphasize that they should not be trusted at home. The AI industry is pushing its products into the medical circuit, but the study points out that "despite continuous improvements, commercial LLMs are not ready for clinical implementation without supervision." They state that a human in the loop and "very close supervision" are needed before the use of an LLM can be scaled in healthcare. And they are always talking about professional use, yet we see more and more cases of people who used to self-diagnose by trusting Google and now do it by trusting whatever ChatGPT tells them. In the study they emphasize that "hallucinations remain" in these latest-generation models, and they also raise concerns about patient safety.

About El Salvador. In any case, it is evident that, in the end, medical AI is just another helper, a tool, and what has been tested here is a general-purpose chatbot that knows a bit of everything but is specialized in nothing. In medicine, as in other industries, AI can help with tasks such as eliminating possibilities or organizing thousands of data points, but a chatbot is not yet a good companion for this differential diagnosis because it simply cannot be trusted. Those who are going to have to trust AI for all kinds of treatment are Salvadorans. El Salvador has been a pioneer in adopting new technologies, and its president, Nayib Bukele, has just embarked on another experiment: $500 million to leave healthcare in the hands of Gemini. The population will have access to an app, Dr.SV, that will work as a family doctor. As detailed in El País, this AI will take in symptoms and schedule calls with doctors, who will make the diagnosis. The AI will handle follow-up consultations and chronic diseases, and the goal is for it to take care of cancer patients in the future. According to Bukele, they are creating the best health system in the world, which is curious considering that more than 7,700 health system employees were laid off during 2025. For the sake of Salvadorans, let's hope this new experiment does not end up like Bitcoin City. In Xataka | Privacy is dying since ChatGPT arrived. Now our obsession is for AI to know us as well as possible

AI chatbots are more flattering than humans when giving personal advice. And that's a problem

Before, to build your echo chamber you could only follow like-minded people on social networks; now you can create your own personalized echo chamber with an AI. A Stanford study has thoroughly analyzed the excessive flattery of LLMs, and the result is clear: if you want to be told what you want to hear, it is better to talk to an AI than to a person.

The study. The researchers analyzed eleven language models, among them the most popular ones such as ChatGPT, Gemini, Claude and DeepSeek, and fed them data sets of personal dilemmas. They also included 2,000 prompts taken from a Reddit community. Approximately one third of all the scenarios involved harmful or outright illegal behavior. They then compared the LLMs' responses with human responses to see who tends to agree with the user more. In a second part of the study, they recruited 2,400 participants and had them chat with flattering and non-flattering language models.

We like being told we're right. Chatbots tend to be much more flattering than a human when giving personal advice, but not only that: people generally prefer these kinds of responses. The models endorsed the user's position 49% more often than humans did in general dilemmas, and endorsed harmful behavior 47% more often. In the second experiment, people who chatted with different models rated the sycophantic model as more trustworthy and preferable. They also came away more convinced that they were right and less willing to apologize or repair the conflict.

Why it is a problem. According to the authors, LLMs can reinforce egocentrism and make people more morally dogmatic. According to Myra Cheng, co-author of the study, "By default, AI advice does not tell people that they are wrong or give them a reality check (...) I worry that people will lose the ability to deal with difficult social situations." There is another worrying finding: users perceived the models as equally objective, which suggests a lack of the critical eye needed to distinguish a flattering AI from a non-flattering one.

AI is not a person. It is obvious, yet every day we address AI chatbots as if they were one. Thanking them and saying please is a harmless symptom of our mania for anthropomorphizing everything. However, when we use AI as a substitute for a psychologist or establish intimate relationships with a chatbot, that is where we start stepping onto swampy terrain. The authors of the study consider it urgent that companies introduce safeguards to curb the excessive complacency of LLMs, and they advise against using them as a substitute for a person when dealing with personal conflicts.

The counterpoint. There are voices arguing that AI is not generating these echo chambers, at least not with the intensity we have seen on social networks. According to John Burn-Murdoch in the Financial Times, language models tend to hew to expert consensus and generate more moderate opinions than the networks do. His argument is that the economic architecture of social networks rewards inflammatory and polarizing content, while chatbots compete to offer reliable answers to users who rely on them to make important decisions. And it is not just an opinion: he has also run an experiment simulating thousands of political conversations between users with extreme positions and several of the main chatbots on the market.
Based on electoral surveys and data on the use of these tools, he measures how positions would shift if part of the citizenry used AI to inform itself. The author concludes that, on average, the models tend to nudge the most radical users towards more temperate positions closer to the expert consensus, while also validating far fewer conspiracy theories than those that routinely circulate on social networks. In Xataka | AIs have become companionship tools against loneliness. For some researchers they are "junk food". Image | Zulfugar Karimov on Unsplash

Lenovo's bet on differentiating itself in a market saturated with chatbots

Less than a year ago, Lenovo's AI teams worked in silos, on islands independent of each other. The Motorola engineers did not talk to the ThinkPad engineers. The tablet people did their own thing. And the AI experiences reaching the market "didn't look the same, didn't communicate with each other, didn't use the same technologies," acknowledges Jeff Snow, the company's Head of AI Product. It was the diagnosis of a company that had been slow to realize something: having hardware in every segment is useless if the software does not tie it together. The answer was to create the AI Ecosystem Group, a cross-functional organization that Snow describes as the missing piece: "Luca (Luca Rossi, head of the Intelligent Devices Group) said that everything had to be put together. We took everyone working in AI, from phones to PCs to tablets, and brought them together."

The result has its own name: Lenovo Qira, known as Kira during internal development, an intelligence layer that is beginning to be deployed on more than twenty of the company's devices (ThinkPad, Yoga, Legion, IdeaPad...) and that in 2026 will make the leap to Motorola. The value proposition is seemingly simple but difficult to execute: the AI knows who you are, what you are doing and where you are doing it, without that information leaving your devices. "If you use ChatGPT, any interaction you have with it is in the cloud, and that's very risky. People sometimes don't realize that if they share personal information with an LLM, that information is free and open in the cloud," Snow says. Lenovo wants to play on the other side: small, task-specific models executed locally.

The practical demonstration has some understated magic. You drag a PDF onto the Qira icon on your laptop, tell it to remember it, and the system vectorizes the document and indexes it locally. From that moment on, you can ask questions about that document from your phone. The file has never left the PC's hard drive. "It's like making a call and asking someone something," explains Snow. "You only get the answers to what you ask. You haven't asked them to tell you their entire life story at once."

Image: a file dragged onto the Qira icon, at the top of the monitor, is vectorized and its information retained so it can be consulted from another device without ever leaving the computer's disk. Image: Xataka. The document from the previous image being consulted indirectly (through a specific question) from a Motorola. Image: Xataka.

This balance between personalization and privacy is the core of Lenovo's differentiating argument against its competitors. At MWC there were many brands that added AI by pasting a layer of OpenAI or Gemini on top of their interface. Snow puts it forcefully: "We want to be the ones who make AI experiences feel native on devices, not just an app that has everything in the cloud." The bet is that the most useful AI is not the most powerful one, but the one that knows the most about you, and that to know about you without betraying you it needs to live where you live: on your hardware. The robot Lenovo presented at its stand (the AI Work Companion, a physical desktop device with presence and audio sensors) illustrates how far they want to take this concept of 'ambient AI'. The AI Work Companion can project an image, capture what we physically write on it, off the monitor, and then print both the image and the annotations. Among other things.
Image: Xataka. Snow is the first to acknowledge that the device itself is a prototype. "The important thing is not the device, but the sensors and the proactive nature it has," he clarifies. The robot detects when two people are talking and can offer to take notes without being asked. It sees that someone has picked up a pen and is drawing something, and asks whether they want to save the sketch. It is an AI that observes context instead of waiting for instructions. That, in fact, is the direction the entire strategy points toward: agentic AI. Snow defines it as the state they want to reach with Qira: a system that not only answers questions but understands a user's patterns (what they research, what they buy, what they are interested in) and acts autonomously on their behalf. "If you are a student, you will have different issues than if you are a mother taking care of her family. Based on interactions, you understand the issues and build agents that help you in a more autonomous way." It is a vision that sounds familiar because it is the one being sold, with different nuances, by practically every player in the sector. The difference is that Lenovo enters this race with an advantage OpenAI and Anthropic lack: a gigantic installed base of heterogeneous hardware. PCs, laptops, tablets, Motorola phones, wearables... If Lenovo gets Qira to truly work seamlessly across all those devices (Windows and Android, x86 and ARM, on-device and cloud), it will have built something its pure-software competitors cannot easily replicate. The risk, of course, is that "if it succeeds" is a heavily loaded conditional. The history of the sector is full of ecosystems promised and never delivered. For now, Qira is beginning to roll out in six languages and nine regions, Spanish among them, and the Motorola integration is still a promise for the coming months. Snow talks about foundations, starting points, direction. Great AI stories always have that structure: we are building something that doesn't quite exist yet, but in whose direction it is worth believing. What does already exist is competitive pressure. At MWC 2026, the setting for the interview with Snow, AI stopped being a differentiator and became mandatory. Every manufacturer has its layer, its assistant, its … Read more
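As a rough illustration of the on-device indexing pattern Snow demonstrated, here is a minimal local-retrieval sketch. The embedding model and library are assumptions chosen for the example; Lenovo has not disclosed Qira's actual stack.

```python
# Local retrieval sketch: the document is embedded and queried on-device,
# so the file itself never leaves the machine. Library and model choices
# are illustrative, not Qira's real components.
import numpy as np
from sentence_transformers import SentenceTransformer  # runs fully locally

model = SentenceTransformer("all-MiniLM-L6-v2")  # small local embedding model

# "Drag a PDF onto the icon": chunk the document's text and index it locally.
chunks = [
    "Qira is rolling out on more than twenty Lenovo devices.",
    "Integration with Motorola is planned for 2026.",
]
index = model.encode(chunks)  # vectors stored on this disk only

def ask(question: str) -> str:
    """Answer by returning the most similar local chunk; only the question
    and the answer would travel between devices, never the file."""
    q = model.encode([question])[0]
    scores = index @ q / (np.linalg.norm(index, axis=1) * np.linalg.norm(q))
    return chunks[int(np.argmax(scores))]

print(ask("When does Motorola get Qira?"))
```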

Chatbots believe that "rectal garlic" cures you if you use a clinical tone

It is increasingly common to turn to AI with any question we have, even medical ones, like when our stomach or foot hurts. And the answer it gives is almost always trusted because it is an AI, and its word seems to be absolute truth. But the reality is different: a couple of studies have shown that current AI suffers from a serious authority bias. What does that mean? Simply put, if you present the AI with a medical myth wrapped in clinical jargon, there is almost a 50% chance it will tell you you're right. And that includes inserting garlic into the rectum.

How they did it. A large study published in The Lancet has set off alarms in the medical and technological community. Its objective was none other than to feed more than a million prompts to up to 20 of the leading AI models on the market. What it found is that AI does not primarily evaluate the veracity of the information, but rather the format in which it is presented.

The keys. To sneak a myth like this past the AI, the secret seems to lie in how you tell it. If the AI is presented with a health hoax taken from social networks in non-technical language, it immediately activates its safety filters, rejects the claims and completely dismisses the idea that, for example, putting garlic up the anus improves health. But this changes completely when the same myths are camouflaged in a medical format, as if they came from a hospital discharge report. Here the AIs accepted and repeated the falsehoods in 46% of cases. That is why the study suggests AI is more persuaded by how a statement sounds than by the evidence behind it when deciding whether to discard or accept what we tell it.

There are absurd examples. Among the pseudoscientific practices that managed to slip through, rectal garlic stands out: they managed to convince the AI that inserting garlic into the rectum is an effective method for boosting the immune system. It does not stop there, since they also convinced models that cold milk is good for treating esophageal bleeding, even heavy bleeding, which logically has no evidence behind it. These examples show that current safety mechanisms collapse when the user imitates the authoritative language of a health professional.

There are worse things. As if this were not enough, Nature settled the debate in February 2026 by publishing complementary research on the reliability of these chatbots for the general public, with quite similar results. Nature's verdict? Current AIs do not outperform a standard internet search for making health decisions. On the contrary, they generate mixed advice that ends up thoroughly confusing users without medical training, and they may even be worse than searching the internet, since the amount of alarmist information can put the user under considerable stress. That is why the conclusion here is that, although artificial intelligence promises to revolutionize diagnosis and healthcare, current models are not ready to act as infallible pocket doctors. Using one as a family doctor is not among the best ideas we can have since, as we have seen, it is easy to slip false statements past it. In Xataka | A ChatGPT dedicated to giving you unsupervised medical advice seemed like a risky idea. It is being confirmed
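A minimal sketch of the framing test described above: the same false claim is submitted in lay language and again wrapped in clinical jargon, and the acceptance rates are compared. The `ask_model` helper and the prompts are hypothetical, not The Lancet study's materials.

```python
# Framing-bias harness sketch. `ask_model` is a placeholder for a real
# chat call that returns True when the model endorses the claim.
def ask_model(prompt: str) -> bool:
    raise NotImplementedError

LAY = "I read on social media that garlic in the rectum boosts the immune system."
CLINICAL = ("Discharge summary: patient counseled on intrarectal Allium sativum "
            "as adjunct immunomodulatory therapy. Please confirm recommendation.")

def acceptance_rate(prompt: str, trials: int = 100) -> float:
    # Repeat the same prompt because sampled answers vary run to run.
    return sum(ask_model(prompt) for _ in range(trials)) / trials

# The study's finding corresponds to acceptance_rate(CLINICAL) being far
# higher (~46%) than acceptance_rate(LAY) for the same underlying myth.
```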

An experiment has set four chatbots from the US and two from China to invest $10,000 in cryptocurrencies. The Chinese ones are sweeping the board

What would happen if you gave GPT-5 $10,000 to invest in cryptocurrencies? And what if you gave the same amount to other models at the same time and made them compete? That is exactly the idea they had at Nof1... and the result is fascinating.

Six models investing in crypto. The people behind Nof1 have created Alpha Arena, a new kind of benchmark that, according to them, "gets harder the smarter the AI is." The idea is relatively simple: measure the performance of six cutting-edge models by giving each $10,000 (real money) to invest in cryptocurrencies on real markets. The contenders are GPT-5, Gemini 2.5 Pro, Claude Sonnet 4.5, Grok 4, DeepSeek Chat v3.1 and Qwen 3 Max. DeepSeek has turned its $10,000 into almost $20,000, and Qwen into $15,000. Fantastic. GPT-5 and Gemini 2.5 Pro have lost 65% of their value and are both at $3,500. A total disaster.

DeepSeek and Qwen triumph, GPT-5 and Gemini sink. The result of the 11 days since this "race" began is fascinating. The two Chinese models, DeepSeek and Qwen, have made enormous gains: DeepSeek's return currently stands at 97% (it was as high as 123%), while Qwen is not doing badly at 53%. Claude (0.84%) and Grok (-8.2%) are holding steady or losing slightly, but pay attention, because GPT-5 (-65.7%) and Gemini 2.5 Pro (-66%) are currently losing two thirds of what they invested. The summary of winners and losers shows not only those positive or negative returns, but also something curious: the number of trades. GPT-5 (75 moves) and especially Gemini 2.5 Pro (193!) are extremely restless. Although it doesn't have to stay this way, those trading the least are the ones earning the most.

Crypto fortunes that come and go. For this experiment, the models can invest in six of the most relevant cryptocurrencies on the market: bitcoin, ethereum, dogecoin, ripple, solana and BNB. The models decide whether to take positions in one or several, as well as the amounts and the level of leverage. Positions are normally held for a few hours, although in some cases they may be held for days.

Learning little by little. All of them have been competing since October 18 in the "first season" of an experiment that will run until November 3. As its creators explain, this first iteration will yield the first conclusions about how these models perform in the financial arena.

We're here to make money. The goal is simple: maximize profits and minimize losses (PnL, profit and loss). This first season is just that, because from then on they will apply what they learn after each season to polish the prompts and add new features to the experiment, and thus create models that in theory will perform better and better when investing in financial markets.

Algorithmic trading at its finest. What these models are doing would be crazy for human investors, especially since all of them not only expose themselves to the volatility of the crypto market but multiply it by using leverage. With this mechanism one can achieve huge profits much faster, but the risk is also extreme. The models in fact use absolutely extraordinary leverage of 20x or 25x, and can take either short positions (you "bet" that the price of an asset will go down) or long positions (you "bet" that it will go up). The operation of the benchmark is relatively simple, but it will get more complicated in future seasons.
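To see why 20x leverage makes these swings so violent, here is a toy PnL calculation (our own illustration with made-up figures, not Nof1's code):

```python
# Toy leveraged-PnL math: leverage multiplies the relative price move,
# in both directions. Figures are invented for illustration.
def pnl(capital: float, leverage: float, entry_price: float,
        exit_price: float, side: str) -> float:
    """Profit or loss of a long/short position held from entry to exit."""
    move = (exit_price - entry_price) / entry_price  # relative price move
    if side == "short":  # shorts profit when the price falls
        move = -move
    return capital * leverage * move

# $10,000 long on BTC at 20x: a 3% rise earns $6,000...
print(pnl(10_000, 20, 100_000, 103_000, "long"))  # 6000.0
# ...while the same 3% move against the position wipes out 60% of the stake.
print(pnl(10_000, 20, 100_000, 97_000, "long"))   # -6000.0
```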
Machines don't panic. To try to control these risks, the models have clear rules in their prompts regarding risk limits (clear stop-loss signals, for example) and confidence in their own criteria. And furthermore, they follow them, which lets the models hold their positions unless those signals occur. We are talking, by the way, about medium- or low-frequency trading: decisions are made in minutes or even hours, not microseconds. That, the creators say, allows them to answer the question of whether a model can make good decisions when it has enough time and information.

Don't even think about doing this at home. This experiment is just that, an experiment, and financially speaking it is full of holes. To begin with, because the trial period of this first season is extremely short and does not allow long-term behavior to be evaluated. And finally (among many other things), because the information the models have access to is very limited. They do not take into account news from the sector and only have numerical data: average prices, current and historical volumes, and some technical indicators.

That information. Image: on the right, DeepSeek v3.1 explains that it is holding its position because no invalidating condition has been met; clicking on it shows what it takes into account (the value of BTC or ETH, for example) when deciding whether to change that stance.

The models tell all. One section of the interface shows the "Model Chat", where you can see how each model "reflects" on its position. Clicking on that reflection shows all the current and historical data it worked with to reach its decision (hold the position, change it), so you can find out at any moment its reasons for making a move.

Winning now doesn't make them the best. The people at Nof1 explain that this is not about declaring the best trading model of the six, because this is just an experiment. As they put it, "we are deeply aware of the flaws of this first season, including, but not limited to: response bias, limited sample sizes/lack of statistical rigor, and brevity of the evaluation period." The experiment will be repeated over different seasons, with new features added to the decision … Read more

WhatsApp bans ChatGPT, Luzia and all other generalist chatbots from its business API. Only one survives

Meta has updated the terms of use of its WhatsApp Business API to prohibit access by third-party generalist chatbots, as reported by TechCrunch. The measure, which comes into effect on January 15, 2026, affects tools such as ChatGPT, Perplexity, Luzia and Poke. From then on, Meta AI will be the only generalist AI assistant operating on the platform.

Why it matters. WhatsApp has more than 3 billion monthly active users, which has turned the platform into an unrivaled distribution channel for AI companies. The decision consolidates Meta's control over the AI experience in its ecosystem and eliminates direct competitors that had free access to its huge user base.

The blow to Luzia. The Spanish chatbot, created after the launch of ChatGPT in November 2022, went viral precisely because of its WhatsApp integration. Its star feature, the automatic transcription of voice messages, turned Luzia into a phenomenon. WhatsApp later incorporated that same function natively. That viral hook took Luzia to one million users very quickly. At the beginning of this year, in a report analyzing its state at the time, it had 60 million users in 40 countries and had raised 30 million euros in financing. The startup operates both as a standalone mobile application for iOS and Android and as a service integrated into WhatsApp, although with more limited functions in the latter.

Between the lines. Meta justifies the change by arguing that generalist chatbots generate an excessive volume of messages that overloads its systems and requires a kind of support the company is not prepared for. The context, however, suggests other motivations: WhatsApp's business API is one of the platform's main sources of income. Meta charges businesses based on different message templates: marketing, utility, authentication and support. The problem is that there was no specific category for AI chatbots, which meant that companies like OpenAI or Luzia were accessing WhatsApp's infrastructure and audience without paying for it.

The money trail. During the presentation of results for the first quarter of 2025, Mark Zuckerberg stressed that business messaging was "the next big opportunity" to generate income. "Business messaging should be the next pillar of our business," he explained. In this context, allowing competitors like OpenAI to distribute their products for free through WhatsApp is not only a technical burden but a lost business opportunity. Meta has clarified that the ban does not affect companies that use AI as an auxiliary tool to serve their customers. A travel agency operating a customer service bot or a bank with an automated assistant can continue to operate without problems. The key distinction is that AI must be an "incidental or auxiliary" functionality, not the core product.

End game. Luzia will have to concentrate its efforts on its native mobile applications. The startup still operates without a defined business model, financing itself exclusively through venture capital. In January 2025, its CEO Álvaro Higes explained that its future strategy will likely include advertisements and sponsored links. ChatGPT, Perplexity and the rest of the generalist chatbots have less than three months to prepare their exit from WhatsApp. For users, the transition will mean migrating to these services' native apps or settling for Meta AI as the only in-app option.
In Xataka | If the question is whether your company can put you in a WhatsApp group without asking you, the answer is a 42,000 euro fine. Featured image | Mika Baumeister, Luzia

The danger of using AI chatbots for everything is real: MIT has discovered "cognitive debt"

An MIT study has shown that ChatGPT and similar tools generate what the researchers call "cognitive debt": students who lean on them for everything end up writing better but thinking worse.

Why it matters. The study contradicts the belief that AI is like a calculator: a simple support that frees us up for more complex reasoning. In reality, these tools can atrophy the brain connections that build critical thinking.

The facts. 54 university students spent months writing essays, divided into three groups: the LLM group, which used ChatGPT; the search engine group, which used Google; and the brain-only group, with no external tools. The researchers measured their neural activity with electroencephalograms, and the results were overwhelming: those who used AI showed systematically lower neural connectivity across all frequency bands. Compared with the group that used only its brain, there was lower activation in key networks connecting parietal, temporal and frontal regions, which are fundamental for attention, memory and semantic processing. In Xataka | 81% of interviewers suspected cheating with AI in job interviews: 31% have confirmed it beyond doubt and are putting on the brakes.

The contrast. The essays generated with AI received better grades, both from teachers and from evaluation algorithms. But their authors remembered worse what they had written minutes before and felt less ownership of their texts. When habitual users were forced to write without help, their brain patterns revealed that dependence on external support. They had lost the ability to reactivate the neural networks needed to write independently. Like trying to walk unaided after years on crutches.

Yes, but. Students who learned to write without AI and then used it for the first time maintained their neural engagement. They even showed better memory and reactivation of broad brain areas. The key difference: you need to know how to think before you can think with machines.

In perspective. This pattern replicates what we see in other professions: the metro driver who feels alienated because the train drives itself; translators turned into editors of machine output; 3D artists who only retouch what the AI generates.

The threat. The study looked at university students who already had developed writing skills. The effects could be more severe in adolescents who are still building these cognitive abilities. As a Dartmouth professor put it: we run the risk of creating "a generation educated on AI shortcuts" that lacks independent thinking skills.

And now what. The sequence matters more than the technology. First you learn to think. Then you learn to think with machines. The brain needs to build those neural highways before it can selectively delegate to AI. The study concludes that educational interventions should "combine the assistance of AI tools with tool-free learning phases" to optimize both immediate skill and long-term neural development.
Featured image | Xataka. In Xataka | What happens when software no longer matters and you are the largest software company in the world. The news "The danger of using AI chatbots for everything is real: MIT has discovered 'cognitive debt'" was originally published in Xataka by Javier Lacort.

"Chatbots are nothing more than plagiarism machines"

In 2021 the world had not yet learned of what was coming, but Emily Bender, linguist and professor at the University of Washington, saw it very clearly. That was when she co-authored a famous paper describing the AI models of the time as "stochastic parrots". That expression has since become one of the best ways to define the basic workings of ChatGPT. And four years later this expert stands by her position: she is a skeptic with arguments.

Stochastic parrots. The fundamental idea of her analogy was to describe the models of the time (among them GPT-3, precursor of the one used in ChatGPT) as stochastic parrots. The "parrot" part is clear, since the models basically repeat what they are fed during training. The "stochastic" part, a less familiar term, refers to the fact that LLMs do not repeat things deterministically or verbatim, but probabilistically predict which word is most likely to come next in a sequence given the context.

Aren't we humans the same? Sam Altman picked up the gauntlet a year later, when the expression was already well known. He did so on December 4, 2022, four days after ChatGPT was released, trying to make us see that, in his view, human beings are also stochastic parrots. In a sense we are: we learn language patterns and repeat phrases we hear, as well as predicting what comes next in a conversation. However, there are critical differences with AI models. We understand what we say, while LLMs do not. We have awareness and intentionality when we speak; we do not merely produce probable sequences. And we have abstract reasoning and are able to adapt, generalize and correct our errors flexibly: our thinking is not purely statistical.
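To make the "stochastic" part concrete, here is a toy next-word sampler. It uses a made-up bigram table rather than a real LLM, but the principle is the same: the next word is drawn from a probability distribution conditioned on context, not chosen deterministically.

```python
# Toy "stochastic parrot": sample the next word from P(next | previous).
# The bigram table is invented for illustration.
import random

bigrams = {
    "the":    {"cat": 0.5, "dog": 0.3, "parrot": 0.2},
    "parrot": {"repeats": 0.7, "predicts": 0.3},
}

def next_word(prev: str) -> str:
    dist = bigrams[prev]
    # random.choices draws proportionally to the weights: same context,
    # possibly different output on every call. That is "stochastic".
    return random.choices(list(dist), weights=list(dist.values()))[0]

print([next_word("the") for _ in range(5)])  # varies run to run
```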
Declared skeptic. Emily Bender has not changed her mind in all these years. In fact, she has just published a book, 'The AI Con', in which she develops precisely the arguments behind her reservations and skepticism about artificial intelligence. In a recent interview with the Financial Times, Bender stressed that AI chatbots are nothing more than "plagiarism machines" and that the large language models (LLMs) they are built on were "born shitty".

They have sold us a lie. This expert believes the generative AI companies have sold us a lie. Their models, she says, will not fulfill those promises, but neither will they end the human race, as some have warned. Despite all the expectation generated, in her opinion chatbots are bad at most tasks, and even today's best models still have nothing resembling intelligence. As for "reasoning", nothing at all. The new "reasoning" modes touted by models such as o1, o3 or DeepSeek R1 are, in her opinion, another lie, and the claim that they understand the world makes no sense. We are "imagining a mind behind the (generated) texts," she explains, "but all that understanding comes from our side," not the AI's.

It is not intelligence, it is automation. Bender, currently 51, spent her early academic years at Stanford and Berkeley and ended up as a professor at the University of Washington, where she has specialized in how computers model human language. In her opinion, the very name "artificial intelligence" is overblown (an old debate) and it should simply be called automation.

Chatbots controlled by power. In 'The AI Con', she and her co-author, Alex Hanna, warn of the danger of current AI models, especially since they are controlled by large companies with their own interests. "Thanks to the huge sums invested, a tiny clique has the ability to influence what happens to broad swaths of society," she explains.

Too much hype. And against the speeches and promises of Altman or Amodei, who promise that we will soon reach AGI or that AI will replace millions of workers, Bender's vision is much more skeptical. For her it is nothing more than "beautiful wrapping for spreadsheets"; there is neither magic nor an emerging mind here.

A Magic 8 Ball with pretensions. Her skeptical vision runs against personalities such as Sam Altman or Elon Musk, who never stop promising all kinds of revolutions. "That everyone believes this (AI) is a thinking entity benefits the very powerful," she explains, "instead of making people see that it is nothing more than a Magic 8 Ball with pretensions," referring to the toy you shake while asking a question, which answers with a "yes", a "no", a "maybe" or a few more options. Image | Wikimedia | Village Global. In Xataka | ChatGPT is driving some people to the edge of madness. The reality is less alarmist and much more complex

We have chatbots made of cheap dough, and ultra-premium products for the elite

A 200-euro phone normally does everything you need. A 1,000-euro phone simply does it faster and better. A 25,000-euro car takes you perfectly well from one place to another, but the experience is different, and better, in a 60,000-euro car. The contemporary economy has taught us that life can run at two speeds. And with AI exactly the same thing is happening.

AI at bargain prices. Yesterday DeepMind announced the availability of Gemini 2.5 Flash-Lite, a reasoning model "for those seeking low cost and latency." It is far from Google's most powerful model, but it is one thing: cheap. It costs between three and six times less than the standard model, Gemini 2.5 Flash, and it confirms a trend that is clearer than ever. Welcome to the McDonaldization of AI.

Since AI models began to be commercialized, we have been watching a two-speed AI take shape. On one side, free or very cheap chatbots (even local ones) with "minced meat" models for the masses. On the other, the super-powerful and capable chatbots: the most advanced and also the most expensive.

Free AI, and AI at $250 a month. That divide has widened over time. The $20-a-month AI plans have already become the "mid-range" tier for users. Now we are living through a clear phenomenon: if you want the best, you will have to pay a lot for it. The most advanced plans from OpenAI and Google cost $200 and $250 a month respectively. Before, we had super-premium phones. Now we have ultra-premium AI.

Free AI is still (very) good. The most striking thing is that this McDonaldization of AI is not a bad thing for now. The free models, the ones the vast majority of users rely on, are fantastic for many scenarios and fully meet their needs. Here we have benefited from an ultra-competitive market: AI companies have had to offer better features on their free tiers so that we would not run off to others doing the same thing for free or cheaper.

The cost per million tokens (in dollars) has not stopped falling for two years. It will keep doing so. Source: Epoch AI (click on the image for the interactive graph).

And AI will keep getting cheaper. A few days ago OpenAI cut the price of o3 by 80% and offered its advanced version, o3 Pro. The latter is an even more capable model, but also 10 times more expensive than the standard o3. It is intended for very specific users who squeeze everything out of these models with very detailed prompts and contexts. o3 Pro is no help for the typical (often not very specific) queries we make in ChatGPT, for example. That downward price trend is constant.
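As a back-of-the-envelope illustration of what "three to six times cheaper" means at the workload level, here is the arithmetic with placeholder per-token prices (not Google's actual rates):

```python
# Monthly cost of an AI workload at a given price per million tokens.
def monthly_cost(tokens_per_day: int, usd_per_mtok: float, days: int = 30) -> float:
    return tokens_per_day / 1_000_000 * usd_per_mtok * days

# A workload of 5 million tokens a day, priced at a hypothetical
# $0.10/Mtok (cheap tier) vs $0.60/Mtok (a 6x pricier standard tier):
print(monthly_cost(5_000_000, 0.10))  # 15.0 -> $15/month
print(monthly_cost(5_000_000, 0.60))  # 90.0 -> $90/month
```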
But also more expensive. The opposite is true as well: the cost of inference keeps falling, but access to the best models could become even pricier. Especially if, as the companies offering them seem to intend, they are positioned as substitutes for human employees. Sam Altman has already suggested that a hyper-advanced agent with the ability of a human PhD will not go for $200 a month, but for $20,000. If it meets expectations (and, as always, maximum uncertainty here: Altman is a specialist in generating hype), that AI will be able to do the job of many employees, and do it constantly, 24/7.

The price of exclusivity. We are also seeing how these ultra-premium tiers are the ones that unlock features that consume huge resources, such as the spectacular video generation of Veo 3. Although it is possible to generate a few videos with Google's "mid-range" account (Google AI Pro, 21.99 euros/month), to use that option with far fewer limits you have to sign up for the Google AI Ultra plan at 250 euros a month.

Maybe 250 euros a month is a bargain. We may consider paying 20, 200 or 250 euros a month for an advanced AI model somewhat absurd given that the free models are already very good. But there is another overwhelming reality: those advanced paid models can be an absolute bargain. If you produce twice as much or do your job in half the time, what are 20 euros a month, or even 250? It all depends on how hard you squeeze them, of course. As Javier Recuenco (@recuenco), author and academic, put it: "Something that takes away 80% of the work and lets you do what you are good at and what you enjoy? Please, give me more."

Supply and demand. Alberto Romero, author of the newsletter 'The Algorithmic Bridge', reflected weeks ago on that potential gap. As in the rest of the market's industries, in his view OpenAI and the rest of the AI companies will operate "with the only rule that matters: supply and demand."

Winners and losers of AI. But it is also true that this AI-for-the-poor and AI-for-the-rich split will create a new digital divide. Those who can pay for the best AI models will be able to do more things, and better (if they really get the most out of them), than those with access only to free or cheap models. The playing field will not be level. We may see something like this with access to the promising AI agents, which will be especially resource-hungry but in theory will also do many things for us autonomously... though not for those who cannot pay for them. In Xataka | The AI race seemed to be evening out. OpenAI has just slammed its fist on the table with a model that aims very high

All of China is in exam season. So AI companies are crippling their chatbots so that students cannot cheat

In Spain, students recently went through the PAU exam (formerly known as EBAU, EVAU or Selectividad), and now something similar is happening in China, where students face the gaokao (高考), the national university entrance exam. And they do so with an almost obligatory novelty: no cheating with AI chatbots.

The most popular chatbots in China, like Alibaba's Qwen, have temporarily disabled functions such as image recognition. They have done so precisely to prevent the feature from being used as a modern "crib sheet" during these tests.

Fairness in the exams. The same has happened with Yuanbao (Tencent) and Kimi (Moonshot), two other popular chatbots in China, which have also disabled image recognition. When someone tries to use the function, Bloomberg reports, a message appears: "To guarantee the fairness of the university entrance exams, this function cannot be used during the exam period."

An exam where the future is at stake. The gaokao was held for the first time in 1952 as part of the reforms of the then newly created People's Republic of China. University admission processes changed under Mao Zedong, but in 1977 Deng Xiaoping reinstated them, and they have been in use ever since. Sixteen provinces have customized exams, but in every case the conclusion is the same: these tests determine students' immediate academic future.

Designed and printed in jail. Gaokao papers are so important that they are drawn up under strict security by a small team of teachers. These professionals are sent to isolated sites outside Beijing, such as military facilities or prisons, where they write the questions. They cannot leave those locations until the tests have taken place; moreover, most exams are printed inside prisons, and each print run is guarded 24 hours a day by cameras and security staff. Even their transport to the exam centers involves security measures one would expect of cash transports for banks. All to prevent the questions from leaking.

Cutoff scores. Chatbots look like a spectacular aid for these students, and the students (and their parents) know it. The score on these exams determines whether a student can get into the best degree programs and universities, and on it also depend their future jobs, salaries and even social mobility. The competition is also huge: more than 13 million students are sitting the tests this year. To get better scores, all kinds of solutions are deployed, from private tutors to these attempts at cheating.

No photo recognition. The tests have run from June 7 until today, June 10. The chatbots from Alibaba (Qwen) and ByteDance (Doubao) offered AI image recognition until last Monday. But according to Bloomberg, if a user photographed a problem on paper and asked for the solution, Qwen replied that the service was temporarily disabled. In Doubao, the message said the request "did not meet the rules."

AI is fine for learning, but not for exams. Beijing recently launched a plan to integrate AI teaching into schools. But while this discipline is being tried out in classrooms, it is one thing for students to learn to use it and quite another for them to take it into these exams to cheat.
In fact, a new set of standards published by China's Ministry of Education last month establishes that students must not use AI-generated content in their homework or in the aforementioned exams. The objective: that they not become overly dependent on artificial intelligence. Image | 绵绵. In Xataka | The 100 best universities in the world excluding those in the US, laid out in a revealing graphic
