Most of the time, the keyboard is how we use ChatGPT. The same goes for Gemini, Claude, Llama, or any other model on the market. And yet direct interaction by voice is gaining ground. Talking to machines is gradually becoming less strange. And it will become even less so.
Llama 4. The Financial Times cites sources close to the development of Meta's new model. Llama 4 (if that ends up being its name) will focus on improved voice interaction features. There will also be options aimed at that future of AI agents, no doubt, but voice looks set to be a special protagonist. The new model is expected to arrive "in the coming weeks."
Native voice. Chris Cox, one of Meta's top executives, indicated that Llama 4 will be an "omni-model" in which "the voice will be native." Until now, he explained, the process was cumbersome: you had to convert voice to text, send the text to the LLM, get the answer back as text, and then turn it into voice again.
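The contrast Cox describes can be sketched in a few lines. This is a toy illustration, not Meta's actual architecture: every function here is a hypothetical placeholder standing in for a real ASR, LLM, or TTS model.

```python
# Toy sketch of the cascaded voice pipeline vs. a native omni-model.
# All functions are illustrative stand-ins, not any real API.

def speech_to_text(audio: bytes) -> str:
    # Placeholder ASR model: pretend we transcribed the audio.
    return "what is the weather today"

def llm_respond(prompt: str) -> str:
    # Placeholder text-only LLM.
    return f"Answering: {prompt}"

def text_to_speech(text: str) -> bytes:
    # Placeholder TTS model: pretend we synthesized audio.
    return text.encode("utf-8")

def cascaded_voice_chat(audio: bytes) -> bytes:
    # The "cumbersome" approach: three separate models and two
    # conversions, losing tone, pauses and prosody at each hop.
    transcript = speech_to_text(audio)
    reply_text = llm_respond(transcript)
    return text_to_speech(reply_text)

def native_voice_chat(audio: bytes) -> bytes:
    # The "omni-model" idea: a single model consumes audio and
    # produces audio directly, with no intermediate text bottleneck.
    # (Sketch only; real omni-models operate on audio tokens.)
    return b"native-audio-reply"
```

The point of the native design is that the middle step never flattens the signal into plain text, so cues like intonation and timing can survive end to end.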
A revolution, they say at Meta. That native conception of voice is especially important for interacting with chatbots, but also with hardware. And here the Meta Ray-Ban glasses could be the big beneficiaries. As Cox put it, it matters "for the product interface, the idea that you can talk to the internet and ask it anything. I don't think we've yet grasped how powerful that is."
Everyone's after the same thing. Meta is far from the only one thinking this way. Google has long offered voice features in Gemini on our phones, and it has an advantage because we were already used to Google Assistant. OpenAI amazed us months ago with GPT-4o and a voice that could even turn into a tutor for any subject.
Elon Musk and his startup xAI have built a particularly chatty Grok 3, one that can adopt custom personalities such as "unhinged" or "sexy" when talking to us. Claude seems to lag behind here, but even Alexa+, Amazon's new model, has a very strong AI conversational component, which makes sense given where it comes from.
Almost human voices. And while the voice assistants of a few years ago offered adequate but somewhat flat voices, today's AI models achieve voices that are practically indistinguishable from a human's. Yesterday we talked about Sesame and its synthesized voice, which pauses and changes tone to adapt to the conversation as a human would. Others, such as ElevenLabs, are in that same race.
Why type at all. Although the keyboard has always had the advantage of letting us "think before we speak," direct voice interaction with AI models seems far more powerful in the many scenarios where real-time conversation is the winning option.
Get ready to talk to the machines. All these efforts point to the same place: talking to an AI. One that for now probably lives in the cloud, but that can soon run directly on our phone, and also on connected glasses like Meta's (a lot of future there), on earbuds, or on a smartwatch. And as in 'Her', we may start seeing lots of people wearing glasses or a headset who seem to be talking to themselves. When in fact they're talking to an AI.
Image | Warner Bros Pictures
In Xataka | Are we prepared to talk to the machines?