usatoday24

Theodore Twombly, the main character of the movie ‘Her’, fell in love with a machine called Samantha. He didn’t even need to see her or touch her. It was enough to listen to her voice, which was actually that of actress Scarlett Johansson.

That was science fiction, but little by little we are approaching a point where falling in love with a machine no longer is. We have been seeing this for some time with Replika, the AI service that lets virtual avatars become our friends, or something more.

The true GPT-4o revolution is being able to talk to machines as if it were nothing

That service does this with an AI model that generates text, such as ChatGPT. Until now we chatted with machines in writing, but little by little we are beginning to talk to them. ChatGPT’s voice modes offer precisely that option, and in fact the company had to withdraw one of its voices for being too similar to Scarlett Johansson’s.

But now an artificial intelligence startup called Sesame has gone one step further. At the end of February the company published a demonstration of its Conversational Speech Model (CSM), and its impact has been remarkable.

Some users have reported feeling an emotional connection with the model’s female and male voices (“Maya” and “Miles”). One of them, who published his impressions on Hacker News, explained: “I am even a little worried that I will start feeling emotionally attached to a voice assistant with this level of human-like sound.”

Anyone can try speaking with Maya or Miles thanks to the demo on the Sesame website. The only obstacle is that conversations must be in English: these models do not speak other languages at the moment.

I just tried it for a few minutes, and the way this conversational chatbot works is really surprising. The voice is warm and friendly, but above all it perfectly imitates the way a person would speak, with pauses, hesitations and changes in intonation. Voice generation is instantaneous, there is no latency, and the sensation is certainly that of having a conversation with another human being. It’s strange, exciting and disturbing at the same time.

As those responsible explain on their blog, “at Sesame our goal is to achieve ‘voice presence’, that magical quality that makes spoken interactions feel real, understood and valued.” They are aiming at something similar to what Replika aimed for: creating “conversational companions” that offer genuine dialogue with which to build trust over time.

These models are not perfect. Maya, for example, has been shown to do strange things from time to time, but comments on some discussion forums, such as this Reddit thread, make it clear that the quality of these models is spectacular.

If you want to check the quality of this model, pay attention to this. Source: Reddit.

And if you do not believe it, take a look at this conversation that Gavin Purcell, one of the people behind the podcast AI for Humans, posted on Reddit, in which he argues with the machine in an unsuccessful attempt to find its limits.

He does not seem to succeed, and in fact it is impossible to tell that one of the speakers is a machine. Its response speed, its changes in tone, its choice of phrases and words: all of it is amazing. Sesame’s conversational chatbot also lets it play different roles (“roleplaying”), something that OpenAI, for example, usually limits.

OpenAI has been working on its voice modes for ChatGPT, and Grok 3 has also implemented different synthesized voices that adapt to diverse personalities. There is even an “unhinged” voice and a “sexy” one, for example, which demonstrates once again that Musk and xAI do not mind experimenting.

As Ars Technica reports, Sesame has achieved this advance thanks to two models (a backbone and a decoder) that work together. Both are based on the Llama architecture, and Sesame has defined three different sizes. The largest of all combines a backbone model of 8 billion parameters with a decoder of 300 million, which results in a combined 8.3B model. To train it, the company used a million hours of English audio.

Comments in a Hacker News thread made it clear that the quality of Sesame’s voices is almost human, but even so users continued to notice that something was off. One of Sesame’s co-founders, Brendan Iribe, took part in the discussion, thanking users for those comments and confirming that the team still has a lot of work ahead. The model is “still too eager and often inappropriate in its tone, prosody and pacing”, he explained, and it has problems with interruptions, timing and conversational flow. “Today we are firmly in the (uncanny) valley”, he said, “but we are optimistic that we can get out of it.”

Your daughter calls you, she is in danger, she asks you for thousands of euros immediately. On the other end of the phone there is an AI

The possibilities seem almost unlimited for this type of model, but they cut both ways, for better and for worse. Their use to impersonate people, for example, has already caused some serious scares. Here, creating a “family password” can be very useful for avoiding some of those problems, although for the moment cloning voices is not allowed.

We will see how AI companies react to these kinds of problems, but everything indicates that this future in which we will talk constantly with machines (and even fall in love with them) is getting closer.

In Xataka | Be careful about falling in love with your chatbot: OpenAI warns that GPT-4o can reduce the need to socialize with human beings

We have tried Sesame’s conversational chatbot. It is the closest experience to a “human voice” that we have seen
