For OpenAI, 2026 will have a clear protagonist: voice

In the last two months, OpenAI has unified several engineering, product and research teams around a single objective: revolutionizing its audio models. The company is preparing a more natural voice model for the first quarter of 2026, capable of handling interruptions and speaking while you speak, according to a report from The Information.

Why it matters. This move is not just about improving ChatGPT; it aims to make audio the main interaction interface, pushing screens into the background, at least in certain use cases. This is what first-generation smart speakers tried, unsuccessfully, a decade ago. The bet is to build personal devices that work exclusively by voice, with a launch planned for mid-2027.

The context. Silicon Valley has been heading in this direction for months: Meta added five microphones to its Ray-Ban Meta 2 to isolate voices in noisy environments. Google is testing audio search summaries. Tesla is going to integrate Grok into its cars so certain features can be controlled conversationally.

In detail. The initiative is led by Kundan Kumar, a former Character.AI researcher who arrived at OpenAI this summer. The new model aims to sound indistinguishable from a human voice and to maintain fluid conversations without the abrupt cutoffs typical of current assistants. In addition, the May 2025 purchase of io Products Inc., Jony Ive's $6.5 billion startup, marks a turning point. Ive, former head of design at Apple, now leads creative responsibilities at OpenAI with a team of 55 people. His philosophy, already stated publicly, seeks to reduce device addiction through interfaces that do not demand constant visual attention.

What is happening. OpenAI is considering several form factors: screenless speakers, smart glasses (a clearly booming segment) and a pen-shaped, voice-operated device. Foxconn will manufacture the first product, rumored to be a context-aware pen, in Vietnam.
These devices are positioned as complements to laptops and phones, not substitutes, at least for now.

Yes, but. Not every "screenless AI" bet has worked. The Humane AI Pin burned through hundreds of millions and let its buyers down with a half-baked product that stopped working after the company was sold to HP. Several AI pendants have followed a similar path for almost two years, and none has yet managed to be more than a curiosity.

And now what. The schedule is tight: a new audio model before spring 2026, and the first dedicated device on sale a year later. OpenAI would go from being a software provider to competing directly in consumer electronics. The question is whether it will achieve what Humane and others failed to: making people want to talk to their devices without being able to look at a screen.

In Xataka | The new Ray-Bans from Meta will let you cross a line: seeming present while you are completely absent

Featured image | Xataka with Mockuuups Studio
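The interruption handling the report describes, often called "barge-in," can be illustrated with a tiny state machine: the assistant yields the floor as soon as the user starts speaking instead of finishing its queued reply. This is only an illustrative sketch; the states and method names are invented here and say nothing about how OpenAI's model actually works.

```python
# Hedged sketch of "barge-in" handling: the assistant stops talking the
# moment the user starts speaking. States and events are illustrative,
# not OpenAI's implementation.

class VoiceAssistant:
    def __init__(self):
        self.state = "idle"  # one of: idle | speaking | listening

    def start_reply(self):
        """The assistant begins speaking its answer."""
        self.state = "speaking"

    def on_user_speech(self):
        """User audio detected: yield the floor immediately,
        even mid-reply, instead of finishing the queued answer."""
        self.state = "listening"

    def on_user_silence(self):
        """User stopped talking: return to idle, ready to reply."""
        if self.state == "listening":
            self.state = "idle"
```

The key design choice is that `on_user_speech` preempts the `speaking` state unconditionally, which is exactly what pre-ChatGPT assistants did not do.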

Sateliot is Spain's great hope for a voice of its own in the new satellite space race

There is a new space race and no one wants to miss it. Rivaling Starlink seems like a pipe dream, but a Spanish company has managed to get ahead of the American giant on one specific point: 5G. While Elon Musk's satellite company remains anchored in 4G, Sateliot boasts of being a pioneer in offering 5G connectivity from space, not only to IoT devices but also to conventional mobile phones. This milestone has not gone unnoticed by the governments of Spain and Europe. Sateliot brings together all the ingredients to become an option for technological sovereignty in the satellite race, a race where Starlink dominates with more than 90% of global launches, but where any home-grown advance is seen as a great victory.

Now Sateliot has inaugurated Europe's first 5G satellite development center: a pioneering facility located in Barcelona with more than 100 employees, two laboratories, a control room and a clean room of more than 100 square meters. From Xataka we visited the Catalan satellite company's center and learned about its ambitious plans.

Triton, the new generation of satellites, moves to full 5G

Since 2018, Sateliot has launched six satellites, the last four in orbit since August 2024, and plans to launch five more next year. However, beyond getting ahead with 5G, it is with its second generation of satellites that its service will start to be truly competitive. Triton, named in homage to the Montseny newt, is the name chosen for its new satellites, about four meters long and 150 kilograms in weight. These satellites represent a radical advance over the ones Sateliot has already launched: in addition to offering up to 16 times more capacity, they also change the concept. Triton will not only connect IoT devices, but will offer 5G connectivity for data, voice and video to conventional 5G phones, with no extra antenna or modifications needed and compatible with all (3GPP) operators.
The satellite, which costs 10 times more than the first generation, will allow Sateliot to offer a service ranging from critical security applications to civil protection and defense. The company explains that its satellite connection service will not focus on providing coverage to individual consumers, but will serve industrial, maritime, energy or location applications.

Jaume Sanpera, CEO of Sateliot, next to the monitoring of its four satellites in orbit

The first Triton satellite is scheduled to launch during the first quarter of 2027 from Vandenberg (California), one of SpaceX's two launch bases. The future goal is to use European launchers, such as the European Space Agency's Vega and Ariane. In this space race, the dates are no coincidence: 2027 is also when Starlink is expected to begin upgrading its satellites to 5G.

Barcelona bets on aerospace technology

Jaume Sanpera, CEO of Sateliot, is proud that his satellites are "100% manufactured in Barcelona." The company has now inaugurated the development center, and in the future it plans for the industrial phase to include a factory in Barcelona as well, though that phase is still far off. "Next year we will exceed 200 employees, more than 80% of them engineers, having doubled the staff in the last year," Sanpera tells Xataka. "We have agreed to expand to the ground floor," he points out, referring to the recently inaugurated offices. The inauguration was also attended by multiple public authorities, including the president of the Generalitat of Catalonia, Salvador Illa. "You have to lose your shyness. Everything from outside seems better, seems to come from the US or China. Well, no: here we also do very powerful things that no one else has," Illa said.
Salvador Illa, president of the Generalitat of Catalonia, visits the clean room of the new 5G satellite development center | Sateliot

Sateliot is a startup that embodies much of what Europe is looking for: cutting-edge technology companies and local development. The new development center aims to become the base of a cluster of aerospace companies in Barcelona. And investors are taking note. Sanpera says Sateliot is not currently seeking a new round, although he describes it as a company "that requires a lot of capital." Last March, the Spanish government announced an investment of around 14 million euros in Sateliot, part of a round totaling about 70 million euros. In addition to the Spanish Society for Technological Transformation (SETT), Global Portfolio Investments, Indra, Cellnex and SEPIDES have also invested, and 30 million euros have been loaned by the European Investment Bank (EIB). Since its founding, the company has invested about 50 million euros in R&D.

According to Sateliot, it already has signed contracts worth 285 million euros annually and offers coverage in 58 countries: 734 contracts in total to connect 10 million devices that cannot get good coverage otherwise and for which satellite service opens up a whole field of possibilities. The new development center in Barcelona employs 110 people (80% engineers), with plans to exceed 200 in 2026. "We have 30 different patent applications," they explain. While describing how satellite monitoring works, Sateliot's CEO hints that not all of its advances have been patented, so as "not to give clues to the competition," noting that there is a high level of industrial espionage in the sector. "The difficulty is in the radio, in the antenna," says Sanpera.
Sateliot cannot compete with Starlink on quantity, but unlike the American company, it is betting on satellites whose connectivity is more modern and, above all, widely compatible. The Triton satellites have a seven-year operational life, compared to four or five years for the first generation; the main limiting factor is the radio and the software. The company points out that this matters because "space debris is a problem for everyone and can prevent us from launching more …

The gaze that became a voice and a guide

In 2019 we published a 37-minute documentary about Dulce, a girl with motor paralysis who learned to communicate using only her eyes and an eye-tracking system by Irisbond. When she started with it, she was six years old and the learning process had just begun. The eighteen months of recording culminated in a moment that summed up the entire effort: in front of her classmates, using her communicator, Dulce announced "my mother is having a baby." A pure expression of desire, a willingness to share. Perhaps the first time she not only named the world but shaped it.

Six years later, we have spoken again with Raúl, her father. Today Dulce is thirteen, her brother Max is already ten, and Dante, the baby Raquel was expecting back then, is five. The communicator is still her voice; what has changed is what she says with it and what she uses it for.

From spectator to teacher

When we met her, Dulce was learning to use the device with the patience of her educators, first Celia and then Mariano. She burst virtual balloons on the screen, matched pictograms with concepts, constructed basic phrases. The process was methodical and exhausting: each session required prior calibration, sustained concentration, and the diffuse promise that all of it would one day give her communicative independence, something very remote at the time.

Dulce introducing herself at one of her talks. Image provided.

Now Dulce is on the other side. Not only has she mastered the system, she has become a trainer for other communicator users through the Gema Canales Foundation. "She acts as a teacher, showing other children how to use communicators, because she is very good at it and has a lot of patience," explains Raúl. "She's already taught three or four kids how to use the system." And it is not a one-off activity: according to her father, it is something she would like to continue doing in the future, as an adult.
The communicator is no longer just her tool of expression, but also what she trains others in. The transformation is complete: from a student struggling to articulate simple ideas to a mentor capable of passing on technique and patience.

Teenage conversations

The most notable thing is not the technological leaps, which there have been, though moderate, but the communicative ones. In 2018, Dulce was pronouncing single words, constructing short sentences and expressing basic desires. Six years later she holds far more complex conversations. "She has the normal conversations of a 13-year-old teenager," says Raúl.

Image provided.

The most notable change came with mobile phones. Dulce now has her own, not as her main communication device, for that she continues to use the Irisbond system connected to a tablet, but as a gateway to the digital socialization typical of her age. The phone gives her access to WhatsApp and conversations with friends, a teenage rite of passage. Although she mostly uses WhatsApp Web for fluidity and convenience, she also likes using her phone with the mobility her left hand allows. This communicative autonomy has also changed her social dynamics. Raúl remembers moments when Dulce, in new environments with strangers, starts conversations using her communicator. The other kids quickly normalize the system: "Oh, okay, I talk and she answers me like this." There is no discomfort, just a slight adaptation to the pace of the conversation, slower than natural speech but fluid enough to sustain complete dialogues.

The voice that doesn't want to change

Technologically, the system has not evolved much in these six years. The most important improvements came in the years before the documentary, when the eye-tracking went from crude to functional. Since then, progress has been incremental: the response speed has improved slightly, the software is somewhat more predictive, but nothing transformative.
The most interesting thing is that Dulce has resisted changing the communicator's voice. The system has been updated with more voices, including children's voices, not just adult ones, something some parents had long been demanding.

Image provided.

When the tool added the first children's voices, Raúl went "full of enthusiasm" to set one up on Dulce's tablet, but he ran into something unexpected: her refusal. She preferred to keep the voice she had been using for years, with its adult ring. "She's gotten used to that being her sound. It's like your voice changing overnight: you feel strange, you don't recognize yourself in it." Her father points out something obvious but easy to forget: when you have spent most of your life hearing yourself speak one way, changing your voice is not an improvement, it is losing your sound identity.

The limit is still physical

Dulce finished primary school with excellent grades, with her only curricular adaptation in Physical Education. She is now in the 1st year of ESO (secondary school) and limitations are beginning to appear, not because of cognitive ability but because of motor demands. Mathematics, which in primary school was numbers, now introduces algebra. "That could get more complicated for her," admits Raúl. The solution involves an assistant who transcribes what Dulce indicates with her communicator, a support she needs not because she does not understand the subject but because writing equations with her eyes is infinitely slower than by hand. It is a technical limitation, not an intellectual one, but it sets the pace of her academic progress.

The impact of the documentary

The 2019 report did not change Dulce's life or her family's. There was no media transformation or avalanche of attention.
But Raúl remembers one very specific effect: when they had meetings with the Madrid Department of Education or requested academic support resources, someone would mention "ah, yes, you are Dulce's family, the one from the documentary." "She already had a face, eyes, expressiveness, a story," he explains. "It wasn't just a name in a dossier." In the bureaucratic negotiation for resources and support, that minimal humanization of the file worked in their favor. It wasn't …

Pre-ChatGPT voice assistants are dead. And Apple is the only one that hasn't shown up to the funeral

Something unusual happened this week in the voice assistant war: Amazon and Google presented their new home devices on the same day. Echo Dot Max, Echo Studio, Echo Show 8 and Echo Show 11 on one side; Google Nest Cam, Nest Doorbell, Google Speaker and the renewed Google Home app on the other.

Why it matters. A calendar coincidence, yes, but above all a symptom of shared urgency. Because these are not just speakers with better bass or screens with more pixels. These are two technology giants simultaneously burying their classic assistants, Alexa and Google Assistant, to replace them with conversational versions driven by large language models.

The context. The rules changed in November 2022. Since ChatGPT demonstrated what a real conversation with AI means, users have learned that there is something better than asking a speaker to play music or set a timer. Those basic functions have become invisible. The microwave of technology: it's there, it works, nobody thinks about it.

What has happened. Alexa+ and Gemini for Home promise natural conversations, complex automations created by voice, cameras that understand context rather than merely detecting movement. "Amazon has left a package at the entrance" instead of "activity has been detected." Context, not events. Intention, not rigid commands. But that future comes at a price: ten dollars a month for the basic plan, twenty for the complete one. The hardware is just the entry point to a recurring revenue model.

Between the lines. The inference costs of language models are huge and someone has to pay them. AI becomes the new market segmenter: with it your product is premium, without it it is a relic of the past. These presentations arrive after years of stagnation. Alexa and Google Assistant had been doing the same things: music, alarms, weather questions. The narrative was exhausted.
Users have stopped imagining what else these cylinders could do and have silently accepted that they are what they are, the same way we don't ask the oven to tell us jokes, because it is for something else.

The big question. Who will pay twenty dollars a month for a speaker mainly used to ask the time and play playlists? The problem is no longer technical; it is perception. Amazon and Google need us to see these devices differently, to form a new habit.

Meanwhile, Apple continues with the classic Siri. Limited, frustrating, stuck. It has integrated ChatGPT as a crutch for certain queries, but its real overhaul is stalled and delayed. And above all, it is designed for iPhone, iPad and Mac, but not for the HomePod. At least for now.

In summary. Expectations have already changed. Anyone who has used ChatGPT knows what a real conversation with AI means. Going back to Siri after that feels like going back fifteen years. Pre-ChatGPT voice assistants are dead. Amazon and Google certified their death this week: they attended the funeral and have begun building another tool. The question is no longer whether Apple will present its answer. It is whether, when it finally does, anyone will still be waiting for it. Or whether we will have moved on to other things.

Featured image | Amazon, Google, Xataka

In Xataka | For years TV talked to us. Now AI wants us to talk to it

Voice recorders seemed dead. AI and new hardware are making them irresistible again

There was a time when voice recorders were essential for journalists, students and professionals who needed to record conversations. With the rise of the smartphone, they were relegated to the back of a drawer. Today, artificial intelligence has brought them back: compact, connected devices offer automatic transcripts and summaries in seconds. What seemed like a dead category is drawing the attention of users and manufacturers again, with proposals that modernize a classic tool.

The startup Plaud, based in San Francisco and Shenzhen, has found the key to reinventing a classic device. Its NotePin, shaped like a pendrive, records conversations and turns them into ordered transcripts and automatic summaries. The recorder connects to an application that offers intelligent searches and answers questions about recorded content. Plaud's bet is to combine minimalist design and software to differentiate itself from the basic functions phones already offer.

From recorder to AI notes

Plaud has managed to turn a niche idea into a profitable business. Since its launch in 2023, the company has sold more than one million devices, according to Forbes. Its model combines hardware and subscription: the NotePin costs 169.90 euros, while its other models, the Note and the Note Pro, go for 169.90 and 189 euros respectively. With this formula, the startup expects to reach about 250 million dollars in annualized revenue and boasts margins close to 25%, comparable to those of the iPhone.

Plaud does not arise in a vacuum: AI hardware is having a moment of effervescence. The same outlet estimates that the sector has received more than 350 million dollars in recent investment. Amazon has also joined the movement by acquiring Bee, a startup betting on compact recorders for executives.
The idea of always carrying an assistant seduces investors, but the results do not always follow: some projects have become cautionary tales for the entire sector. Rabbit is a clear example of those unfulfilled promises. Its R1 was announced as the future of interaction with AI, but the initial excitement gave way to disappointment when users found that its functions were practically those of a mobile app. Humane went further with its AI Pin, a futuristic device that sought to replace the phone but ended up as an expensive failure. Against these stumbles, Plaud has carved out a niche by focusing on real productivity: recording, transcribing and organizing information without distractions or impossible ambitions.

China is also betting heavily on this category. The South China Morning Post reports that DingTalk, Alibaba's business collaboration platform, presented in August the A1, a compact recorder capable of transcribing, summarizing and translating conversations into more than 100 languages. The device is based on the Tongyi AI laboratory's models, trained on more than 100 million hours of audio and specialized in 200 sectors. Priced from 499 yuan (about 60 euros at current exchange rates), it is presented as a more affordable alternative to Plaud's device, which costs 169.90 euros, although it is not available outside China.

The big question is obvious: if the phone can record, why carry another device? Plaud has found its space by focusing on functions the phone does not offer with the same effectiveness. Its recorders incorporate dedicated microphones and extended battery life, ideal for long days of meetings or interviews. The application includes specific templates for doctors, lawyers or salespeople, which simplifies the workflow. This practical approach makes the NotePin more than a simple recorder: it is a tool designed for people who depend on capturing information without interruptions.

None of this is free.
Plaud offers three plans: a basic one, free, with limited functions, and two paid tiers that unlock the device's full potential. The Pro plan, at 110.99 euros a year, allows 1,200 minutes of transcription per month, plus more advanced templates and personalized summaries. The Unlimited plan rises to 249.99 euros a year and offers continuous recording and transcription, in addition to all the platform's functions. This structure reinforces the hybrid business model: attractive hardware plus a subscription that turns the device into a complete service.

Recording conversations is no longer an exclusive practice of journalists. Nathan Xu, Plaud's co-founder, insists the device is conceived as a professional tool, not a spy device. To reinforce that idea, the NotePin includes a status light that warns when it is recording. In the United States, in some places such as California, recording without permission can carry fines or even prison sentences, although the rules are rarely enforced. The ethical debate about carrying an always-on microphone remains open.

Plaud was born in Shenzhen, but Xu has wanted to strengthen its identity as an American company. The firm is registered in Delaware and based in San Francisco. An important point, at least according to the official website for the Spanish market, is that the service stores its users' data on servers located in the United States. This strategy apparently seeks to dispel suspicions in a context of growing tensions between Washington and Beijing over privacy.

The future of these recorders will depend on several factors. Plaud has already begun to explore sectors such as health, where it acquired a hospital software startup to reinforce its position against competitors such as Open or Nuance, owned by Microsoft. This highly regulated market requires precision and security, which can favor specialized companies if they manage to earn users' trust.
The return of voice recorders is not a simple fad. Plaud has shown that the public is willing to pay for tools that optimize their time, even in an era dominated by the smartphone. With rivals such as Alibaba stepping up their bets, competition is intensifying. These solutions must prove that they are not just a bridge to mobile functions, but a category of their own. What seems clear is that precise recording and processing has never had so much potential.

Images | Plaud | Dingtalk

In Xataka | 100 million Tamagotchis …

A VHS tape, an AI and eight seconds of audio. That's all a woman needed to recover her lost voice

At some point in the 90s, someone recorded a young Sarah Ezekiel with a video camera and kept the footage on a VHS tape. In that brief appearance Sarah spoke for only eight seconds, but almost three decades later those eight seconds have turned out to be an incredible gift. One that has given her back her voice.

Sarah Ezekiel lost her voice in 2000, before smartphones became ubiquitous and let us, among many other things, capture video easily. Motor neurone disease took away both her speech and the mobility of her hands just as she was about to have her second child, a boy called Eric. As she said in a BBC interview: "I thought I would be fine, but after Eric was born, I deteriorated quickly." In a few months she lost control of her hands, and shortly after she was incapable of any kind of intelligible conversation. Her marriage ended soon after, and Sarah, caring for her two children, found herself in a terrible situation.

Eric, now 25 years old, only remembers his mother being paralyzed. Aviva, her 28-year-old daughter, remembers when she realized her mother was different: "I have that memory of asking her to prepare some strawberries and seeing that she wasn't able to cut them. She had to ask someone."

Five years after the diagnosis, Sarah found a breakthrough thanks to eye-tracking technology. She could build words and phrases with the movement of her eyes, and a voice synthesizer gave her a synthetic voice similar to the one Stephen Hawking used, for example. That was the beginning of a new life she adapted to with joy. She became a volunteer for the association for people affected by her illness, and painted again thanks to that same technology. And a few years later, a small milestone began to take shape.
The AI and the "miracle" of voice cloning

A company called Smartbox had announced that it would provide cloned voices free of charge for a million people at risk of losing their voice, or who had already lost it to diseases such as cancer or motor neurone disease. They asked Sarah for a voice recording to rebuild hers, but the only thing she had was an old VHS tape of her daughter Aviva in which Sarah spoke for just eight seconds.

That recording was a disaster. The image quality was bad, and the sound was also distorted, with people talking over each other and a television playing at full volume. Simon Poole, one of the people in charge at Smartbox, thought it would not be possible. However, Poole managed to isolate Sarah's voice thanks to the Voice Isolator application from ElevenLabs, a company specialized in this type of solution. The problem was that the result lacked intonation and personality, and also had a strange American accent. To try to fix it, he used another application, trained on thousands of voices to fill in the gaps in this kind of recording, that could help recover a voice like Sarah's.

After achieving a result he thought was right, he sent Sarah several phrases in her cloned voice. On a call, he heard how Sarah, listening to those phrases, nearly burst into tears. One of Sarah's old friends, who knew her before she lost her voice, was "impressed by how realistic it sounded." Her daughter Aviva said she was also impressed, although she admitted she had to get used to it: "Hearing her daily now still surprises me." For Eric "it has made an incredible difference," because that voice can also carry an intonation that shows whether his mother is cheerful or angry. Sarah misses her real voice, but as she put it, "I'm happy to be back. It's better than being a robot."

Image | Gabriel Petry | Ursula Castillo

In Xataka | AI was already able to clone voices and faces. Now it also clones the way we write

Xiaomi has launched a new voice model. It's not for phones, it's for the car wars

Xiaomi was one of the first brands to announce an AI voice assistant for its mobile phones, although it is little known because it only works in China. Seven years later, the Asian giant has announced a new voice model, but this time its focus is not on phones, but on cars and the connected home.

MiDashengLM-7B. Xiaomi has christened its new model with this unattractive, hard-to-remember name. It is composed of two key parts: the Dasheng audio encoder and Alibaba's Qwen2.5-Omni-7B decoder. Combined, the system is able to recognize not only our voice but also environmental sounds, music and background noise. Xiaomi boasts that it offers "first-rate performance across 22 public benchmarks." Specifically, it has surpassed OpenAI's Whisper in non-verbal audio understanding tasks.

In the car. Xiaomi has already found 30 applications for its new voice model across different products. Voice control takes the acoustic environment into account and responds according to context, for example if there is an unusual sound in the car. They have also devised a function that helps improve pronunciation for learning languages while driving, and the possibility of "waking" the car with your voice even before entering the cabin. Its ability to detect anomalous sounds also makes it useful for security, both in the car, with a more powerful anti-theft mode, and at home through smart speakers.

At home. The new model enables intelligent functions to be triggered by sounds, such as the lights turning on when you clap or the air conditioning switching off when you walk out the door, without having to ask out loud. Xiaomi says its system has very low latency and a large parallel processing capacity, allowing it to run on devices with few resources, such as cameras or speakers, and to maintain good performance in environments with many connected devices, such as a home.

Open source.
China has chosen the open-source side in the AI race, and the new Xiaomi voice model follows that line. MiDashengLM-7B is open source and released under the Apache 2.0 license, which allows commercial use and free modification. That opens the door to its use by other developers and in academia. In addition, Xiaomi has made public all the data used to train the model. The objective is clear: attract the developer community so that its audio ecosystem becomes a standard, strengthening its competitive position.

Conversational experience. The automobile industry is taking a turn in which software is positioning itself as the number one differentiating factor. It is no longer just about how fast a car goes or how comfortable it is to drive; it is about autonomous driving systems, the screen interface and, especially, voice control. According to this study, AI voice assistants will be standard in cars by 2033. Whoever has the best conversational experience will have a clear advantage, and Xiaomi has taken an important step in this direction.

Cover image | Xataka with Zky.icon icons

In Xataka | Xiaomi keeps losing money on its electric cars … but they are turning out to be its greatest success
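The sound-triggered automations described in the article (lights on a clap, air conditioning off when the door closes) boil down to mapping detected audio events to device actions. Here is a minimal sketch in Python assuming a hypothetical acoustic classifier that emits a label and a confidence score; the labels, devices and threshold are illustrative inventions, not Xiaomi's actual API:

```python
# Minimal sketch: routing detected sound events to smart-home actions.
# Event labels and device actions are hypothetical; in a real deployment
# they would come from an acoustic event classifier like the one Xiaomi
# describes, running on a speaker or camera.

SOUND_RULES = {
    "clap": ("lights", "toggle"),
    "door_close": ("air_conditioning", "off"),
    "glass_break": ("alarm", "trigger"),
}

def route_sound_event(label, confidence, threshold=0.8):
    """Return the (device, action) pair for a detected sound, or None."""
    if confidence < threshold:
        return None  # ignore uncertain detections to avoid false triggers
    return SOUND_RULES.get(label)  # unknown sounds are simply ignored
```

The confidence threshold is the interesting design knob: a system that triggers the alarm on every uncertain noise would be worse than one with no sound triggers at all.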

How to generate a voice with artificial intelligence from a text: create your personalized narrator

We're going to tell you how to generate a voice with AI so it can read out any text you write for it. This will allow you to have a personalized narrator or broadcaster to create your own podcasts or audiobooks, or to have anything you want read aloud. First, we are going to give you a series of preliminary tips so you know what to keep in mind when using these artificial intelligence tools. Then we will cover four free tools you can use to go from text to voice and create audio with AI voices.

Before starting, some tips. Before you start creating narrations with a voice made by artificial intelligence, you should think about what you want to achieve with it. Think about whether you want a natural voice or a robotic narration, and also about the language, tone or accent you want to use. It is very important to have a well-structured text written so the voice can then read it. Use punctuation marks properly so that the intonation sounds natural, and if possible, divide the text into short phrases to further improve naturalness. Test before getting serious: use a few short phrases to try out the voice and see whether the tool or configuration suits you. And before that, listen to several voice samples, if the tool offers them, to better choose the one you want to use. Finally, be aware of the limitations of the platform you use. Most are paid, and their free tiers have limits on the number of characters or minutes of audio generated. Therefore, if necessary, you may have to divide a long text into several fragments instead of trying to process it all at once.

Generate narrations with NotebookLM. The first tool you can use is NotebookLM, which is free and from Google. In this case, you can only create audio summaries of one or more sources that you upload to the site; you will not be able to write exactly what you want the voice to say. But it is completely free.
To use it, go to notebooklm.google.com and, on the main page, click on the Create notebook option. It is also available on Google Play for Android and in the App Store for the iPhone. NotebookLM is divided into notebooks, which are workspaces, so the first thing you will have to do is create a new one. Once inside, you will have three columns or sections. On the left are the sources. Here you can add one or more sources, and everything you upload is what Google's artificial intelligence will analyze when you ask questions or request content. They can be text documents, slides, PDFs, YouTube videos or links to web pages or online articles. Then there is the Chat section, where you can ask Google's AI all the questions you want and get answers based on the sources. And then there is the Studio section. In it, you can create an audio summary of the sources provided in that project. If you click the Customize Audio Summary button, you can determine what the summary the voice narrates should be like, giving instructions on the topic to cover or the source to focus on, and you can also adjust the way this voice speaks.

ElevenLabs. ElevenLabs is a platform with several artificial intelligence resources, including one for text-to-speech. You can find it at elevenlabs.io/es/text-to-speech, and although you can try it without registering, you will need to create an account to download the audio you generate. It is very easy to use. First you write or paste the text you want narrated, and then you choose the voice and language you want to use, with the option to pick the model and the speed at which it speaks. There are voices in Spanish from Spain, and also Latin American ones. Then you press Play and that's it, it plays. Here, the bad news is that the free account has a limit on the number of characters it can process each month.
Specifically, you can create 10 minutes of high-quality text-based audio per month.

TTSMaker. This is another tool for going from text to voice with artificial intelligence, and it stands out because you don't need to create an account to use it for free. You just write the text, choose the language, and it turns what you have written into voice. You can use it from the website ttsmaker.com. Here, you should know that each audio has a maximum of 1,000 characters, and that as a free user you can use 20,000 characters per week. For more, you will have to pay. These are very generous margins, although the site carries quite a lot of advertising. You also have options such as choosing the audio format you want to generate, or listening to only the first 50 characters before creating the audio to make sure it is to your liking. You can also configure the speed, volume, audio quality and the length of each pause. Many things.

Clipchamp. This is a Microsoft tool for creating videos, including the option of doing so with AI. It also has a text-to-speech option when creating, with which you can write any text you want and choose the voice to use. You will need to go to the website clipchamp.com and log in with your Microsoft account (Hotmail or Outlook). Once the text-to-speech option is chosen, you will have the creation options in the right-hand column. There, you can paste the text you want and choose the language and voice. In addition, there are advanced options to choose the pitch of the voice and the pace it uses to read what you write. And that's it: with this, what you write will…
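As noted above, free tiers cap the number of characters per audio (for example, TTSMaker's 1,000-character limit per clip), so long texts need to be split into fragments. Here is a minimal sketch of a splitter that cuts only at sentence boundaries, so each generated clip still sounds natural; the 1,000-character default mirrors the limit mentioned above, but this helper is our own illustration, not part of any of these tools.

```python
import re

def split_text(text: str, max_chars: int = 1000) -> list[str]:
    """Split text into chunks of at most max_chars characters,
    cutting only at sentence boundaries so each audio fragment
    ends on complete sentences.

    Note: a single sentence longer than max_chars is kept whole,
    since there is no natural place to cut it."""
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # +1 accounts for the space re-inserted between sentences.
        if current and len(current) + len(sentence) + 1 > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks
```

You would then paste each returned chunk into the tool separately and concatenate the resulting audio files in order.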

The human voice loses ground as automatic AI dubbing makes its way among creators

Artificial intelligence (AI) continues to gain ground in sectors that were unthinkable just a few years ago. What began by revolutionizing software development is now leaving its mark on very different fields. The phenomenon of 'vibe coding', a new way of programming based on natural-language prompts, without writing code as such, is changing the rules of development. At the same time, tools such as OpenAI's DALL·E or Adobe Firefly are redefining visual creation, allowing images and videos to be generated from textual descriptions. And now AI is also reaching voice professionals. For a few months, YouTube has allowed some creators to use automatic AI dubbing. This function, which is being rolled out progressively, is now available to all members of the YouTube Partner Program, the system that allows creators to monetize their content on the platform. Thanks to this tool, videos can reach global audiences more easily, as an alternative to traditional dubbing and post-production processes.

A function activated by default. One of the most striking changes is that this function is activated by default: if you upload a video in English, YouTube will automatically dub it into several languages, including Spanish, French, German, Hindi, Indonesian, Italian, Japanese and Portuguese. If the video is in any of those languages, an English version will be generated. The objective, according to the company, is to "break down language barriers." But if creators prefer not to use this function, they can deactivate it by following specific steps: open YouTube Studio from a computer and follow this route: Settings > Upload defaults > Advanced settings. There you have to uncheck the 'Allow automatic dubbing' box and click Save. This function, moreover, can only be managed from YouTube Studio on a computer, and it even lets you manually review dubs before publishing them if you want greater control over the result.
YouTube admits the technology still has room for improvement. Translations may not be exact, and the generated voice may not faithfully represent the author. In fact, the company itself acknowledges that intonation, emotional tone and certain cultural nuances do not always carry over correctly. The function is built on Google DeepMind and Google Translate technology, but even with that background infrastructure, not all videos can be dubbed successfully. Factors such as the original accent, background noise, the use of jargon or even proper names can affect dubbing quality.

A video with automatic YouTube dubbing. To know whether a video has been dubbed with AI, just look for the "auto-dubbed" label. It is also possible to change the audio track from the player's menu, via the gear icon in the lower-right corner. The language adapts to the one the user has set as preferred, which makes the experience, in many cases, more seamless. We can see the function active in this example video.

The prelude: real dubbing with professional actors. YouTube had already taken earlier steps in this direction. With the "Add audio track" function, some creators like MrBeast began offering versions of their videos in several languages using professional voice actors. He even worked with the voice actress who plays Naruto for the Japanese versions of his videos. This strategy, focused on expanding the reach of the content without losing interpretive quality, allowed many of these videos to reach new audiences. MrBeast has not been the only one. According to YouTube data, videos that offered versions in several languages got more than 15% of their watch time from audiences who viewed the content in a language other than the original. In January 2023, users were already consuming more than 2 million hours of dubbed content every day. And that was only with the initial tests.
The extension of this function to all members of the Partner Program is contributing to the growth of AI-generated voices on the platform. This raises questions about how the technology will be integrated into the professional dubbing ecosystem. Some creators may opt for this automatic alternative for certain types of content, especially informative or highly structured pieces. However, in other formats where vocal interpretation provides key nuances, the role of dubbing professionals remains difficult to replace. Development continues, and the balance between technology and creativity is still evolving.

Images | Lorenzi | YouTube screenshot

In Xataka | Two students created an AI to "cheat" in job interviews: it has earned them 5.3 million dollars

OpenAI's new voice models already speak like customer service agents. Their next destination: call centers

Since the beginning of the year, the objective of the big tech companies has been clear: getting us to talk to artificial intelligence (AI). OpenAI, Microsoft, Google and Meta have added voice functions to their assistants. But this seems to be just the beginning. The industry is advancing at a frantic pace, and the way we interact with these tools continues to evolve.

Say 'hello' to voice agents. Sam Altman's company has been betting on text agents with tools such as Operator or Computer-Using Agent. Now OpenAI has its next big move ready to keep standing out in the race for AI development: promoting a new and powerful generation of voice agents.

New models on stage. OpenAI has announced the launch of new audio models to turn voice into text and vice versa. They are not in ChatGPT but in the API, where developers can use them to create voice agents. The important thing? They aim to be much more precise and to take customization to the next level. The new OpenAI models, built on GPT-4o and GPT-4o mini, promise to improve on Whisper and on its previous text-to-speech tools, which will also remain available through the API. But it is not just a matter of performance: now they can also modulate their tone to sound, for example, "like an empathetic customer service agent."

Destination: call centers. OpenAI makes it clear where it is aiming with this launch. It says that "for the first time, developers can tell the model not just what to say, but also how to say it, which enables more personalized experiences for use cases ranging from customer service to creative storytelling." According to OpenAI, this technology will allow the creation of much richer "conversational experiences." If we take into account that ChatGPT, powered by GPT-3.5, arrived in November 2022, it is evident that progress has been dizzying. And everything indicates that these models will end up arriving at call centers.
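As a rough illustration of the "what to say versus how to say it" idea described above, here is a minimal sketch of how a request body for OpenAI's speech endpoint could be assembled. The separation into an input field (the words) and an instructions field (the delivery) follows OpenAI's announcement; the specific voice name, example text and endpoint constant here are illustrative assumptions, so check OpenAI's API reference before relying on them.

```python
import json

# Assumed endpoint for OpenAI's text-to-speech API.
API_URL = "https://api.openai.com/v1/audio/speech"

def build_speech_request(text: str, style: str, voice: str = "alloy") -> dict:
    """Build a JSON body for a text-to-speech request.

    'input' carries WHAT the model should say; 'instructions'
    carries HOW it should say it (tone, pacing, emotion)."""
    return {
        "model": "gpt-4o-mini-tts",  # model name from OpenAI's announcement
        "voice": voice,
        "input": text,
        "instructions": style,
    }

body = build_speech_request(
    "Thanks for calling, how can I help you today?",
    "Speak like an empathetic customer service agent.",
)
payload = json.dumps(body)  # serialized request body
```

To actually synthesize audio, you would POST this payload to the endpoint with an `Authorization: Bearer <your API key>` header and write the binary response to an audio file.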
We might think that the first interactions will be somewhat limited, but still well above current voice systems. They will move away from traditional automated assistants and will be much more natural. Over time, the line between a conversation with a person and one with an AI could become almost imperceptible.

Images | Charanjeet Dhiman | OpenAI

In Xataka | We have tried Sesame's conversational AI. It is the closest experience to a "human voice" we have seen

In Xataka | China has found an unusual strategy to sidestep US restrictions on AI: betting on open source
