Google’s secret weapon against CUDA dominance is called TorchTPU. And it’s a missile aimed below NVIDIA’s waterline

Google has launched an internal initiative called “TorchTPU” with a singular goal: making its TPUs fully compatible with PyTorch. For the uninitiated, the translation is this: Google intends to break, once and for all, the monopoly and absolute control that NVIDIA holds through CUDA.

Why it matters. NVIDIA became the world’s most valuable company by market capitalization for two big reasons. The first is its AI GPUs. The second, far more important, is CUDA, the software platform used by virtually all AI developers, which has one crucial peculiarity: it only works on NVIDIA’s own chips. So if you want to work in AI with the latest hardware, you have to go through NVIDIA… until now.

What is happening with Google and its TPUs. Google’s Tensor Processing Units (TPUs) were until now optimized for JAX, Google’s own platform, similar to CUDA in its objective. However, most of the industry uses PyTorch, which has been optimized for years around CUDA. That creates a barrier to entry for other chipmakers, who face a huge bottleneck in attracting customers.

Meta is in on it. Anonymous sources close to the project told Reuters that, to reach its goal faster, Google has partnered with Meta. This is especially striking because it was Meta that originally created PyTorch. Mark Zuckerberg’s company has ended up just as dependent on NVIDIA as its rivals, and is very interested in Google’s TPUs becoming a viable alternative that reduces its own infrastructure costs.

Google as a potential AI chip giant. The company led by Sundar Pichai has made an important change of direction with its TPUs, which were previously reserved for internal use. Since 2022, the Google Cloud division has taken over their sale and turned them into a fundamental revenue driver, because they are no longer used only by Google: just ask Anthropic.
A spokesperson for that division did not comment specifically on the project, but confirmed to Reuters that this type of initiative gives customers the ability to choose.

Everyone against NVIDIA. This alliance is the latest attempt to neutralize NVIDIA’s great ace up its sleeve. In recent months we have seen companies like Huawei prepare their own alternative ecosystem to CUDA, and also join a collective effort by several Chinese AI companies with the same purpose.

Hardware matters; software matters more. CUDA has become such a critical asset for NVIDIA that if other semiconductor manufacturers have been unable to compete with it, it is not because of their chips, but because they cannot support CUDA natively. AMD is a great example: it has exceptional AI GPUs, in some respects superior to NVIDIA’s, but its software is not as mature.

In Xataka | Google’s TPUs are the first big sign that NVIDIA’s empire is faltering
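What “making TPUs fully compatible with PyTorch” means in practice is that model code should not change when the backend does. As a rough, purely illustrative sketch (this is not PyTorch’s real dispatcher, and all names here are invented), the idea looks like this: operations are routed through a dispatcher, and supporting new hardware means registering kernels, not rewriting models.

```python
# Illustrative sketch (NOT real PyTorch internals): why a framework-level
# backend matters. Model code calls ops through a dispatcher; adding a new
# backend ("tpu") means registering kernels, not touching the model.

def matmul_ref(a, b):
    """Plain-Python stand-in for a hardware matmul kernel."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def relu_ref(m):
    """Plain-Python stand-in for a hardware ReLU kernel."""
    return [[max(0.0, v) for v in row] for row in m]

class Dispatcher:
    def __init__(self):
        self._kernels = {}  # (op_name, backend) -> callable

    def register(self, op, backend, fn):
        self._kernels[(op, backend)] = fn

    def call(self, op, backend, *args):
        if (op, backend) not in self._kernels:
            raise NotImplementedError(f"{op} has no {backend} kernel")
        return self._kernels[(op, backend)](*args)

dispatch = Dispatcher()
# A "TorchTPU"-style effort amounts to filling in the second column.
for backend in ("cuda", "tpu"):
    dispatch.register("matmul", backend, matmul_ref)
    dispatch.register("relu", backend, relu_ref)

def tiny_model(x, w, backend):
    # The model never mentions backend internals: the same two calls
    # run unchanged on "cuda" or "tpu".
    return dispatch.call("relu", backend, dispatch.call("matmul", backend, x, w))

print(tiny_model([[1.0, 2.0]], [[1.0], [1.0]], "tpu"))
```

The lock-in the article describes appears when years of kernels exist only in one column of that table: the model code is portable in theory, but the fast kernels are not.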

Huawei is building its own alternative ecosystem to CUDA. If it succeeds, NVIDIA will have a serious problem

When people talk about NVIDIA, almost all the focus goes to the hardware: the H100, Blackwell, racks, energy consumption, nanometers… It is understandable, but it is a mistake. NVIDIA’s defensive moat is not the hardware. It is CUDA. CUDA is not an add-on to the chip; it is the de facto standard on which most of the planet’s AI code is written, optimized and debugged. You cannot swap GPUs without leaving CUDA behind, and leaving CUDA means rewriting years of work. That is why it is a moat.

Why it matters. Huawei’s big bet is not to “make a Chinese H100.” It is to build a path that lets developers reach Ascend without feeling like they are changing planets.

The restrictions are accelerating it. Export controls have split the world in two: an ecosystem that revolves around NVIDIA, and another that China is trying to build against the clock. In that second one, Huawei is not just playing chips: it is playing “ecosystem,” in AI and beyond. And therein lies the nuance: you can be years behind in chips and still reduce dependency if you make the software easy to swallow.

In detail. Huawei is attacking the problem on three fronts, with a pragmatic logic: not replacing everything at once, but opening shortcuts.

Native stack (CANN + MindSpore). This is its “pure” alternative: its own environment and its own tools to get the most out of Ascend. The cost today is high: there are complaints of instability, the documentation is messy, and the community is much smaller.

PyTorch support. This is the most strategic move. Huawei is not trying to make the world love its framework; it is trying to ensure the world doesn’t have to leave PyTorch. torch_npu acts as an adapter to run PyTorch models on Ascend, but with one problem: it is not native and suffers with every PyTorch change. If PyTorch advances and your backend lags behind, the developer notices.

Portability via ONNX. Here Huawei is looking for its best window: inference and deployment, not training.
ONNX works as a bridge format: you train where you can (often on NVIDIA) and deploy on Ascend. It is a less romantic and more useful approach: if shortages hit, moving inference to local hardware is immediate relief.

Between the lines. The real story is that Huawei is trying to replicate the “trick” that made NVIDIA great: turning its hardware into an experience. Hence the tactic that explains everything: putting engineers in the customer’s offices to migrate and optimize code. That is not scalable as a business model, but it is scalable as a transition model: you buy time while you mature tools, libraries and support. And there is another knock-on effect: if China gets enough teams to adopt Ascend out of necessity, over time that can become habit, and then infrastructure. Not because it is better, but because it is already integrated.

Yes, but. Huawei has two limits that cannot be fixed with marketing. Hardware improvement rate: roadmap analysis suggests relative stagnation and a gap that could widen, not close, if NVIDIA keeps accelerating its cycles. Off-chip bottlenecks: memory (HBM), tooling and industrial capacity. You can stack “worse” chips, but you need to make a lot of them and build a lot of systems.

And now what. If this movie continues, we will see two clear signs. Less chip hype and more real migration stories: how many teams have moved to Ascend, with what friction, with what performance losses. And less obsession with training on Ascend, and more normalization of the hybrid pattern: train where you can, deploy where you must.

NVIDIA will continue to be CUDA. Huawei is not “a chip.” It is an escape strategy. And the restrictions are the fuel that is making it inevitable.

In Xataka | With HarmonyOS NEXT Huawei has achieved something incredible. Neither Samsung, Microsoft nor Mozilla managed it

Featured image | NVIDIA, Huawei
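The ONNX pattern described above can be sketched in a few lines. This is a hedged toy illustration, not the real ONNX format or a real Ascend runtime: the export function, the JSON schema and `TinyRuntime` are all invented names, standing in for the real train-here, deploy-there workflow.

```python
# Toy sketch of the bridge-format idea: train on one stack, export the
# model to a neutral serialized graph, run inference on another runtime.
# The format and class names here are invented for illustration.
import json

def export_model(weights, bias):
    """'Export' a one-layer linear model to a neutral, JSON-serializable graph."""
    return json.dumps({
        "graph": [{"op": "linear", "w": weights, "b": bias}],
    })

class TinyRuntime:
    """Stand-in for an inference runtime on different hardware (e.g. Ascend)."""
    def __init__(self, exported):
        self.graph = json.loads(exported)["graph"]

    def run(self, x):
        # Walk the graph node by node; only "linear" exists in this sketch.
        for node in self.graph:
            if node["op"] == "linear":
                x = [sum(a * b for a, b in zip(x, node["w"])) + node["b"]]
        return x

# "Train" (here: fixed weights) on stack A, deploy on stack B.
model = export_model(weights=[2.0, -1.0], bias=0.5)
print(TinyRuntime(model).run([3.0, 1.0]))
```

The point of the real thing is the same as the toy: the deploying side only has to implement an interpreter for a fixed set of graph operations, which is far cheaper than supporting a whole training framework.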

A quarter of a century ago a student put together 32 GeForce graphics cards to play Quake III. CUDA came from there

In the year 2000, Ian Buck wanted to do something that seemed impossible: play Quake III at 8K resolution. Buck was studying computer science at Stanford, specializing in computer graphics, when a crazy idea occurred to him: chain together 32 GeForce graphics cards and render Quake III across eight strategically placed projectors. “That,” he explained years later, “was beautiful.”

Buck told that story in ‘The Thinking Machine’, the book published by Stephen Witt in 2025 that traces the history of NVIDIA. One of the fundamental parts of that story is, of course, the origin of CUDA, the architecture that AI developers have turned into a gem and that allowed the company to become the most valuable in the world by market capitalization. And it all started with Quake III.

The GPU as a home supercomputer

That, of course, was just a fun experiment, but for Buck it was a revelation: there he discovered that specialized graphics chips (GPUs) could perhaps do more than draw triangles and render Quake frames.

In 2006 the GeForce 8800 GTS (and its higher-end sibling, the GTX) began the CUDA era.

To find out, he delved into the technical details of NVIDIA’s graphics processors and began researching their possibilities as part of his Stanford PhD. He gathered a small group of researchers and, with a grant from DARPA (Defense Advanced Research Projects Agency), began working on an open source programming language he called Brook. That language allowed something amazing: turning graphics cards into home supercomputers. Buck demonstrated that GPUs, theoretically dedicated to graphics work, could solve general-purpose problems, and do so by exploiting the parallelism those chips offered. Thus, while one part of the chip lit up triangle A, another was already rasterizing triangle B and yet another was writing triangle C to memory.
It wasn’t exactly the same as today’s data parallelism, but it still offered amazing computing power, far superior to any CPU of the time. That work ended up becoming a paper titled ‘Brook for GPUs: stream computing on graphics hardware’. Suddenly parallel computing was available to anyone, and although the project barely received public attention, one person knew it was important. That person was Jensen Huang.

Shortly after the study was published, NVIDIA’s founder met with Buck and signed him on the spot. He realized that this capability of graphics processors could and should be exploited, and began dedicating more and more resources to it.

CUDA is born

When Silicon Graphics collapsed in 2005 (partly because NVIDIA had become unbeatable in workstations), many of its employees ended up at the company. In fact, 1,200 of them went directly to the R&D division, and one of that division’s big projects was precisely to push forward this capability of these cards.

John Nickolls / Ian Buck.

As soon as he arrived at NVIDIA, Ian Buck began working with John Nickolls, who before joining the firm had tried, unsuccessfully, to get ahead of the future with his bet on parallel computing. That attempt failed, but together with Buck and a few other engineers he launched a project to which NVIDIA gave a somewhat confusing name: Compute Unified Device Architecture. CUDA was born.

Work on CUDA progressed rapidly and NVIDIA released the first version of the technology in November 2006. The software was free, but it was only compatible with NVIDIA hardware. And as often happens with revolutions, CUDA took a while to gel. In 2007 the platform was downloaded 13,000 times: the hundreds of millions of NVIDIA graphics users only wanted their cards for gaming, and it stayed that way for a long time.
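The stream-computing idea Brook exposed can be sketched simply: the same small kernel is applied independently to every element of a stream, so the work can fan out across many execution units at once. Below is a hedged Python illustration (a thread pool standing in for GPU lanes, and `saxpy` as the classic streaming kernel); it is a sketch of the concept, not of Brook’s actual syntax.

```python
# Illustrative sketch of the data-parallel/stream idea: one small kernel,
# applied independently per element, fanned out over parallel workers.
# A thread pool stands in for GPU lanes here.
from concurrent.futures import ThreadPoolExecutor

def saxpy_kernel(args):
    """One 'lane' of SAXPY (y = a*x + y), a classic streaming kernel."""
    a, x, y = args
    return a * x + y

def saxpy(a, xs, ys, lanes=4):
    # Each element's computation is independent of every other element's,
    # which is exactly what makes the problem parallelizable.
    work = [(a, x, y) for x, y in zip(xs, ys)]
    with ThreadPoolExecutor(max_workers=lanes) as pool:
        return list(pool.map(saxpy_kernel, work))

print(saxpy(2.0, [1.0, 2.0, 3.0], [10.0, 10.0, 10.0]))
```

Because no lane depends on another lane’s result, adding more lanes speeds things up without changing the kernel, which is the property GPUs exploit at massive scale.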
Programming with CUDA was difficult, and those early years were very hard for the project, which consumed a lot of talent and money at NVIDIA without delivering any real returns. In fact, the first uses of CUDA had nothing to do with artificial intelligence, because AI was barely talked about at the time. Those who took advantage of the technology were scientific departments, and only years later would the revolution it could unleash take shape.

A late (but deserved) success

Buck himself hinted at this in a 2012 interview with Tom’s Hardware. When the interviewer asked what future uses he saw for the GPGPU technology CUDA offered, he gave some examples. He talked about companies that were using CUDA to design next-generation clothes or cars, but he added something important: “In the future, we will continue to see opportunities in personal media, such as sorting and searching photos based on image content, i.e. faces, location, etc., which is a very computationally intensive operation.”

Buck knew what he was talking about, although he did not imagine this would be the beginning of the true CUDA revolution. In 2012, two young doctoral students named Alex Krizhevsky and Ilya Sutskever developed a project under the guidance of their supervisor, Geoffrey Hinton. That project was none other than AlexNet, software that classified images automatically, a challenge that until then had been hopeless because of the computing cost it required. These academics trained a neural network with NVIDIA graphics cards and CUDA software. Suddenly, AI and CUDA were starting to make sense together. The rest, as they say, is history.

In Xataka | We can forget about AI without hallucinations for now. NVIDIA’s CEO explains why

CUDA is the standard that grips the world and Nvidia is the only company with chips capable of running it. Until now

Meta will acquire Rivos, a Californian startup specialized in designing RISC-V-based chips, according to Bloomberg sources. Beyond the capabilities of its chips, the operation is part of a broader strategy: freeing itself from its dependence on NVIDIA and taking control of its AI infrastructure without NVIDIA’s chips.

What is at stake. In recent years, Nvidia has dominated the GPU market thanks to CUDA, its proprietary development platform, which has become the de facto standard for training and running artificial intelligence models. We have reached the point where anyone who wants to do AI at scale needs Nvidia chips, and that gives the company enormous market power, since it supplies the essential hardware for an industry everyone wants to enter. Meta, despite having some of the best open models in the sector with Llama, keeps spending billions annually on Nvidia hardware.

The strategic move. With Rivos, Meta is not just buying a company; it is buying an alternative to its current technology stack. The startup develops GPUs and accelerators based on RISC-V, an open source architecture standard that threatens the traditional x86 (Intel and AMD) and ARM. Meta already works on its own internal chip, the Meta Training and Inference Accelerator (MTIA), designed with Broadcom and manufactured by TSMC, but progress has not been as fast as Zuckerberg would like. According to sources cited by Bloomberg, the CEO had been actively looking for reinforcements in the market to accelerate development.

It is not the only one. Meta joins a race in which its technology rivals already have a head start: Google has its TPUs, Amazon has Trainium and Microsoft has developed Maia. The AI war is not won only with the best models, but also with the chips that run them. And Meta, despite pouring billions of dollars into AI, was falling behind on this front.

The context. The Rivos acquisition is not an isolated move.
Meta had already tried to buy FuriosaAI, a South Korean startup specialized in chips for training AI systems, but its 800-million-dollar offer was rejected. In addition, the company recently announced a 29-billion-dollar investment to build a huge data center in Louisiana, and plans to spend up to 72 billion this year on AI-related infrastructure.

The RISC-V challenge. Rivos represents an ambitious bet. Although RISC-V has not yet penetrated US data centers massively (its presence is mainly limited to microcontrollers and IoT devices), its potential is undeniable: China is already launching tablets and laptops with this architecture. If Meta manages to develop a RISC-V-based AI accelerator capable of replacing the NVIDIA H200 in its internal operations, it would be a considerable blow to the dominant standard.

Cover image | Nvidia and Meta

In Xataka | OpenAI has just presented Sora 2 with a TikTok-style app. This outlines a new wave of viral videos

China urgently needs an alternative to NVIDIA’s CUDA

Chinese companies dedicated to artificial intelligence (AI) already have at their disposal a wide range of GPUs designed and manufactured in China. Huawei is the best-positioned company thanks to its Ascend 910 family of chips, but in the country led by Xi Jinping there are many other companies specialized in this type of hardware that also have great potential: MetaX, Biren Technology, Moore Threads, Innosilicon, Zhaoxin, Iluvatar CoreX, DenglinAI and Vast AI Tech are some of the most important.

“China is dedicating massive resources to launching startups specialized in GPU development. Do not underestimate them.” That warning was probably not ignored. It was aimed at the US government and came from someone who knows what he is talking about: Jensen Huang. NVIDIA’s CEO spoke those words last year during Computex, and his intention was evidently to warn the US administration about the consequences of the sanctions that seek to stop China’s technological development.

Hardware is no longer the problem; CUDA is

Li Guojie, a computer scientist at the Chinese Academy of Sciences who is considered an authority in China, has confirmed what we have just seen: the most advanced Chinese GPUs, such as the Huawei Ascend 910 chips mentioned above, are comparable to current NVIDIA solutions in raw compute capacity. However, the real strength of the company led by Jensen Huang is not its hardware; it is its CUDA (Compute Unified Device Architecture) ecosystem.

“DeepSeek has had an impact on the CUDA ecosystem, but has not completely overcome it because barriers persist”

Most AI projects currently in development are implemented on CUDA.
This technology brings together the compiler and the development tools programmers use to write software for NVIDIA GPUs, and replacing it with another option in projects that are already underway is a problem. Huawei, which aspires to an important share of this market in China, has CANN (Compute Architecture for Neural Networks) as its alternative to CUDA, but for the moment CUDA dominates the market.

The most surprising thing is that it is not only Chinese companies that need to break away from this software; the entire AI industry wants to be done with CUDA. That, at least, is what Pat Gelsinger, the former CEO of Intel, said. In December 2023 this executive took a stand and explained his company’s official position in the context of the AI sector: “The whole industry is motivated to eliminate the CUDA market (…) We see it as a shallow and small moat,” Gelsinger argued at the “AI Everywhere” event held in New York.

According to Li Guojie, “China must develop an alternative system to achieve self-sufficiency in AI (…) DeepSeek has had an impact on the CUDA ecosystem, but it has not completely overcome it because barriers persist. In the long term we need to establish a controllable set of software tool systems that surpasses CUDA.” This is undoubtedly one of the great challenges China faces in this area. Perhaps CANN will gradually break through and finally establish itself, but as long as CUDA persists to some extent, the development of AI in China will continue to depend on the US.

Image | Moore Threads

More information | SCMP

In Xataka | We can forget about AI without hallucinations for now. NVIDIA’s CEO explains why
